summaryrefslogtreecommitdiff
path: root/rts/RtsAPI.c
Commit message (Collapse)AuthorAgeFilesLines
* rts: use allocation helpers from RtsUtilsnineonine2021-12-101-2/+2
| | | | | | | Just a tiny cleanup inspired by the following comment: https://gitlab.haskell.org/ghc/ghc/-/issues/19437#note_334271 I was just getting familiar with rts code base so I thought might as well do this.
* Make `PosixSource.h` installed and under `rts/`John Ericson2021-08-091-1/+1
| | | | | | is used outside of the rts so we do this rather than just fish it out of the repo in ad-hoc way, in order to make packages in this repo more self-contained.
* Fix typosBrian Wignall2021-02-061-2/+2
|
* rts: Use properly sized pointers in e.g. rts_mkInt8Stefan Schulze Frielinghaus2021-02-051-26/+20
| | | | | | | Since commit be5d74caab the payload of a closure of Int<N> or Word<N> is not extended anymore to the machines word size. Instead, only the first N bits of a payload are written. This patch ensures that only those bits are read/written independent of the machines endianness.
* Optimize some rts_mk/rts_get functions in RtsAPI.cCheng Shao2021-01-221-26/+43
| | | | | | | | | - All rts_mk functions return the tagged closure address - rts_mkChar/rts_mkInt avoid allocation when the argument is within the CHARLIKE/INTLIKE range - rts_getBool avoids a memory load by checking the closure tag - In rts_mkInt64/rts_mkWord64, allocated closure payload size is either 1 or 2 words depending on target architecture word size
* rts: Use relaxed load when checking for cap ownershipBen Gamari2021-01-091-1/+4
| | | | This check is merely a service to the user; no reason to synchronize.
* Add rts_listThreads and rts_listMiscRoots to RtsAPI.hDavid Eichmann2020-11-131-0/+53
| | | | | | | | These are used to find the current roots of the garbage collector. Co-authored-by: Sven Tennie's avatarSven Tennie <sven.tennie@gmail.com> Co-authored-by: Matthew Pickering's avatarMatthew Pickering <matthewtpickering@gmail.com> Co-authored-by: default avatarBen Gamari <bgamari.foss@gmail.com>
* RtsAPI: pause and resume the RTSDavid Eichmann2020-11-021-7/+177
| | | | | | | | | The `rts_pause` and `rts_resume` functions have been added to `RtsAPI.h` and allow an external process to completely pause and resume the RTS. Co-authored-by: Sven Tennie <sven.tennie@gmail.com> Co-authored-by: Matthew Pickering <matthewtpickering@gmail.com> Co-authored-by: Ben Gamari <bgamari.foss@gmail.com>
* When using rts_setInCallCapability, lock incall threadsDylan Yudaken2020-10-171-0/+20
| | | | | | | | This diff makes sure that incall threads, when using `rts_setInCallCapability`, will be created as locked. If the thread is not locked, the thread might end up being scheduled to a different capability. While this is mentioned in the docs for `rts_setInCallCapability,`, it makes the method significantly less useful as there is no guarantees on the capability being used. This commit also adds a test to make sure things stay on the correct capability.
* Fix hs_try_putmvar losing track of running capDylan Yudaken2020-02-081-0/+3
| | | | If hs_try_putmvar was called through an unsafe import, it would lose track of the running cap causing a deadlock
* Update Trac ticket URLs to point to GitLabRyan Scott2019-03-151-1/+1
| | | | | This moves all URL references to Trac tickets to their corresponding GitLab counterparts.
* Finish stable splitDavid Feuer2018-08-291-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Long ago, the stable name table and stable pointer tables were one. Now, they are separate, and have significantly different implementations. I believe the time has come to finish the split that began in #7674. * Divide `rts/Stable` into `rts/StableName` and `rts/StablePtr`. * Give each table its own mutex. * Add FFI functions `hs_lock_stable_ptr_table` and `hs_unlock_stable_ptr_table` and document them. These are intended to replace the previously undocumented `hs_lock_stable_tables` and `hs_lock_stable_tables`, which are now documented as deprecated synonyms. * Make `eqStableName#` use pointer equality instead of unnecessarily comparing stable name table indices. Reviewers: simonmar, bgamari, erikd Reviewed By: bgamari Subscribers: rwbarton, carter GHC Trac Issues: #15555 Differential Revision: https://phabricator.haskell.org/D5084
* Save a word in the info table on x86_64Simon Marlow2018-05-161-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: An info table with an SRT normally looks like this: StgWord64 srt_offset StgClosureInfo layout StgWord32 layout StgWord32 has_srt But we only need 32 bits for srt_offset on x86_64, because the small memory model requires that code segments are at most 2GB. So we can optimise this to StgClosureInfo layout StgWord32 layout StgWord32 srt_offset saving a word. We can tell whether the info table has an SRT or not, because zero is not a valid srt_offset, so zero still indicates that there's no SRT. Test Plan: * validate * For results, see D4632. Reviewers: bgamari, niteria, osa1, erikd Subscribers: thomie, carter Differential Revision: https://phabricator.haskell.org/D4634
* An overhaul of the SRT representationSimon Marlow2018-05-161-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: - Previously we would hvae a single big table of pointers per module, with a set of bitmaps to reference entries within it. The new representation is identical to a static constructor, which is much simpler for the GC to traverse, and we get to remove the complicated bitmap-traversal code from the GC. - Rewrite all the code to generate SRTs in CmmBuildInfoTables, and document it much better (see Note [SRTs]). This has been something I've wanted to do since we moved to the new code generator, I finally had the opportunity to finish it while on a transatlantic flight recently :) There are a series of 4 diffs: 1. D4632 (this one), which does the bulk of the changes 2. D4633 which adds support for smaller `CmmLabelDiffOff` constants 3. D4634 which takes advantage of D4632 and D4633 to save a word in info tables that have an SRT on x86_64. This is where most of the binary size improvement comes from. 4. D4637 which makes a further optimisation to merge some SRTs with static FUN closures. This adds some complexity and the benefits are fairly modest, so it's not clear yet whether we should do this. Results (after (3), on x86_64) - GHC itself (staticaly linked) is 5.2% smaller - -1.7% binary sizes in nofib, -2.9% module sizes. Full nofib results: P176 - I measured the overhead of traversing all the static objects in a major GC in GHC itself by doing `replicateM_ 1000 performGC` as the first thing in `Main.main`. The new version was 5-10% faster, but the results did vary quite a bit. - I'm not sure if there's a compile-time difference, the results are too unreliable. Test Plan: validate Reviewers: bgamari, michalt, niteria, simonpj, erikd, osa1 Subscribers: thomie, carter Differential Revision: https://phabricator.haskell.org/D4632
* Prefer #if defined to #ifdefBen Gamari2017-04-281-1/+1
| | | | Our new CPP linter enforces this.
* Install toplevel handler inside fork.Alexander Vershilov2016-12-021-0/+29
| | | | | | | | | | | | | | | | | | | | When rts is forked it doesn't update toplevel handler, so UserInterrupt exception is sent to Thread1 that doesn't exist in forked process. We install toplevel handler when fork so signal will be delivered to the new main thread. Fixes #12903 Reviewers: simonmar, austin, erikd, bgamari Reviewed By: bgamari Subscribers: thomie Differential Revision: https://phabricator.haskell.org/D2770 GHC Trac Issues: #12903
* Use C99's boolBen Gamari2016-11-291-1/+1
| | | | | | | | | | | | Test Plan: Validate on lots of platforms Reviewers: erikd, simonmar, austin Reviewed By: erikd, simonmar Subscribers: michalt, thomie Differential Revision: https://phabricator.haskell.org/D2699
* Add hs_try_putmvar()Simon Marlow2016-09-121-0/+75
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This is a fast, non-blocking, asynchronous, interface to tryPutMVar that can be called from C/C++. It's useful for callback-based C/C++ APIs: the idea is that the callback invokes hs_try_putmvar(), and the Haskell code waits for the callback to run by blocking in takeMVar. The callback doesn't block - this is often a requirement of callback-based APIs. The callback wakes up the Haskell thread with minimal overhead and no unnecessary context-switches. There are a couple of benchmarks in testsuite/tests/concurrent/should_run. Some example results comparing hs_try_putmvar() with using a standard foreign export: ./hs_try_putmvar003 1 64 16 100 +RTS -s -N4 0.49s ./hs_try_putmvar003 2 64 16 100 +RTS -s -N4 2.30s hs_try_putmvar() is 4x faster for this workload (see the source for hs_try_putmvar003.hs for details of the workload). An alternative solution is to use the IO Manager for this. We've tried it, but there are problems with that approach: * Need to create a new file descriptor for each callback * The IO Manger thread(s) become a bottleneck * More potential for things to go wrong, e.g. throwing an exception in an IO Manager callback kills the IO Manager thread. Test Plan: validate; new unit tests Reviewers: niteria, erikd, ezyang, bgamari, austin, hvr Subscribers: thomie Differential Revision: https://phabricator.haskell.org/D2501
* More typos in comments [skip ci]Gabor Greif2016-06-221-1/+1
|
* rts: More const correct-ness fixesErik de Castro Lopo2016-05-181-2/+2
| | | | | | | | | | | | | | | | | | | | In addition to more const-correctness fixes this patch fixes an infelicity of the previous const-correctness patch (995cf0f356) which left `UNTAG_CLOSURE` taking a `const StgClosure` pointer parameter but returning a non-const pointer. Here we restore the original type signature of `UNTAG_CLOSURE` and add a new function `UNTAG_CONST_CLOSURE` which takes and returns a const `StgClosure` pointer and uses that wherever possible. Test Plan: Validate on Linux, OS X and Windows Reviewers: Phyx, hsyl20, bgamari, austin, simonmar, trofi Reviewed By: simonmar, trofi Subscribers: thomie Differential Revision: https://phabricator.haskell.org/D2231
* RTS: Rename InCall.stat struct field to .rstatHerbert Valerio Riedel2015-12-041-2/+2
| | | | | | | | | | | On AIX, C system headers can redirect the token `stat` via #define stat stat64 to provide large-file support. Simply avoiding the use of `stat` as an identifier eschews macro-replacement. Differential Revision: https://phabricator.haskell.org/D1566
* Fix deadlock (#10545)Simon Marlow2015-06-261-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | yieldCapability() was not prepared to be called by a Task that is not either a worker or a bound Task. This could happen if we ended up in yieldCapability via this call stack: performGC() scheduleDoGC() requestSync() yieldCapability() and there were a few other ways this could happen via requestSync. The fix is to handle this case in yieldCapability(): when the Task is not a worker or a bound Task, we put it on the returning_workers queue, where it will be woken up again. Summary of changes: * `yieldCapability`: factored out subroutine waitForWorkerCapability` * `waitForReturnCapability` renamed to `waitForCapability`, and factored out subroutine `waitForReturnCapability` * `releaseCapabilityAndQueue` worker renamed to `enqueueWorker`, does not take a lock and no longer tests if `!isBoundTask()` * `yieldCapability` adjusted for refactorings, only change in behavior is when it is not a worker or bound task. Test Plan: * new test concurrent/should_run/performGC * validate Reviewers: niteria, austin, ezyang, bgamari Subscribers: thomie, bgamari Differential Revision: https://phabricator.haskell.org/D997 GHC Trac Issues: #10545
* Revert "rts: add Emacs 'Local Variables' to every .c file"Simon Marlow2014-09-291-8/+0
| | | | This reverts commit 39b5c1cbd8950755de400933cecca7b8deb4ffcd.
* rts: detabify/dewhitespace RtsAPI.cAustin Seipp2014-08-201-13/+13
| | | | Signed-off-by: Austin Seipp <austin@well-typed.com>
* rts: add Emacs 'Local Variables' to every .c fileAustin Seipp2014-07-281-0/+8
| | | | | | | | This will hopefully help ensure some basic consistency in the forward by overriding buffer variables. In particular, it sets the wrap length, the offset to 4, and turns off tabs. Signed-off-by: Austin Seipp <austin@well-typed.com>
* Add hs_thread_done() (#8124)Simon Marlow2014-02-271-0/+6
| | | | See documentation for details.
* Remove superfluous #ifdef from Takano's patch.Austin Seipp2013-11-021-2/+0
| | | | Signed-off-by: Austin Seipp <austin@well-typed.com>
* rts_apply uses CCS_MAIN rather than CCS_SYSTEM (#7753)Takano Akio2013-11-021-1/+6
| | | | Signed-off-by: Austin Seipp <austin@well-typed.com>
* rts_checkSchedStatus: exit the thread, not the process, when InterruptedSimon Marlow2013-05-101-1/+10
| | | | | | | | | This means that when the process is shutting down, if we have calls to foreign exports in progress, they get forcibly terminated as before, but now they only shut down the calling thread rather than the whole process (with -threaded). This came up in a discussion started by Akio Takano on ghc-users.
* Lots of nat -> StgWord changesSimon Marlow2012-09-071-3/+3
|
* Emit the task-tracking eventsDuncan Coutts2012-07-101-0/+15
| | | | | | | | | | | | Based on initial patches by Mikolaj Konarski <mikolaj@well-typed.com> Use the new task tracing functions traceTaskCreate/Migrate/Delete. There are two key places. One is for worker tasks which have a relatively simple life cycle. Worker tasks are created and deleted by the RTS. The other case is bound tasks which are either created by the RTS, or appear as foreign C threads making calls into the RTS. For bound threads we do the tracing in rts_lock/unlock, which actually covers both threads coming in from outside, and also bound threads made by the RTS.
* Make forkProcess work with +RTS -NSimon Marlow2011-12-061-30/+34
| | | | | | | | | | | | | | | | | | | | | | Consider this experimental for the time being. There are a lot of things that could go wrong, but I've verified that at least it works on the test cases we have. I also did some API cleanups while I was here. Previously we had: Capability * rts_eval (Capability *cap, HaskellObj p, /*out*/HaskellObj *ret); but this API is particularly error-prone: if you forget to discard the Capability * you passed in and use the return value instead, then you're in for subtle bugs with +RTS -N later on. So I changed all these functions to this form: void rts_eval (/* inout */ Capability **cap, /* in */ HaskellObj p, /* out */ HaskellObj *ret) It's much harder to use this version incorrectly, because you have to pass the Capability in by reference.
* Implement stack chunks and separate TSO/STACK objectsSimon Marlow2010-12-151-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch makes two changes to the way stacks are managed: 1. The stack is now stored in a separate object from the TSO. This means that it is easier to replace the stack object for a thread when the stack overflows or underflows; we don't have to leave behind the old TSO as an indirection any more. Consequently, we can remove ThreadRelocated and deRefTSO(), which were a pain. This is obviously the right thing, but the last time I tried to do it it made performance worse. This time I seem to have cracked it. 2. Stacks are now represented as a chain of chunks, rather than a single monolithic object. The big advantage here is that individual chunks are marked clean or dirty according to whether they contain pointers to the young generation, and the GC can avoid traversing clean stack chunks during a young-generation collection. This means that programs with deep stacks will see a big saving in GC overhead when using the default GC settings. A secondary advantage is that there is much less copying involved as the stack grows. Programs that quickly grow a deep stack will see big improvements. In some ways the implementation is simpler, as nothing special needs to be done to reclaim stack as the stack shrinks (the GC just recovers the dead stack chunks). On the other hand, we have to manage stack underflow between chunks, so there's a new stack frame (UNDERFLOW_FRAME), and we now have separate TSO and STACK objects. The total amount of code is probably about the same as before. There are new RTS flags: -ki<size> Sets the initial thread stack size (default 1k) Egs: -ki4k -ki2m -kc<size> Sets the stack chunk size (default 32k) -kb<size> Sets the stack chunk buffer size (default 1k) -ki was previously called just -k, and the old name is still accepted for backwards compatibility. These new options are documented.
* remove unnecessary stg_noForceIO (#3508)Simon Marlow2010-09-241-1/+0
|
* Fix crash in nested callbacks (#4038)Simon Marlow2010-05-071-2/+2
| | | | | Broken by "Split part of the Task struct into a separate struct InCall".
* Make the running_finalizers flag task-localSimon Marlow2010-05-051-3/+3
| | | | | Fixes a bug reported by Lennart Augustsson, whereby we could get an incorrect error from the RTS about re-entry from a finalizer,
* Fix typo in error message (#3848)Simon Marlow2010-01-301-1/+1
|
* Make allocatePinned use local storage, and other refactoringsSimon Marlow2009-12-011-17/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | This is a batch of refactoring to remove some of the GC's global state, as we move towards CPU-local GC. - allocateLocal() now allocates large objects into the local nursery, rather than taking a global lock and allocating then in gen 0 step 0. - allocatePinned() was still allocating from global storage and taking a lock each time, now it uses local storage. (mallocForeignPtrBytes should be faster with -threaded). - We had a gen 0 step 0, distinct from the nurseries, which are stored in a separate nurseries[] array. This is slightly strange. I removed the g0s0 global that pointed to gen 0 step 0, and removed all uses of it. I think now we don't use gen 0 step 0 at all, except possibly when there is only one generation. Possibly more tidying up is needed here. - I removed the global allocate() function, and renamed allocateLocal() to allocate(). - the alloc_blocks global is gone. MAYBE_GC() and doYouWantToGC() now check the local nursery only.
* Detect C finalizer callbacks in rts_lock() instead of schedule()Simon Marlow2009-08-191-0/+9
| | | | | Otherwise, finalizer callbacks cause a deadlock in the threaded RTS (including GHCi)
* RTS tidyup sweep, first phaseSimon Marlow2009-08-021-5/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The first phase of this tidyup is focussed on the header files, and in particular making sure we are exposinng publicly exactly what we need to, and no more. - Rts.h now includes everything that the RTS exposes publicly, rather than a random subset of it. - Most of the public header files have moved into subdirectories, and many of them have been renamed. But clients should not need to include any of the other headers directly, just #include the main public headers: Rts.h, HsFFI.h, RtsAPI.h. - All the headers needed for via-C compilation have moved into the stg subdirectory, which is self-contained. Most of the headers for the rest of the RTS APIs have moved into the rts subdirectory. - I left MachDeps.h where it is, because it is so widely used in Haskell code. - I left a deprecated stub for RtsFlags.h in place. The flag structures are now exposed by Rts.h. - Various internal APIs are no longer exposed by public header files. - Various bits of dead code and declarations have been removed - More gcc warnings are turned on, and the RTS code is more warning-clean. - More source files #include "PosixSource.h", and hence only use standard POSIX (1003.1c-1995) interfaces. There is a lot more tidying up still to do, this is just the first pass. I also intend to standardise the names for external RTS APIs (e.g use the rts_ prefix consistently), and declare the internal APIs as hidden for shared libraries.
* replace sparc-specific Int64 code with calls to platform-independent macrosSimon Marlow2009-07-271-116/+4
|
* Remove old GUM/GranSim codeSimon Marlow2009-06-021-12/+0
|
* Fix #3236: emit a helpful error message when the RTS has not been initialisedSimon Marlow2009-05-181-6/+0
|
* SPARC: Fix ffi019 split load/store of HsInt64 into two parts to respect ↵Ben.Lippmeier@anu.edu.au2009-03-311-0/+51
| | | | alignment constraints
* SPARC NCG: When getting a 64 bit word, promote halves to 64 bit before shiftingBen.Lippmeier@anu.edu.au2009-03-301-1/+1
|
* SPARC NCG: Also do misaligned reads (this time for sure!)Ben.Lippmeier@anu.edu.au2009-01-221-8/+8
|
* SPARC NCG: Also do misaligned readsBen.Lippmeier@anu.edu.au2009-01-211-0/+25
|
* SPARC NCG: Add a SPARC version of rts_mkInt64 that handles misaligned ↵Ben.Lippmeier@anu.edu.au2009-01-211-0/+28
| | | | closure payloads.
* Fix some more shutdown racesSimon Marlow2008-11-191-8/+10
| | | | | | | There were races between workerTaskStop() and freeTaskManager(): we need to be sure that all Tasks have exited properly before we start tearing things down. This isn't completely straighforward, see comments for details.
* fix #2594: we were erroneously applying masks, as the reporter suggestedSimon Marlow2008-09-301-3/+3
| | | | | | My guess is that this is left over from when we represented Int8 and friends as zero-extended rather than sign-extended. It's amazing it hasn't been noticed earlier.