path: root/rts
Commit message    Author    Age    Files    Lines
...
* rts: ProfHeap: Merge some redundant ifdefsDaniel Gröber2021-02-171-10/+1
|
* rts: TraverseHeap: Allow visit_cb to be NULLDaniel Gröber2021-02-171-2/+4
|
* rts: TraverseHeap: Add a basic testDaniel Gröber2021-02-174-0/+224
| | | | | For now this just tests that the order of the callbacks is what we expect for a couple of synthetic heap graphs.
* rts: TraverseHeap: Move stackElement to headerDaniel Gröber2021-02-172-69/+64
| | | | | | | The point of this is to let user code call traversePushClosure directly instead of going through traversePushRoot. This in turn allows specifying a stackElement to be used when the traversal returns from a top-level (root) closure.
* rts: TraverseHeap: Make "flip" bit flip into its own functionDaniel Gröber2021-02-173-11/+25
|
* rts: TraverseHeap: Move "flip" bit into traverseState structDaniel Gröber2021-02-176-57/+67
|
* rts: TraverseHeap: Make trav. data macros into functionsDaniel Gröber2021-02-174-22/+30
| | | | | This means the global 'flip' variable no longer needs to be exported, and it lets a future commit make it part of the traverseState struct.
* rts: TraverseHeap: Simplify profiling headerDaniel Gröber2021-02-174-13/+13
| | | | | Having a union in the closure profiling header really just complicates things, so get back to basics: we just have a single StgWord there for now.
* rts: TraverseHeap: Update some commentsDaniel Gröber2021-02-171-4/+4
| | | | data_out was renamed to child_data at some point
* rts: TraverseHeap: Introduce callback for subtree completionDaniel Gröber2021-02-173-77/+185
| | | | | | | | | | | | | | | The callback 'return_cb' allows users to perform additional accounting when the traversal of a subtree is completed. This is needed, for example, to determine the number or total size of closures reachable from a given closure. This commit also makes the lifetime increase of stackElements from commit "rts: TraverseHeap: Increase lifetime of stackElements" optional, based on whether 'return_cb' is set. Note that our definition of "subtree" here includes leaf nodes. So the invariant is that return_cb is called for all nodes in the traversal exactly once.
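The accounting that 'return_cb' enables can be sketched outside the RTS. The following is a minimal illustration, not GHC's implementation: the `Node` type, `traverse`, and both callbacks are hypothetical names, and the input is a plain tree, so the sketch does not need the "flip" bit that the real traversal uses to avoid revisiting shared closures. It shows the stated invariant (the return callback fires exactly once per node, leaves included) and how that yields the total size reachable from a node.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node type; the real traversal works on heap closures. */
typedef struct Node {
    size_t size;            /* size of this node alone */
    size_t n_children;
    struct Node **children;
    size_t subtree_size;    /* filled in bottom-up by the return callback */
    unsigned returns;       /* how often the return callback fired here */
} Node;

typedef void (*visit_fn)(Node *n);
typedef void (*return_fn)(Node *n, Node *parent);

/* Depth-first traversal: visit_cb on entry, return_cb once the whole
   subtree rooted at n (a leaf counts as a subtree) has been processed.
   Either callback may be NULL. */
static void traverse(Node *n, Node *parent,
                     visit_fn visit_cb, return_fn return_cb)
{
    if (visit_cb) visit_cb(n);
    for (size_t i = 0; i < n->n_children; i++)
        traverse(n->children[i], n, visit_cb, return_cb);
    if (return_cb) return_cb(n, parent);
}

/* On entry, a node's subtree size starts as its own size. */
static void init_size(Node *n) { n->subtree_size = n->size; }

/* On return, the finished subtree size is added to the parent. */
static void accumulate_size(Node *n, Node *parent)
{
    n->returns++;
    if (parent) parent->subtree_size += n->subtree_size;
}

/* Builds a root with two leaves and returns the size reachable from it. */
static size_t example_total_size(void)
{
    Node leaf_a = { 2, 0, NULL, 0, 0 };
    Node leaf_b = { 3, 0, NULL, 0, 0 };
    Node *kids[] = { &leaf_a, &leaf_b };
    Node root = { 1, 2, kids, 0, 0 };

    traverse(&root, NULL, init_size, accumulate_size);

    /* The invariant from the commit message: exactly one return per node. */
    assert(root.returns == 1 && leaf_a.returns == 1 && leaf_b.returns == 1);
    return root.subtree_size;
}
```

Calling `example_total_size()` sums 1 + 2 + 3 for the three nodes. The parent pointer passed to the return callback plays the role that the longer-lived stackElements play in the commits above: it gives the callback somewhere to accumulate results along the parent-child relationship.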
* rts: TraverseHeap: Link parent stackElements on the stackDaniel Gröber2021-02-171-44/+56
| | | | | | | | | The new 'sep' field links a stackElement to its "parent". That is, the stackElement containing its parent closure. Currently not all closure types create long-lived elements on the stack, so this does not cover all parents along the path to the root, but that is about to change in a future commit.
* rts: TraverseHeap: Increase lifetime of stackElementsDaniel Gröber2021-02-171-16/+26
| | | | | | | | | | | | | | | This modifies the lifetime of stackElements such that they stay on the stack until processing of all child closures is complete. Currently the stackElement representing a set of child closures will be removed as soon as processing of the last closure _starts_. We will use this in a future commit to allow storing information on the stack which should be accumulated in a bottom-up manner along the closure parent-child relationship. Note that the lifetime increase does not apply to 'type == posTypeFresh' stack elements. This is because they will always be pushed right back onto the stack as regular stack elements anyways.
* rts: TraverseHeap: Rename traversePushClosure to traversePushRootDaniel Gröber2021-02-173-4/+10
|
* Fix typosBrian Wignall2021-02-066-9/+9
|
* rts: Fix arguments for foreign calls of interpreterStefan Schulze Frielinghaus2021-02-051-2/+24
| | | | | | | | | | | Function arguments passed to the interpreter are extended to whole words. However, the foreign function interface expects correctly typed argument pointers. Accordingly, we have to adjust argument pointers in case of a big-endian architecture. In contrast to function arguments, where subwords are passed in the low bytes of a word, the return value is expected to reside in the high bytes of a word.
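The pointer adjustment for arguments can be illustrated with a small standalone sketch. This is not the interpreter's code: `host_is_big_endian`, `subword_arg_ptr`, and `read_u8_arg` are hypothetical helpers. It assumes, as the message states, that a subword argument occupies the low-order bytes of a word-sized slot. On a little-endian host those bytes come first in memory; on a big-endian host they come last, so a correctly typed pointer must be offset by the difference between the word size and the value size.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 on a big-endian host and 0 on a little-endian one,
   determined at run time by inspecting the first byte of a 16-bit 1. */
static int host_is_big_endian(void)
{
    const uint16_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);
    return first == 0;
}

/* Given a word-sized slot that holds a `size`-byte value in its
   low-order bytes, return a pointer to where those bytes start. */
static void *subword_arg_ptr(uintptr_t *slot, size_t size)
{
    unsigned char *p = (unsigned char *)slot;
    if (host_is_big_endian())
        p += sizeof(uintptr_t) - size;  /* low-order bytes sit at the end */
    return p;
}

/* Reads an 8-bit argument out of a word-extended slot. */
static uint8_t read_u8_arg(uintptr_t word)
{
    uintptr_t slot = word;
    uint8_t v;
    memcpy(&v, subword_arg_ptr(&slot, sizeof v), sizeof v);
    return v;
}
```

With this adjustment `read_u8_arg(0x1234)` yields the low byte 0x34 regardless of host endianness. The sketch covers arguments only; per the message, a subword return value instead sits in the high bytes of its word, which calls for a separate adjustment.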
* rts: Use properly sized pointers in e.g. rts_mkInt8Stefan Schulze Frielinghaus2021-02-051-26/+20
| | | | | | | Since commit be5d74caab the payload of a closure of Int<N> or Word<N> is not extended anymore to the machine's word size. Instead, only the first N bits of a payload are written. This patch ensures that only those bits are read/written independent of the machine's endianness.
* rts: sm/GC.c: make num_idle unsignedAndreas Klebinger2021-01-281-1/+1
| | | | | We compare it to n_gc_idle_threads which is unsigned as well. So make both unsigned to avoid a warning.
* Deprecate -h flagMatthew Pickering2021-01-271-0/+5
| It is confusing that it defaults to two different things depending on whether we are in the profiling way or not.
|
| Use -hc if you have a profiling build
| Use -hT if you have a normal build
|
| Fixes #19031
* Remove ioManager{Start,Die,Wakeup} from IOManager.hDuncan Coutts2021-01-256-15/+34
| | | | | | | | | They are not part of the IOManager interface used within the rest of the RTS. They are part of the interface of specific I/O manager implementations. They are no longer called directly elsewhere in the RTS, and are now only called by the dispatch functions in IOManager.c.
* Add a common wakeupIOManager hookDuncan Coutts2021-01-253-1/+33
| | | | | | | Used in the scheduler in threaded mode. Replaces the direct call to ioManagerWakeup, which is part of specific I/O manager implementations.
* Replace a ioManagerDie call with stopIOManagerDuncan Coutts2021-01-252-1/+14
| | | | | The latter is the proper hook defined in IOManager.h. The former is part of a specific I/O manager implementation (the threaded unix one).
* Replace a direct call to ioManagerStartCap with a new hookDuncan Coutts2021-01-253-3/+48
| | | | | | | | | | Replace a direct call to ioManagerStartCap in forkProcess in Schedule.c with a new hook initIOManagerAfterFork in IOManager. This replaces a direct hook in the scheduler from a single I/O manager impl (the threaded unix one) with a generic hook. Add some commentary on opportunities for future rationalisation.
* Move hooks for I/O manager startup / shutdown into IOManager.{c,h}Duncan Coutts2021-01-253-20/+88
|
* Move ioManager{Start,Wakeup,Die} to internal IOManager.hDuncan Coutts2021-01-256-2/+16
| | | | | | | | Move them from the external IOInterface.h to the internal IOManager.h. The functions are all in fact internal. They are not used from the base library at all. Remove ioManagerWakeup as an exported symbol. It is not used elsewhere.
* Move setIOManagerControlFd from Capability.c to IOManager.cDuncan Coutts2021-01-252-17/+17
| | | | | This is a better home for it. It is not really an aspect of capabilities. It is specific to one of the I/O manager impls.
* Start to centralise the I/O manager hooks from other bits of the RTSDuncan Coutts2021-01-253-0/+47
| It is currently rather difficult to understand or work with the various I/O manager implementations. This is for a few reasons:
|
| 1. They do not have a clear or common API. There are some common function names, but a lot of things just get called directly.
| 2. They have hooks into many other parts of the RTS where they get called from.
| 3. There is a _lot_ of CPP involved, both THREADED_RTS vs !THREADED_RTS and also mingw32_HOST_OS vs !mingw32_HOST_OS. This doesn't really identify the I/O manager implementation.
| 4. They have data structures with unclear ownership, or that are co-owned with other components like the scheduler. Some data structures are used by multiple I/O managers.
|
| One thing that would help is if the interface between the I/O managers and the rest of the RTS was clearer, even if it was not completely uniform. Centralising it would make it easier to see how to reduce any unnecessary diversity in the interfaces.
|
| This patch makes a start by creating a new IOManager.{h,c} module. It is initially empty, but we will move things into it in subsequent patches.
* Rename includes/rts/IOManager.h to IOInterface.hDuncan Coutts2021-01-253-3/+3
| Naming is hard.
|
| Where we want to get to is to have a clear internal and external API for the IO manager within the RTS. What we have right now is just the external API (used in base for the Haskell side of the threaded IO manager impls) living in includes/rts/IOManager.h.
|
| We want to add a clear RTS internal API, which really ought to live in rts/IOManager.h. Several people think it's too confusing to have both:
|
| * includes/rts/IOManager.h for the external API
| * rts/IOManager.h for the internal API
|
| So the plan is to add rts/IOManager.{h,c} as the internal parts, and rename the external part to be includes/rts/IOInterface.h. It is admittedly not great to have .h files in includes/rts/ called "interface" since by definition, every .h file under includes/ is an interface!
|
| Alternative naming scheme suggestions welcome!
* Move win32/IOManager to win32/MIOManagerDuncan Coutts2021-01-257-7/+7
| | | | | It is only for MIO, and we want to use the generic name IOManager for the name of the common parts of the interface and dispatch.
* Optimize some rts_mk/rts_get functions in RtsAPI.cCheng Shao2021-01-221-26/+43
| - All rts_mk functions return the tagged closure address
| - rts_mkChar/rts_mkInt avoid allocation when the argument is within the CHARLIKE/INTLIKE range
| - rts_getBool avoids a memory load by checking the closure tag
| - In rts_mkInt64/rts_mkWord64, allocated closure payload size is either 1 or 2 words depending on target architecture word size
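Two of these optimisations, returning a tagged address and skipping allocation for small values, can be sketched in isolation. Everything below is hypothetical and only mirrors the idea: `Box`, `mk_int`, `get_int`, the range constants, and the tag value are illustrative and are not the names, ranges, or layouts used by RtsAPI.c. The sketch assumes boxes are 8-byte aligned, so the low three pointer bits are free to carry a tag.

```c
#include <stdint.h>
#include <stdlib.h>

#define MIN_SMALL (-16)          /* hypothetical preallocated range */
#define MAX_SMALL 16
#define TAG_MASK  ((uintptr_t)7) /* low bits free thanks to 8-byte alignment */

/* A boxed integer; _Alignas keeps the low three address bits zero. */
typedef struct { _Alignas(8) long value; } Box;

/* Preallocated boxes for the small range; no heap allocation needed. */
static Box small_boxes[MAX_SMALL - MIN_SMALL + 1];

/* Counts heap allocations so the effect of the fast path is observable. */
static unsigned long heap_allocations = 0;

static Box *tag_ptr(Box *b, unsigned tag)
{
    return (Box *)((uintptr_t)b | (uintptr_t)tag);
}

static Box *untag_ptr(Box *b)
{
    return (Box *)((uintptr_t)b & ~TAG_MASK);
}

static unsigned get_tag(Box *b)
{
    return (unsigned)((uintptr_t)b & TAG_MASK);
}

/* Boxes n and returns the tagged address. Values in the small range
   reuse a static box; everything else is allocated on the heap. */
static Box *mk_int(long n)
{
    Box *b;
    if (n >= MIN_SMALL && n <= MAX_SMALL) {
        b = &small_boxes[n - MIN_SMALL];
        b->value = n;            /* idempotent; a real table would be static data */
    } else {
        b = malloc(sizeof *b);
        if (!b) abort();
        b->value = n;
        heap_allocations++;
    }
    return tag_ptr(b, 1);        /* tag 1: "already evaluated" in this sketch */
}

/* Strips the tag before dereferencing. */
static long get_int(Box *tagged)
{
    return untag_ptr(tagged)->value;
}
```

Two calls to `mk_int(5)` return the same address and leave `heap_allocations` at zero, while `mk_int(1000)` allocates once. A reader of the tag, in the spirit of rts_getBool, could answer some questions from `get_tag` alone without loading from memory.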
* rts: Initialize card table in newArray#Ben Gamari2021-01-171-0/+3
| | | | | | | | | | Previously we would leave the card table of new arrays uninitialized. This wasn't a soundness issue: at worst we would end up doing unnecessary scavenging during GC, after which the card table would be reset. That being said, it seems worth initializing this properly to avoid both unnecessary work and non-determinism. Fixes #19143.
* rts/linker: Don't assume existence of dlinfoBen Gamari2021-01-173-12/+20
| | | | | | | | | The native-code codepath uses dlinfo to identify memory regions owned by a loaded dynamic object, facilitating safe unload. Unfortunately, this interface is not always available. Add an autoconf check for it and introduce a safe fallback behavior. Fixes #19159.
* rts: gc: use mutex+condvar instead of spinlocks in gc entry/exitDouglas Wilson2021-01-174-110/+113
| | | | | | Use a timed wait on the condition variable in waitForGcThreads. Fix dodgy timespec calculation.
* rts: add timedWaitConditionDouglas Wilson2021-01-172-0/+26
|
* rts: add max_n_todo_overflow internal counterDouglas Wilson2021-01-175-11/+37
| | | | | | | | I've never observed this counter taking a non-zero value, however I do think its existence is justified by the comment in grab_local_todo_block. I've not added it to RTSStats in GHC.Stats, as it doesn't seem worth the API churn.
* rts: remove no_work counterDouglas Wilson2021-01-175-28/+6
| | | | We are no longer busyish waiting, so this is no longer meaningful
* rts: gc: use mutex+condvar instead of sched_yield in gc main loopDouglas Wilson2021-01-173-134/+237
| | | | | | | | | | | | | | | | | | | Here we remove the schedYield loop in scavenge_until_all_done+any_work, replacing it with a single mutex + condition variable. Previously any_work would check todo_large_objects, todo_q, todo_overflow of each gen for work. Comments explained that this was checking global work in any gen. However, these must have been out of date, because all of these locations are local to a gc thread. We've eliminated any_work entirely, instead simply looping back into scavenge_loop, which will quickly return if there is no work. shutdown_gc_threads is called slightly earlier than before. This ensures that n_gc_threads can never be observed to increase from 0 by a worker thread. startup_gc_threads is removed. It consisted of a single variable assignment, which is moved inline to its single callsite.
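The general pattern of replacing a yield loop with one mutex and one condition variable can be sketched as follows. This is a generic termination barrier under stated assumptions, not the code in GC.c: `WorkSync`, `WORK_SYNC_INIT`, and `finish_and_wait` are hypothetical, and the sketch leaves out the part where a waiting thread loops back to look for more work.

```c
#include <pthread.h>

/* Hypothetical shared state; the names are illustrative, not GHC's. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  all_done;
    int             running;   /* threads that may still produce work */
} WorkSync;

#define WORK_SYNC_INIT(n) \
    { PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, (n) }

/* Called by a thread that has run out of work. The last thread to
   finish wakes everyone; the others sleep on the condition variable
   instead of spinning with sched_yield. The while loop re-checks the
   predicate, which guards against spurious wakeups. */
static void finish_and_wait(WorkSync *s)
{
    pthread_mutex_lock(&s->lock);
    s->running--;
    if (s->running == 0) {
        pthread_cond_broadcast(&s->all_done);
    } else {
        while (s->running != 0)
            pthread_cond_wait(&s->all_done, &s->lock);
    }
    pthread_mutex_unlock(&s->lock);
}
```

With a single participant, `finish_and_wait` returns immediately because that thread is also the last one. With several, sleeping threads consume no CPU while they wait, which is the main difference from a sched_yield loop.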
* rts/eventlog: Reset ticky counters after dumping sampleBen Gamari2021-01-171-0/+4
|
* rts/eventlog: Introduce event to demarcate new ticky sampleBen Gamari2021-01-171-0/+7
|
* rts/PEi386: Fix reentrant lock usageBen Gamari2021-01-091-1/+1
| | | | | | | | | | | | Previously lookupSymbol_PEi386 would call lookupSymbol while holding linker_mutex. Fix this by calling `lookupDependentSymbol` instead. This is safe because lookupSymbol_PEi386 unconditionally holds linker_mutex. Happily, this un-breaks `T12771`, `T13082_good`, and `T14611`, which were previously marked as broken due to #18718. Closes #19155.
* rts/Capability: Use relaxed load in findSparkBen Gamari2021-01-091-1/+2
| | | | When checking n_returning_tasks.
* rts: Use SEQ_CST accesses when touching `wakeup`Ben Gamari2021-01-093-4/+4
| | | | | These are the two remaining non-atomic accesses to `wakeup` which were missed by the original TSAN patch.
* rts: Use relaxed load when checking for cap ownershipBen Gamari2021-01-091-1/+4
| | | | This check is merely a service to the user; no reason to synchronize.
* rts: stats: Fix calculation for fragmentationDouglas Wilson2021-01-091-1/+1
|
* rts: stats: Some fixes to stats for sequential gcsDouglas Wilson2021-01-092-14/+37
| | | | | | | | Solves #19147. When n_capabilities > 1 we were not correctly accounting for gc time for sequential collections. In this case par_n_gcthreads == 1, however it is not guaranteed that the single gc thread is capability 0. A similar issue for copied is addressed as well.
* rts/Sanity: Allow DEAD_WEAKs in weak pointer listBen Gamari2021-01-071-1/+1
| | | | | | | The weak pointer check in `checkGenWeakPtrList` previously failed to account for dead weak pointers. This caused `fptr01` to fail in the `sanity` way. Fixes #19162.
* rts/Linker: Add noreturn to loadNativeObj on non-ELF platformsBen Gamari2021-01-071-2/+6
|
* rts: Enforce that mark-region isn't used with -hBen Gamari2021-01-071-0/+10
| | | | | | | As noted in #9666, the mark-region GC is not compatible with heap profiling. Also add documentation for this flag. Closes #9666.
* rts: Zero shrunk array slop in vanilla RTSBen Gamari2021-01-071-4/+9
| | | | | | But only when profiling or DEBUG are enabled. Fixes #17572.
* Storage: Unconditionally enable zeroing of alignment slopBen Gamari2021-01-071-11/+11
| | | | This is necessary since the user may enable `+RTS -hT` at any time.
* rts: Implement heap census support for pinned objectsBen Gamari2021-01-071-29/+21
| | | | | It turns out that this was fairly straightforward to implement since we are now pretty careful about zeroing slop.