summaryrefslogtreecommitdiff
path: root/rts/sm/NonMovingSweep.c
Commit message (Collapse)AuthorAgeFilesLines
* nonmoving: Explicitly memoize block countBen Gamari2020-04-301-3/+2
| | | | | | A profile cast doubt on whether the compiler hoisted the bound out the loop as I would have expected here. It turns out it did but nevertheless it seems clearer to just do this manually.
* nonmoving: Remove redundant bitmap clearingBen Gamari2020-03-141-2/+4
| | | | | nonmovingSweep already clears the bitmap in the sweep loop. There is no reason to do so a second time.
* rts/NonMovingSweep: Fix locking of new mutable list allocationBen Gamari2019-12-051-1/+1
| | | | | | | | | | | | | Previously we used allocBlockOnNode_sync in nonmovingSweepMutLists despite the fact that we aren't in the GC and therefore the allocation spinlock isn't in use. This meant that sweep would end up spinning until the next minor GC, when the SM lock was moved away from the SM_MUTEX to the spinlock. This isn't a correctness issue but it sure isn't good for performance. Found thanks for Ward. Fixes #17539.
* nonmoving: Clear segment bitmaps during sweepBen Gamari2019-12-051-0/+1
| | | | | | | | | | Previously we would clear the bitmaps of segments which we are going to sweep during the preparatory pause. However, this is unnecessary: the existence of the mark epoch ensures that the sweep will correctly identify non-reachable objects, even if we do not clear the bitmap. We now defer clearing the bitmap to sweep, which happens concurrently with mutation.
* rts/NonMoving: Fix various Windows build issuesBen Gamari2019-11-071-0/+2
| | | | | The Windows build seems to be stricter about not providing threading primitives in the non-threaded RTS.
* nonmoving: Upper-bound time we hold SM_MUTEX for during sweepwip/gc/opt-pauseBen Gamari2019-10-221-1/+25
|
* rts: COMPACT_NFDATA support for the nonmoving collectorÖmer Sinan Ağacan2019-10-221-0/+16
| | | | | | This largely follows the model used for large objects, with appropriate adjustments made to account for references in the sharing deduplication hashtable.
* NonMoving: Move next_free_snap to block descriptorwip/gc/segment-header-to-bdescrBen Gamari2019-10-221-2/+2
|
* NonMoving: Clean mut_listBen Gamari2019-10-221-1/+121
|
* NonMoving: Fuse sweep preparation into mark prepBen Gamari2019-10-221-32/+0
|
* Nonmoving: Allow aging and refactor static objects logicBen Gamari2019-10-221-2/+3
| | | | | | | This commit does two things: * Allow aging of objects during the preparatory minor GC * Refactor handling of static objects to avoid the use of a hashtable
* rts: Non-concurrent mark and sweepÖmer Sinan Ağacan2019-10-201-0/+273
This implements the core heap structure and a serial mark/sweep collector which can be used to manage the oldest-generation heap. This is the first step towards a concurrent mark-and-sweep collector aimed at low-latency applications. The full design of the collector implemented here is described in detail in a technical note B. Gamari. "A Concurrent Garbage Collector For the Glasgow Haskell Compiler" (2018) The basic heap structure used in this design is heavily inspired by K. Ueno & A. Ohori. "A fully concurrent garbage collector for functional programs on multicore processors." /ACM SIGPLAN Notices/ Vol. 51. No. 9 (presented by ICFP 2016) This design is intended to allow both marking and sweeping concurrent to execution of a multi-core mutator. Unlike the Ueno design, which requires no global synchronization pauses, the collector introduced here requires a stop-the-world pause at the beginning and end of the mark phase. To avoid heap fragmentation, the allocator consists of a number of fixed-size /sub-allocators/. Each of these sub-allocators allocators into its own set of /segments/, themselves allocated from the block allocator. Each segment is broken into a set of fixed-size allocation blocks (which back allocations) in addition to a bitmap (used to track the liveness of blocks) and some additional metadata (used also used to track liveness). This heap structure enables collection via mark-and-sweep, which can be performed concurrently via a snapshot-at-the-beginning scheme (although concurrent collection is not implemented in this patch). The mark queue is a fairly straightforward chunked-array structure. The representation is a bit more verbose than a typical mark queue to accomodate a combination of two features: * a mark FIFO, which improves the locality of marking, reducing one of the major overheads seen in mark/sweep allocators (see [1] for details) * the selector optimization and indirection shortcutting, which requires that we track where we found each reference to an object in case we need to update the reference at a later point (e.g. when we find that it is an indirection). See Note [Origin references in the nonmoving collector] (in `NonMovingMark.h`) for details. Beyond this the mark/sweep is fairly run-of-the-mill. [1] R. Garner, S.M. Blackburn, D. Frampton. "Effective Prefetch for Mark-Sweep Garbage Collection." ISMM 2007. Co-Authored-By: Ben Gamari <ben@well-typed.com>