* Previously we would first move the new objects to their appropriate
  non-moving GC list, then do another pass over that list to clear their
  mark bits. This is needlessly expensive. Instead, first clear the mark
  bits of the existing objects, then add the newly evacuated objects
  and, at the same time, clear their mark bits. This cuts the
  preparatory GC time in half for the Pusher benchmark with a large
  queue size.
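  A minimal C sketch of the idea, using hypothetical segment and bitmap
  types rather than the actual RTS structures: clearing the existing
  list's mark bits up front lets each newly evacuated segment be linked
  in and cleared in a single traversal.
  ```
  #include <string.h>

  #define BITMAP_BYTES 64              /* hypothetical bitmap size */

  struct Segment {
      struct Segment *link;
      unsigned char mark_bits[BITMAP_BYTES];
  };

  /* One pass over each list: clear the existing segments, then clear
   * each new segment's bits as it is prepended, instead of appending
   * first and clearing in a second pass. */
  void clear_then_add(struct Segment **list, struct Segment *new_segs)
  {
      for (struct Segment *s = *list; s != NULL; s = s->link)
          memset(s->mark_bits, 0, BITMAP_BYTES);

      while (new_segs != NULL) {
          struct Segment *next = new_segs->link;
          memset(new_segs->mark_bits, 0, BITMAP_BYTES);
          new_segs->link = *list;      /* prepend to the non-moving list */
          *list = new_segs;
          new_segs = next;
      }
  }
  ```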
* The expectation here is that the nonmoving GC is latency-centric,
  whereas the moving GC emphasizes throughput. Therefore we give the
  latter the benefit of better static branch prediction.
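  As a hedged illustration (names are invented, not the RTS source), the
  hot allocation check can be annotated so the compiler lays out the
  moving-GC path as the predicted fall-through:
  ```
  /* GCC/Clang builtin; the RTS would wrap this in its own macro. */
  #define UNLIKELY(x) __builtin_expect(!!(x), 0)

  extern int use_nonmoving_gc;              /* hypothetical flag */
  extern void *nonmoving_alloc(unsigned);   /* hypothetical slow path */
  extern void *moving_alloc(unsigned);      /* hypothetical fast path */

  void *allocate(unsigned size)
  {
      if (UNLIKELY(use_nonmoving_gc))
          return nonmoving_alloc(size);  /* latency-centric collector */
      return moving_alloc(size);         /* throughput path stays hot */
  }
  ```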
* This largely follows the model used for large objects, with
  appropriate adjustments made to account for references in the sharing
  deduplication hashtable.
* wip/gc/everything2
* This will allow us to easily move the block size elsewhere.
* This allows indirection chains residing in the non-moving heap to be
  shorted out.
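  A sketch with stand-in types (the real code operates on the RTS's
  indirection closures): when marking encounters an indirection it can
  follow the chain to its final referent and update references to point
  there directly.
  ```
  enum ClosureType { IND, OTHER };   /* stand-in closure types */

  struct Closure {
      enum ClosureType type;
      struct Closure *indirectee;    /* valid when type == IND */
  };

  /* Follow an indirection chain to its end; redirecting marks and
   * references to the result shorts the chain out. */
  struct Closure *follow_indirections(struct Closure *p)
  {
      while (p->type == IND)
          p = p->indirectee;
      return p;
  }
  ```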
* This is consistent with the other unoptimized ways.
* The nonmoving way finalizes things in a different order.
* The nonmoving collector doesn't support -G1.
* The debugged RTS initializes the heap with 0xaa, which breaks the
  (admittedly rather fragile) assumption that uninitialized fields are
  set to 0x00:
  ```
  Wrong exit code for heap_all(nonmoving)(expected 0 , actual 1 )
  Stderr ( heap_all ):
  heap_all: user error (assertClosuresEq: Closures do not match
  Expected: FunClosure {info = StgInfoTable {entry = Nothing, ptrs = 0, nptrs = 1, tipe = FUN_0_1, srtlen = 0, code = Nothing}, ptrArgs = [], dataArgs = [0]}
  Actual: FunClosure {info = StgInfoTable {entry = Nothing, ptrs = 0, nptrs = 1, tipe = FUN_0_1, srtlen = 1032832, code = Nothing}, ptrArgs = [], dataArgs = [12297829382473034410]}
  CallStack (from HasCallStack):
  assertClosuresEq, called at heap_all.hs:230:9 in main:Main
  )
  ```
* The nonmoving GC doesn't support `+RTS -G1`, which this test insists on.
* This uses the nonmoving collector when compiling the test cases.
* Flush the update remembered set. The goal here is to flush
  periodically, ensuring that we don't end up with a thread that marks
  its stack onto its local update remembered set and doesn't flush it
  until the nonmoving sync period, as this would result in a large
  fraction of the heap being marked during the sync pause.
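  A sketch of the policy with hypothetical names and threshold: each
  capability hands its local remembered-set entries to the global mark
  queue once they grow past a bound, rather than holding everything
  until the sync.
  ```
  #define FLUSH_THRESHOLD 1024      /* hypothetical entry bound */

  struct UpdRemSet  { unsigned n_entries; /* ... chunk list ... */ };
  struct Capability { struct UpdRemSet upd_rem_set; };

  /* Hands accumulated chunks to the global mark queue (stub). */
  extern void flush_upd_rem_set(struct Capability *cap);

  /* Called at convenient points (e.g. yield/GC boundaries) so no
   * thread accumulates a huge local set that would instead have to be
   * marked during the sync pause. */
  void maybe_flush(struct Capability *cap)
  {
      if (cap->upd_rem_set.n_entries >= FLUSH_THRESHOLD)
          flush_upd_rem_set(cap);
  }
  ```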
* Previously we would perform a preparatory moving collection, resulting
  in many things being added to the mark queue. When we finished with
  this we would realize in nonmovingCollect that there was already a
  collection running, in which case we would simply not run the
  nonmoving collector. However, it was very easy to end up in a
  "treadmilling" situation: all subsequent GCs following the first
  failed major GC would be scheduled as major GCs. Consequently we would
  continuously feed the concurrent collector with more mark queue
  entries and it would never finish. This patch aborts the major
  collection far earlier, meaning that we avoid adding nonmoving objects
  to the mark queue and allow the concurrent collector to finish.
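  Schematically (flag and function names invented), the fix hoists the
  check for a running concurrent collection to before the preparatory
  collection feeds the mark queue:
  ```
  #include <stdbool.h>

  extern bool concurrent_collection_running;  /* hypothetical flag */

  /* Decide up front, instead of discovering in nonmovingCollect --
   * after the mark queue has already been fed -- that the concurrent
   * collector is still busy. */
  bool should_run_major_gc(bool requested_major)
  {
      if (requested_major && concurrent_collection_running)
          return false;   /* abort: run a minor collection instead */
      return requested_major;
  }
  ```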
* Previously we would look at the segment header to determine the block
  size despite the fact that we already had the block size at hand.
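  The shape of the change, sketched with hypothetical structures: the
  caller already holds the block size, so it is passed in rather than
  re-read from the segment header.
  ```
  #include <stddef.h>

  struct Segment {
      unsigned block_size;   /* what was previously re-read per call */
      char data[];
  };

  /* Before: unsigned sz = seg->block_size;  (an extra memory access)
   * After: the known size arrives as a parameter, likely in a register. */
  static inline void *block_at(struct Segment *seg, size_t idx,
                               unsigned blk_sz)
  {
      return seg->data + idx * (size_t)blk_sz;
  }
  ```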
* This improved overall runtime on nofib's constraints test by nearly 10%.
* Ensure that the bitmap of the segment that we will clear next is in
  cache by the time we reach it.
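  For illustration (field names invented), the clearing loop can issue a
  prefetch for the next segment's bitmap before working on the current
  one, via the GCC/Clang __builtin_prefetch intrinsic:
  ```
  struct Segment {
      struct Segment *link;
      unsigned char bitmap[64];    /* hypothetical bitmap size */
  };

  extern void clear_bitmap(struct Segment *seg);

  void clear_all_bitmaps(struct Segment *segs)
  {
      for (struct Segment *s = segs; s != NULL; s = s->link) {
          if (s->link != NULL)
              __builtin_prefetch(s->link->bitmap, 1 /* write */, 3);
          clear_bitmap(s);   /* next bitmap is in cache by now */
      }
  }
  ```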
* Use memchr instead of an open-coded loop. This is nearly twice as
  fast in a synthetic benchmark.
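  The pattern, sketched over an illustrative byte-per-block bitmap:
  libc's memchr scans a word at a time (often with SIMD), so it handily
  beats a naive byte loop.
  ```
  #include <string.h>
  #include <stddef.h>

  /* Before: for (i = 0; i < len; i++) if (bitmap[i] == 0) return i;
   * After: let memchr do the scan. Returns len if nothing is found. */
  size_t first_free_block(const unsigned char *bitmap, size_t len)
  {
      const unsigned char *hit = memchr(bitmap, 0, len);
      return hit ? (size_t)(hit - bitmap) : len;
  }
  ```
  (Whether the scan looks for a set or a clear byte depends on the call
  site; the commit message doesn't say, so the free-block search above
  is an assumption.)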
* This shortens MarkQueueEntry by 30% (one word).
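  One way to recover a word, shown as an assumption rather than the
  commit's actual scheme: closures are word-aligned, so an entry's type
  tag can live in the low bits of the pointer field instead of in a
  separate word.
  ```
  #include <stdint.h>

  typedef struct {
      uintptr_t tagged_ptr;   /* low 2 bits: entry kind; rest: pointer */
      uintptr_t payload;      /* e.g. an array/field index */
  } MarkQueueEnt;             /* illustrative layout */

  enum { ENT_CLOSURE = 0, ENT_ARRAY = 1 };   /* illustrative kinds */

  static inline unsigned ent_kind(MarkQueueEnt e)
  {
      return (unsigned)(e.tagged_ptr & 3u);
  }

  static inline void *ent_ptr(MarkQueueEnt e)
  {
      return (void *)(e.tagged_ptr & ~(uintptr_t)3);
  }
  ```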
* Perf showed that this single div was capturing up to 10% of samples
  in nonmovingMark. However, the overwhelming majority of cases involve
  small block sizes, which we can easily enumerate explicitly, allowing
  the compiler to turn the division into a significantly more efficient
  division by a constant. While the increase in source code looks scary,
  this all optimises down to very nice-looking assembler. At this point
  the only remaining hotspots in nonmovingBlockCount are due to memory
  access.
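  The shape of the specialization (sizes illustrative): dispatching on
  the common small block sizes turns each division into a division by a
  compile-time constant, which compilers lower to multiply-and-shift
  sequences instead of a div instruction.
  ```
  unsigned block_count(unsigned segment_bytes, unsigned blk_sz)
  {
      switch (blk_sz) {
      case 8:   return segment_bytes / 8;      /* a plain shift */
      case 16:  return segment_bytes / 16;
      case 32:  return segment_bytes / 32;
      case 24:  return segment_bytes / 24;     /* multiply-and-shift */
      case 48:  return segment_bytes / 48;
      default:  return segment_bytes / blk_sz; /* rare: a real div */
      }
  }
  ```
  The switch looks redundant, but each case gives the optimizer a
  literal divisor to specialize on, while the default case keeps the
  code correct for uncommon sizes.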
* This commit does two things:
  * Allow aging of objects during the preparatory minor GC
  * Refactor handling of static objects to avoid the use of a hashtable
* Otherwise the census is unsafe when mutators are running due to
  concurrent mutation.
* This introduces a simple census of the non-moving heap (not to be
  confused with the heap census used by the heap profiler). This
  collects basic heap usage information (number of allocated and free
  blocks) which is useful when characterising fragmentation of the
  nonmoving heap.
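  A sketch of such a census over hypothetical structures: walk each
  segment's byte-per-block bitmap and tally allocated versus free
  blocks.
  ```
  struct Segment {
      struct Segment *link;
      unsigned n_blocks;
      unsigned char bitmap[];   /* one byte per block; non-zero = live */
  };

  struct Census { unsigned long n_allocated, n_free; };

  void census_segments(const struct Segment *segs, struct Census *c)
  {
      for (const struct Segment *s = segs; s != NULL; s = s->link)
          for (unsigned i = 0; i < s->n_blocks; i++) {
              if (s->bitmap[i]) c->n_allocated++;
              else              c->n_free++;
          }
  }
  ```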
* This introduces a few events to mark key points in the nonmoving
  garbage collection cycle. These include:
  * `EVENT_CONC_MARK_BEGIN`, denoting the beginning of a round of
    marking. This may happen more than once in a single major collection
    since the major collector iterates until it hits a fixed point.
  * `EVENT_CONC_MARK_END`, denoting the end of a round of marking.
  * `EVENT_CONC_SYNC_BEGIN`, denoting the beginning of the post-mark
    synchronization phase.
  * `EVENT_CONC_UPD_REM_SET_FLUSH`, indicating that a capability has
    flushed its update remembered set.
  * `EVENT_CONC_SYNC_END`, denoting that all mutators have flushed
    their update remembered sets.
  * `EVENT_CONC_SWEEP_BEGIN`, denoting the beginning of the sweep
    portion of the major collection.
  * `EVENT_CONC_SWEEP_END`, denoting the end of the sweep portion of
    the major collection.