path: root/rts
Commit message [branch] (Author, Age, Files, Lines)
* rts/m32: Fix assertion failure [wip/m32-fixes] (Ben Gamari, 2022-04-28, 1 file, -0/+3)
|   This fixes an assertion failure in the m32 allocator due to the imprecisely specified preconditions of `m32_allocator_push_filled_list`. Specifically, the caller must ensure that the page type is set to filled prior to calling `m32_allocator_push_filled_list`.
|
|   While this issue did result in an assertion failure in the debug RTS, the issue is in fact benign.
* Revert "rts: Refactor handling of dead threads' stacks" (Matthew Pickering, 2022-04-28, 5 files, -29/+9)
|   This reverts commit e09afbf2a998beea7783e3de5dce5dd3c6ff23db.
* rts: add some more documentation to StgWeak closure type (Adam Sandberg Ericsson, 2022-04-27, 1 file, -2/+13)
|
* Enable eventlog support in all ways by default (Ben Gamari, 2022-04-27, 4 files, -9/+10)
|   Here we deprecate the eventlogging RTS ways and instead enable eventlog support in the remaining ways. This simplifies packaging and reduces GHC compilation times (as we can eliminate two whole compilations of the RTS) while simplifying the end-user story. The trade-off is a small increase in binary sizes in the case that the user does not want eventlogging support, but we think that this is a fine trade-off.
|
|   This also revealed a latent RTS bug: some files which included `Cmm.h` also assumed that it defined various macros which were in fact defined by `Config.h`, which `Cmm.h` did not include. Fixing this in turn revealed that `StgMiscClosures.cmm` failed to import various spinlock statistics counters, as evidenced by the failed unregisterised build.
|
|   Closes #18948.
* rts/eventlog: Don't attempt to flush if there is no writer (Ben Gamari, 2022-04-27, 1 file, -0/+8)
|   If the user has not configured a writer then there is nothing to flush.
* rts: state explicitly what evacuate and scavenge mean in the copying gc (Adam Sandberg Ericsson, 2022-04-27, 2 files, -1/+9)
|
* Add note about inefficiency in returnMemoryToOS (Fabian Thorand, 2022-04-27, 1 file, -0/+8)
|
* Defer freeing of mega block groups (Fabian Thorand, 2022-04-27, 3 files, -35/+245)
|   Solves the quadratic worst case performance of freeing megablocks that was described in issue #19897.
|
|   During GC runs, we now keep a secondary free list for megablocks that is neither sorted nor coalesced. That way, free becomes an O(1) operation at the expense of not being able to reuse memory for larger allocations. At the end of a GC run, the secondary free list is sorted and then merged into the actual free list in a single pass. That way, our worst case performance is O(n log(n)) rather than O(n^2).
|
|   We postulate that temporarily losing coalescence during a single GC run won't have any adverse effects in practice because:
|
|   - We would need to release enough memory during the GC, and then after that (but within the same GC run) allocate a megablock group of more than one megablock. This seems unlikely, as large objects are not copied during GC, and so we shouldn't need such large allocations during a GC run.
|   - Allocations of megablock groups of more than one megablock are rare. They only happen when a single heap object is large enough to require that amount of space. Any allocation areas that are supposed to hold more than one heap object cannot use megablock groups, because only the first megablock of a megablock group has valid `bdescr`s. Thus, heap objects can only start in the first megablock of a group, not in later ones.
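The deferred-free scheme described above can be sketched as a toy model in Python. This is not the RTS's C code: the class name `MegablockFreeList`, its method names, and the `(start, count)` tuple representation of a megablock group are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical representation: (index of first megablock, number of megablocks).
Group = Tuple[int, int]


class MegablockFreeList:
    def __init__(self) -> None:
        self.free: List[Group] = []      # sorted by start, fully coalesced
        self.deferred: List[Group] = []  # unsorted, uncoalesced; used during GC

    def free_deferred(self, start: int, count: int) -> None:
        """O(1) free used while a GC run is in progress."""
        self.deferred.append((start, count))

    def commit_deferred(self) -> None:
        """At the end of GC: sort the deferred list, then merge in one pass."""
        self.deferred.sort()  # O(n log n)
        merged: List[Group] = []
        i = j = 0
        while i < len(self.free) or j < len(self.deferred):
            # Pick whichever list has the group with the lower start address.
            take_free = j == len(self.deferred) or (
                i < len(self.free) and self.free[i][0] <= self.deferred[j][0]
            )
            if take_free:
                group = self.free[i]
                i += 1
            else:
                group = self.deferred[j]
                j += 1
            # Coalesce with the previous group when the two are adjacent.
            if merged and merged[-1][0] + merged[-1][1] == group[0]:
                merged[-1] = (merged[-1][0], merged[-1][1] + group[1])
            else:
                merged.append(group)
        self.free = merged
        self.deferred = []
```

Freeing into `deferred` never walks the sorted list, which is where the quadratic cost came from; the single sort-and-merge in `commit_deferred` restores the sorted, coalesced invariant.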
* rts: Improve documentation of closure types (Ben Gamari, 2022-04-25, 1 file, -13/+35)
|   Also drops the unused TREC_COMMITTED transaction state.
* rts: Refactor handling of dead threads' stacks (Ben Gamari, 2022-04-25, 5 files, -9/+29)
|   This fixes a bug that @JunmingZhao42 and I noticed while working on her MMTK port. Specifically, in stg_stop_thread we used stg_enter_info as a sentinel at the tail of a stack after a thread has completed. However, stg_enter_info expects to have a two-field payload, which we do not push. Consequently, if the GC somehow ends up scanning the stack it will attempt to interpret data past the end of the stack as the frame's fields, resulting in unsound behavior.
|
|   To fix this I eliminate this hacky use of `stg_stop_thread` and instead introduce a new stack frame type, `stg_dead_thread_info`. Not only does this eliminate the potential for the previously mentioned memory unsoundness but it also more clearly captures the intended structure of the dead threads' stacks.
* Drop libtool path from settings file (Ben Gamari, 2022-04-25, 1 file, -1/+0)
|   GHC no longer uses libtool for linking and therefore this is no longer necessary.
* Ensure that wired-in exception closures aren't GC'd (Ben Gamari, 2022-04-25, 2 files, -0/+20)
|   As described in Note [Wired-in exceptions are not CAFfy], a small set of built-in exception closures get special treatment in the code generator, being declared as non-CAFfy despite potentially containing CAF references.
|
|   The original intent of this treatment was for the RTS to then add StablePtrs for each of the closures, ensuring that they are not GC'd. However, this logic was not applied consistently and was eventually removed entirely in 951c1fb0. This led to #21141.
|
|   Here we fix this bug by reintroducing the StablePtrs and document the status quo.
|
|   Closes #21141.
* rts: Factor out built-in GC roots (Ben Gamari, 2022-04-25, 1 file, -35/+41)
|
* hadrian: Clean up handling of libffi dependencies (Ben Gamari, 2022-04-25, 1 file, -1/+4)
|
* rts: Mark closureFlags array as const (Ben Gamari, 2022-04-22, 2 files, -2/+2)
|
* rts: Introduce ip_STACK_FRAME (Ben Gamari, 2022-04-22, 2 files, -66/+68)
|   While debugging it is very useful to be able to determine whether a given info table is a stack frame or not. We have spare bits in the closure flags array anyway, so use one for this information.
* [ci skip] Drop outdated TODO in RtsAPI.c (Cheng Shao, 2022-04-21, 1 file, -4/+0)
|
* rts: Ensure that the interpreter doesn't disregard tags (Ben Gamari, 2022-04-15, 1 file, -4/+4)
|   Previously the interpreter's handling of `RET_BCO` stack frames would throw away the tag of the returned closure. This resulted in #21390.
* Only enable PROF_SPIN in DEBUG (Dylan Yudaken, 2022-04-15, 1 file, -0/+2)
|
* rts: Fix off-by-one in snwprintf usage [wip/windows-final, wip/windows-clang-join] (Ben Gamari, 2022-04-07, 1 file, -2/+5)
|
* rts: Fallback to ucrtbase not msvcrt (Ben Gamari, 2022-04-07, 1 file, -3/+4)
|   Since we have switched to Clang, the toolchain now links against ucrt rather than msvcrt.
* rts/CloneStack: Ensure that Rts.h is #included first (Ben Gamari, 2022-04-07, 1 file, -2/+2)
|   As is necessary on Windows.
*---.   Merge branches 'wip/windows-high-codegen', 'wip/windows-high-linker', 'wip/windows-clang-2' and 'wip/lint-rts-includes' into wip/windows-clang-join (Ben Gamari, 2022-04-07, 33 files, -802/+1080)
|\ \ \
| | | * rts: Fix various #include issues (Ben Gamari, 2022-04-06, 16 files, -30/+28)
| | | |   This fixes various violations of the newly-added RTS includes linter.
| | | * rts: Move __USE_MINGW_ANSI_STDIO definition to PosixSource.h (Ben Gamari, 2022-04-06, 2 files, -12/+12)
| |_|/
|/| |
| | |   It's easier to ensure that this is included first than Rts.h.
| | * rts: Adjust RTS symbol table on Windows for ucrt (Ben Gamari, 2022-04-07, 1 file, -4/+4)
| | |
| | * rts: Add missing newline in error message (Ben Gamari, 2022-04-07, 1 file, -1/+1)
| | |
| | * rts: Refactor and fix printf attributes on clang (Ben Gamari, 2022-04-07, 3 files, -26/+15)
| | |   Clang on Windows does not understand the `gnu_printf` attribute; use `printf` instead.
| | * linker/PEi386: More descriptive error message (Ben Gamari, 2022-04-06, 1 file, -1/+1)
| | |
| | * rts: Eliminate use of nested functions (GHC GitLab CI, 2022-04-06, 1 file, -9/+11)
| |/
|/|
| |   This is a gcc-specific extension.
| * rts/linker/LoadArchive: Fix leaking file handle [wip/windows-high-linker] (Ben Gamari, 2022-04-06, 1 file, -1/+1)
| |   Previously `isArchive` could leak a `FILE` handle if the `fread` returned a short read.
| * rts/linker: Split up object resolution and initialization (Ben Gamari, 2022-04-06, 2 files, -15/+61)
| |   Previously the RTS linker would call initializers during the "resolve" phase of linking. However, this is problematic in the case of cyclic dependencies between objects. In particular, consider the case where a static library contains a set of recursive objects:
| |
| |   * object A depends upon symbols in object B
| |   * object B has an initializer that depends upon object A
| |   * we try to load object A
| |
| |   The linker would previously:
| |
| |   1. start resolving object A
| |   2. encounter the reference to object B, loading and resolving it
| |   3. run object B's initializer
| |   4. the initializer will attempt to call into object A, which hasn't been fully resolved (and therefore protected)
| |
| |   Fix this by moving constructor execution to a new linking phase, which follows resolution.
| |
| |   Fix #21253.
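The split between resolution and initialization can be illustrated with a small Python model. It is a sketch under stated assumptions, not the RTS linker: `Obj`, `resolve`, `load`, and the `loading` flag used to break dependency cycles are hypothetical names invented for this example.

```python
from typing import Callable, Dict, List, Optional


class Obj:
    """Toy object file: a name, the objects it references, an optional initializer."""

    def __init__(self, name: str, deps: List[str],
                 init: Optional[Callable[[Dict[str, "Obj"]], None]] = None) -> None:
        self.name = name
        self.deps = deps
        self.init = init
        self.loading = False      # resolution has started (breaks cycles)
        self.resolved = False
        self.initialized = False


def resolve(objs: Dict[str, Obj], name: str, pending: List[Obj]) -> None:
    """Phase 1: resolve an object and everything it depends on; run no initializers."""
    obj = objs[name]
    if obj.loading:
        return
    obj.loading = True
    for dep in obj.deps:
        resolve(objs, dep, pending)
    obj.resolved = True
    pending.append(obj)


def load(objs: Dict[str, Obj], name: str) -> None:
    pending: List[Obj] = []
    resolve(objs, name, pending)
    # Phase 2: every object reachable from `name` is resolved by now,
    # so initializers may safely call into any of them.
    for obj in pending:
        if obj.init is not None:
            obj.init(objs)
        obj.initialized = True
```

With objects A and B depending on each other, B's initializer runs only in the second phase, by which point A has been marked resolved; running it inside `resolve` would observe A half-resolved, which is the failure described above.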
| * rts/linker: Report archive member index (Ben Gamari, 2022-04-06, 1 file, -5/+7)
| |
| * rts/PathUtils: Define pathprintf in terms of snwprintf on Windows (Ben Gamari, 2022-04-06, 1 file, -1/+1)
| |   swprintf deviates from usual `snprintf` semantics in that it does not guarantee reasonable behavior when the buffer is NULL (that is, returning the number of bytes that would have been emitted).
| * rts/linker: More descriptive debug output (Ben Gamari, 2022-04-06, 2 files, -12/+21)
| |
| * rts/PEi386: Avoid accidentally-quadratic allocation cost (Ben Gamari, 2022-04-06, 1 file, -19/+45)
| |   We now preserve the address that we last mapped, allowing us to resume our search and avoiding quadratic allocation costs. This fixes the runtime of T10296a, which allocates many adjustors.
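Why remembering the last mapped address removes the quadratic cost can be seen in a toy Python model. Everything here is an assumption for illustration: `ToyAddressSpace`, its unit-sized "pages", and the `probes` counter are not part of the RTS, and unlike a real allocator this sketch never revisits addresses freed behind the cursor.

```python
class ToyAddressSpace:
    def __init__(self, base: int) -> None:
        self.base = base
        self.used = set()         # addresses that are already mapped
        self.last_mapped = base   # where the previous search ended
        self.probes = 0           # occupied addresses examined so far

    def _first_free(self, start: int) -> int:
        addr = start
        while addr in self.used:
            self.probes += 1
            addr += 1
        self.used.add(addr)
        return addr

    def map_from_base(self) -> int:
        """Old behaviour: every request rescans from the base address."""
        return self._first_free(self.base)

    def map_resuming(self) -> int:
        """New behaviour: resume the search from the last mapped address."""
        addr = self._first_free(self.last_mapped)
        self.last_mapped = addr
        return addr
```

For n consecutive mappings, `map_from_base` examines 0 + 1 + ... + (n - 1) occupied addresses in total, while `map_resuming` examines at most one per request.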
| * rts/PEi386: Move allocateBytes to MMap.c (Ben Gamari, 2022-04-06, 3 files, -110/+92)
| |
| * rts/PEi386: Rework linker (Ben Gamari, 2022-04-06, 7 files, -377/+493)
| |   This is a significant rework of the PEi386 linker, making the linker compatible with high image base addresses. Specifically, we now use the m32 allocator instead of `HeapAllocate`.
| |
| |   In addition I found a number of latent bugs in our handling of import libraries and relocations. I've added quite a few comments describing what I've learned about Windows import libraries while fixing these.
| |
| |   Thanks to Tamar Christina (@Phyx) for providing the address space search logic, countless hours of help while debugging, and his boundless Windows knowledge.
| |
| |   Co-Authored-By: Tamar Christina <tamar@zhox.com>
| * rts: Mark anything that might have an info table as data (GHC GitLab CI, 2022-04-06, 1 file, -265/+269)
| |   Tables-next-to-code mandates that we treat symbols with info tables like data since we cannot relocate them using a jump island.
| |
| |   See #20983.
| * rts/PEi386: Fix relocation overflow behavior (Ben Gamari, 2022-04-06, 3 files, -16/+27)
| |   This fixes handling of overflowed relocations on PEi386 targets:
| |
| |   * Refuse to create jump islands for relocations of data symbols
| |   * Correctly handle the `__imp___acrt_iob_func` symbol, which is a new type of symbol: `SYM_TYPE_INDIRECT_DATA`
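The first rule above can be sketched as a decision function in Python. This is a hypothetical model rather than the linker's code: `SymType`, `plan_rel32`, and the 32-bit PC-relative displacement check are assumptions chosen for illustration, and the `SYM_TYPE_INDIRECT_DATA` case is deliberately left out because the log does not describe how it is relocated.

```python
from enum import Enum, auto

INT32_MIN = -(2 ** 31)
INT32_MAX = 2 ** 31 - 1


class SymType(Enum):
    # Hypothetical stand-ins for the linker's per-symbol type field.
    CODE = auto()
    DATA = auto()


def plan_rel32(sym_type: SymType, site: int, target: int) -> str:
    """Decide how to satisfy a 32-bit PC-relative relocation at `site`."""
    displacement = target - (site + 4)
    if INT32_MIN <= displacement <= INT32_MAX:
        return "direct"
    if sym_type is SymType.CODE:
        # A jump island is a nearby stub that jumps on to the far target.
        # That only makes sense when the reference is a call or jump.
        return "jump_island"
    # Redirecting a data reference to a stub would make the program read
    # the stub's bytes instead of the data, so refuse.
    raise OverflowError("relocation of data symbol overflows 32 bits")
```

A far code target yields `"jump_island"`, while a far data target raises instead of silently producing a wrong address.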
| * rts/linker: Preserve information about symbol types (Ben Gamari, 2022-04-06, 10 files, -41/+128)
| |   As noted in #20978, the linker would previously handle overflowed relocations by creating a jump island. While this is fine in the case of code symbols, it's very much not okay in the case of data symbols. To fix this we must keep track of whether each symbol is code or data and relocate them appropriately.
| |
| |   This patch takes the first step in this direction, adding a symbol type field to the linker's symbol table. It doesn't yet change relocation behavior to take advantage of this knowledge.
| |
| |   Fixes #20978.
| * rts/PEi386: Fix memory leak (GHC GitLab CI, 2022-04-06, 1 file, -1/+3)
| |   Previously we would leak the section information of the `.bss` section.
| * rts/PEi386: Move some debugging output to -DL (GHC GitLab CI, 2022-04-06, 1 file, -0/+4)
|/
* adjustors/i386: Use AdjustorPool (Ben Gamari, 2022-04-06, 5 files, -133/+163)
|   In !7511 I introduced a new allocator for adjustors, AdjustorPool, which eliminates the address space fragmentation issues which adjustors can introduce. In that work I focused on amd64 since that was the platform where I observed issues.
|
|   However, in #21132 we noted that the size of adjustors is also a cause of CI fragility on i386. In this MR I port i386 to use AdjustorPool. Sadly the complexity of the i386 adjustor code does require a bit of generalization which makes the code a bit more opaque but such is the world.
|
|   Closes #21132.
* rts/AdjustorPool: Generalize to allow arbitrary contexts (Ben Gamari, 2022-04-06, 4 files, -35/+62)
|   Unfortunately the i386 adjustor logic needs this.
* Build ar archives with -L when "joining" objects (Ben Gamari, 2022-04-06, 1 file, -0/+1)
|   Since there may be .o files which are in fact archives.
* Add a Note describing lack of object merging on Windows (Ben Gamari, 2022-04-06, 1 file, -0/+2)
|   See #21068.
* rts/linker: Catch archives masquerading as object files (Ben Gamari, 2022-04-06, 3 files, -2/+33)
|   Check the file's header to catch static archives bearing the `.o` extension, as may happen on Windows after the Clang refactoring.
|
|   See #21068.
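A header check of this kind can be sketched in a few lines of Python. The helper names `has_archive_magic` and `looks_like_archive` are hypothetical; the only fact relied on is that the common `ar` format starts with the 8-byte global header `!<arch>` followed by a newline. Whether the RTS checks exactly these bytes is an assumption.

```python
AR_MAGIC = b"!<arch>\n"  # global header of the common `ar` archive format


def has_archive_magic(header: bytes) -> bool:
    """True if the given leading bytes begin with the `ar` magic string."""
    return header[:len(AR_MAGIC)] == AR_MAGIC


def looks_like_archive(path: str) -> bool:
    """Classify a file by its contents rather than by its extension."""
    # The `with` block closes the file on every path, including when the
    # file is shorter than the magic string.
    with open(path, "rb") as f:
        return has_archive_magic(f.read(len(AR_MAGIC)))
```

A file named `foo.o` whose first bytes are the archive magic would be routed to the archive loader instead of being parsed as a single object.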
* Fix remaining issues in eventlog types (gen_event_types.py) (Matthew Pickering, 2022-04-01, 1 file, -3/+4)
|   * The size of End concurrent mark phase looks wrong: it used to be 4 and now it's 0.
|   * The size of Task create is wrong: it used to be 18 and is now 14.
|   * The event ticky-ticky entry counter begin sample has the wrong name.
|   * The event ticky-ticky entry counter begin sample has the wrong size: it was 0 and is now 32.
|
|   Closes #21070
* RTS: Zero gc_cpu_start and gc_cpu_end after accounting (Matthew Pickering, 2022-03-29, 1 file, -9/+11)
|   When passed a combination of `-N` and `-qn` options the cpu time for garbage collection was being vastly overcounted because the counters were not being zeroed appropriately.
|
|   When -qn1 is passed, only 1 of the N available GC threads is chosen to perform work; the rest are idle. At the end of the GC period, stat_endGC traverses all the GC threads and adds up the elapsed time from each of them. For threads which didn't participate in this GC, the value of the cpu time should be zero, but before this patch, the counters were not zeroed and hence we would count the same elapsed time on many subsequent iterations (until the thread participated in a GC again).
|
|   The most direct way to zero these fields is to do so immediately after the value is added into the global counter, after which point they are never used again. We also tried another approach where we would zero the counter in yieldCapability but there are some (undiagnosed) situations where a capability would not pass through yieldCapability before the GC ended and the same double counting problem would occur.
|
|   Fixes #21082
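The double counting and its fix can be reproduced with a small Python simulation. The field names `gc_cpu_start` and `gc_cpu_end` come from the commit subject; the `GcThread` class, the two `end_gc_*` functions, and the `simulate` driver (four threads, ten time units of work per participating thread) are assumptions made for illustration.

```python
from typing import Callable, List


class GcThread:
    def __init__(self) -> None:
        self.gc_cpu_start = 0
        self.gc_cpu_end = 0


def end_gc_stale(threads: List[GcThread], total: int) -> int:
    """Old behaviour: sum every thread's interval and leave the counters alone."""
    for t in threads:
        total += t.gc_cpu_end - t.gc_cpu_start
    return total


def end_gc_zeroing(threads: List[GcThread], total: int) -> int:
    """Fixed behaviour: zero each thread's counters right after accounting."""
    for t in threads:
        total += t.gc_cpu_end - t.gc_cpu_start
        t.gc_cpu_start = 0
        t.gc_cpu_end = 0
    return total


def simulate(end_gc: Callable[[List[GcThread], int], int]) -> int:
    threads = [GcThread() for _ in range(4)]
    total = 0
    clock = 0
    # First GC: all four threads work. Next two GCs: only thread 0 (as with -qn1).
    for participants in ([0, 1, 2, 3], [0], [0]):
        for i in participants:
            threads[i].gc_cpu_start = clock
            threads[i].gc_cpu_end = clock + 10
        clock += 10
        total = end_gc(threads, total)
    return total
```

The work actually done is 40 + 10 + 10 = 60 units. `simulate(end_gc_zeroing)` reports 60, whereas `simulate(end_gc_stale)` reports 120 because the three idle threads' intervals from the first GC are added again at each later GC.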