path: root/rts
Commit message [Author, Age, Files, Lines]
...
* rts/adjustor: Drop redundant comments [Ben Gamari, 2021-07-27, files: 5, lines: -15/+0]
|
* rts: Break up adjustor logic [Ben Gamari, 2021-07-27, files: 11, lines: -1302/+1509]
|
* rts: Move libffi interfaces all to Adjustor [Ben Gamari, 2021-07-27, files: 2, lines: -90/+51]
| Previously the libffi Adjustor implementation would use allocateExec to create executable mappings. However, allocateExec is also used elsewhere in GHC to allocate things other than ffi_closure, a use case that libffi does not support.
* rts: Document CPP guards [Ben Gamari, 2021-07-27, files: 1, lines: -10/+10]
|
* RTS: try to fix timer races [Sylvain Henry, 2021-07-26, files: 2, lines: -2/+5]
| * The pthread-based timer was initialized started, while some other parts of the RTS assume it is initialized stopped, e.g. in hs_init_ghc:
|
|     /* Start the "ticker" and profiling timer but don't start until the
|      * scheduler is up.  However, the ticker itself needs to be initialized
|      * before the scheduler to ensure that the ticker mutex is initialized as
|      * moreCapabilities will attempt to acquire it. */
|
| * After a fork, don't start the timer before the IOManager is initialized: the timer handler (handle_tick) might call wakeUpRts to perform an idle GC, which calls wakeupIOManager/ioManagerWakeup.
|
| Found while debugging #18033/#20132, but I couldn't confirm whether it fixes them.
* [rts] Untag bq->bh prior to reading the info table [Moritz Angermann, 2021-07-25, files: 1, lines: -1/+12]
| In `checkBlockingQueues` we must always untag the `bh` field of an `StgBlockingQueue`. While at first glance it might seem a sensible assumption that `bh` will always be a blackhole and therefore never be tagged, the GC could shortcut the indirection and put a tagged pointer into the indirection.
|
| This blew up on aarch64-darwin with a misaligned access. `bh` pointed to an address that always ended in 0xa. On architectures that are a little less strict about alignment, this would have read a garbage info table pointer, which was very, very unlikely to be equal to `stg_BLACKHOLE_info`, and therefore things accidentally worked. However, on AArch64, the read of the info table pointer resulted in a SIGBUS due to the misaligned read.
|
| Fixes #20093.
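The untagging step described above can be sketched in isolation. This is a minimal illustration, not the RTS code: `Closure`, `TAG_MASK`, `untag_closure` and `closure_info` are hypothetical stand-ins, and a 64-bit target with three tag bits is assumed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the RTS definitions; the real names and
 * layouts live in the GHC headers and are not reproduced here. */
#define TAG_MASK ((uintptr_t)7)   /* low 3 bits, assuming a 64-bit target */

typedef struct Closure {
    const void *info;             /* info table pointer, first word */
} Closure;

/* Strip the tag bits so the result is a properly aligned closure pointer. */
Closure *untag_closure(Closure *p)
{
    return (Closure *)((uintptr_t)p & ~TAG_MASK);
}

/* Read the info pointer of a possibly tagged closure pointer.
 * Dereferencing the tagged pointer directly would be a misaligned read. */
const void *closure_info(Closure *maybe_tagged)
{
    Closure *c = untag_closure(maybe_tagged);
    assert(((uintptr_t)c & TAG_MASK) == 0);
    return c->info;
}
```

A pointer tagged with 2 on an 8-byte-aligned closure ends in 0x2 or 0xa, which matches the failing address pattern reported in the commit message.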
* Support unlifted datatypes in GHCi [Luite Stegeman, 2021-07-02, files: 1, lines: -19/+34]
| fixes #19628
* Fix libffi on PowerPC [Peter Trommler, 2021-06-28, files: 1, lines: -13/+3]
| Update submodule libffi-tarballs to upstream commit 4f9e20a.
|
| Remove C compiler flags that suppress warnings in the RTS. Those warnings have been fixed by libffi upstream.
|
| Fixes #19885
* rts: Eliminate redundant branch [GHC GitLab CI, 2021-06-26, files: 1, lines: -3/+1]
| Previously we branched unnecessarily on IF_NONMOVING_WRITE_BARRIER_ENABLED on every trip through the array barrier push loop.
* [aarch64-macho] Fix off-by-one error in the linker [Moritz Angermann, 2021-06-24, files: 1, lines: -1/+11]
| We need to be careful about the sign bit for the BR26 relocation; otherwise we end up encoding a large positive number and reading back a large negative number.
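The sign-bit hazard can be illustrated with a small encoder and decoder for the 26-bit branch immediate. This is a sketch under assumptions, not the linker's code: the function names are hypothetical, and it only models the AArch64 B/BL immediate field (a signed 26-bit word offset in bits [25:0]).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* A B/BL instruction stores a signed 26-bit word offset in bits [25:0],
 * so the reachable byte offsets are [-2^27, 2^27), in multiples of 4. */
bool fits_branch26(int64_t offset)
{
    return (offset & 3) == 0
        && offset >= -((int64_t)1 << 27)
        && offset <   ((int64_t)1 << 27);
}

/* Patch the immediate field of `insn`, keeping its opcode bits. */
uint32_t encode_branch26(uint32_t insn, int64_t offset)
{
    assert(fits_branch26(offset));
    uint32_t imm26 = (uint32_t)((uint64_t)offset >> 2) & 0x03ffffffu;
    return (insn & 0xfc000000u) | imm26;
}

/* Recover the byte offset from an encoded instruction. */
int64_t decode_branch26(uint32_t insn)
{
    uint32_t imm26 = insn & 0x03ffffffu;
    /* Sign-extend from bit 25, then scale back to a byte offset. */
    int64_t words = (int64_t)(imm26 ^ 0x02000000u) - 0x02000000;
    return words * 4;
}
```

The upper bound in `fits_branch26` is exclusive: an offset of exactly 2^27 would set bit 25 of the immediate and decode as -2^27, which is the "large positive in, large negative out" failure the message describes.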
* rts: move xxxHash out of the user namespace [Tamar Christina, 2021-06-24, files: 3, lines: -3/+23]
|
* rts: Document --eventlog-flush-interval in RtsFlags [Matthew Pickering, 2021-06-22, files: 1, lines: -0/+1]
| Fixes #19995
* rts: Pass -Wl,-U,___darwin_check_fd_set_overflow on Darwin [Matthew Pickering, 2021-06-20, files: 1, lines: -0/+42]
| Note [fd_set_overflow]
| ~~~~~~~~~~~~~~~~~~~~~~
| In this note is the very sad tale of __darwin_check_fd_set_overflow.
|
| The 8.10.5 release was broken because it was built in an environment where the libraries were provided by XCode 12.*. These libraries introduced a reference to __darwin_check_fd_set_overflow via the FD_SET macro, which is used in Select.c. Unfortunately, this symbol is not available with XCode 11.*, which led to a linker error when trying to link anything. This is almost certainly a bug in XCode, but we still have to work around it.
|
|     Undefined symbols for architecture x86_64:
|       "___darwin_check_fd_set_overflow", referenced from:
|           _awaitEvent in libHSrts.a(Select.o)
|     ld: symbol(s) not found for architecture x86_64
|
| One way to fix this is to upgrade your version of XCode, but this would force the upgrade on users prematurely. Fortunately it also seems safe to pass the linker option "-Wl,-U,___darwin_check_fd_set_overflow", because the usage of the symbol is guarded by a check that it is defined.
|
|     __header_always_inline int
|     __darwin_check_fd_set(int _a, const void *_b)
|     {
|        if ((uintptr_t)&__darwin_check_fd_set_overflow != (uintptr_t) 0) {
|           return __darwin_check_fd_set_overflow(_a, _b, 1);
|           return __darwin_check_fd_set_overflow(_a, _b, 0);
|        } else {
|           return 1;
|        }
|
| Across the internet there are many other reports of this issue. See: https://github.com/mono/mono/issues/19393 , https://github.com/sitsofe/fio/commit/b6a1e63a1ff607692a3caf3c2db2c3d575ba2320
|
| The issue was originally reported in #19950
|
| Fixes #19950
* Guard Allocate Exec via LIBFFI by LIBFFI [Moritz Angermann, 2021-06-20, files: 1, lines: -1/+1]
| We now have two darwin flavours, AArch64-Darwin and x86_64-darwin. The latter has proper custom adjustor support; the former relies on libffi. Mixing both leads to odd crashes, as the closures might not fit the size of the libffi closures. Hence this needs to be guarded by the USE_LIBFFI_FOR_ADJUSTORS guard.
|
| Original patch by Hamish Mackenzie
* RTS: Fix flag parsing for --eventlog-flush-interval [Matthew Pickering, 2021-06-19, files: 1, lines: -2/+2]
| Fixes #20006
* RTS: fix indentation warning [Sylvain Henry, 2021-06-19, files: 1, lines: -12/+14]
|
* Adds AArch64 Native Code Generator [Moritz Angermann, 2021-06-05, files: 2, lines: -3/+4]
| In which we add a new code generator to the Glasgow Haskell Compiler. This codegen supports ELF and Mach-O targets, thus covering Linux, macOS, and BSDs in principle. It was tested only on macOS and Linux. The NCG follows a similar structure to the other native code generators we already have, and should therefore be relatively easy to follow.
|
| It supports most of the features required for a proper native code generator, but does not claim to be perfect or fully optimised. There are still opportunities for optimisations.
|
| Metric Decrease:
|     ManyAlternatives ManyConstructors MultiLayerModules
|     PmSeriesG PmSeriesS PmSeriesT PmSeriesV
|     T10421 T10421a T10858 T11195 T11276 T11303b T11374 T11822
|     T12227 T12545 T12707 T13035 T13253 T13253-spj T13379 T13701
|     T13719 T14683 T14697 T15164 T15630 T16577 T17096 T17516
|     T17836 T17836b T17977 T17977b T18140 T18282 T18304 T18478
|     T18698a T18698b T18923 T1969 T3064 T5030 T5321FD T5321Fun
|     T5631 T5642 T5837 T783 T9198 T9233 T9630 T9872d T9961 WWRec
|
| Metric Increase:
|     T4801
* Put Unique related global variables in the RTS (#19940) [Sylvain Henry, 2021-06-05, files: 2, lines: -0/+5]
|
* Work around LLVM backend overlapping register limitations [Luite Stegeman, 2021-05-29, files: 2, lines: -84/+18]
| The stg_ctoi_t and stg_ret_t procedures, which convert unboxed tuples between the bytecode and native calling conventions, were causing a panic when using the LLVM backend.
|
| Fixes #19591
* rts: Remove trailing whitespace from Adjustor.c [Matthew Pickering, 2021-05-11, files: 1, lines: -32/+32]
|
* rts: Correctly call pthread_setname_np() on NetBSD [PHO, 2021-05-07, files: 3, lines: -5/+17]
| NetBSD supports pthread_setname_np() but it expects a printf-style format string and a string argument.
|
| Also use pthread for itimer on this platform.
* rts/posix/OSThreads.c: Implement getNumberOfProcessors() for NetBSD [PHO, 2021-05-07, files: 1, lines: -6/+19]
|
* rts/posix/GetTime.c: Use Solaris-specific gethrvtime(3) on OpenSolaris derivatives [PHO, 2021-05-06, files: 1, lines: -0/+10]
| The constant CLOCK_THREAD_CPUTIME_ID is defined in a system header but it isn't actually usable. clock_gettime(2) always returns EINVAL.
* Tighten scope of non-POSIX visibility macros [Viktor Dukhovni, 2021-04-30, files: 3, lines: -9/+34]
| The __BSD_VISIBLE and _DARWIN_C_SOURCE macros expose non-POSIX prototypes in system header files. We should scope these to just the ".c" modules that actually need them, and avoid defining them in header files used in other C modules.
* rts: export allocateWrite, freeWrite and markExec #19763 [Adam Sandberg Ericsson, 2021-04-29, files: 1, lines: -0/+10]
|
* rts/m32: Fix bounds check [Ben Gamari, 2021-04-26, files: 1, lines: -2/+3]
| Previously we would check only that the *start* of the mapping was in the bottom 32 bits of the address space. However, we need the *entire* mapping to be in low memory. Fix this.
|
| Noticed by @Phyx.
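The difference between the two checks can be shown with a pair of predicates. This is an illustrative sketch only; `LOW_MEM_LIMIT` and both function names are hypothetical and are not taken from the m32 allocator.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define LOW_MEM_LIMIT ((uint64_t)1 << 32)  /* first address outside the low 4 GiB */

/* Start-only check: a mapping that begins just below the limit and
 * extends past it is wrongly accepted. */
bool start_in_low_mem(uint64_t start)
{
    return start < LOW_MEM_LIMIT;
}

/* Whole-mapping check: every byte of [start, start + size) must lie
 * below the limit. */
bool mapping_in_low_mem(uint64_t start, size_t size)
{
    assert(size > 0);
    /* Written as a subtraction so that start + size cannot overflow. */
    return start < LOW_MEM_LIMIT && size <= LOW_MEM_LIMIT - start;
}
```

For example, an 8 KiB mapping at 0xfffff000 passes the start-only check but straddles the 4 GiB boundary, so the whole-mapping check rejects it.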
* Block signals in the ticker thread [Viktor Dukhovni, 2021-04-22, files: 1, lines: -1/+21]
| This avoids surprises in the non-threaded runtime with blocked signals killing the process because they're only blocked in the main thread and not in the ticker thread.
* Add background note in elf_tlsgd.c. [Viktor Dukhovni, 2021-04-22, files: 5, lines: -22/+170]
| Also some code cleanup, and a fix for an (extant unrelated) missing <pthread_np.h> include that should hopefully resolve a failure in the FreeBSD CI build, since it is best to make sure that this MR actually builds on FreeBSD systems other than mine.
|
| Some unexpected metric changes on FreeBSD (perhaps because CI had been failing for a while???):
|
| Metric Decrease:
|     T3064 T5321Fun T5642 T9020 T12227 T13253-spj T15164 T18282 WWRec
|
| Metric Increase:
|     haddock.compiler
* Support R_X86_64_TLSGD relocation on FreeBSD [Viktor Dukhovni, 2021-04-22, files: 5, lines: -4/+170]
| The FreeBSD C <ctype.h> header supports per-thread locales by exporting a static inline function that references the `_ThreadRuneLocale` thread-local variable. This means that object files that use e.g. isdigit(3) end up with TLSGD(19) relocations, and would not load into ghci or the language server. Here we add support for this type of relocation, for now just on FreeBSD, and only for external references to thread-specifics defined in already loaded dynamic modules (primarily libc.so). This is sufficient to resolve the <ctype.h> issues.
|
| Runtime linking of ".o" files which *define* new thread-specific variables would be noticeably more difficult, as this would likely require new rtld APIs.
* rts: Fix usage of pthread_setname_np [Ben Gamari, 2021-04-05, files: 1, lines: -2/+6]
| Previously we used this non-portable function unconditionally, breaking FreeBSD.
|
| Fixes #19637.
* Fix copy+pasto in Sanity.c [Matthew Pickering, 2021-04-02, files: 1, lines: -1/+1]
|
* [armv7] arm32 needs symbols! [Moritz Angermann, 2021-03-29, files: 2, lines: -3/+5]
|
* [aarch64-darwin] be very careful of warnings. [Moritz Angermann, 2021-03-29, files: 1, lines: -0/+1]
| So we did *not* have the stgCallocBytes prototype, and subsequently the C compiler defaulted to `int` as the return type, thus generating sxtw instructions for the return value of stgCallocBytes to produce the expected void *.
* [rts] cast return value to struct. [Moritz Angermann, 2021-03-29, files: 1, lines: -1/+1]
|
* [linker] no munmap if either argument is invalid. [Moritz Angermann, 2021-03-29, files: 1, lines: -1/+4]
|
* [linker/aarch64-elf] support section symbols for GOT relocation [Moritz Angermann, 2021-03-29, files: 1, lines: -1/+7]
|
* [linker] align prototype with implementation signature. [Moritz Angermann, 2021-03-29, files: 1, lines: -2/+2]
|
* [linker] SymbolExtras are only used on PPC and X86 [Moritz Angermann, 2021-03-29, files: 2, lines: -5/+4]
|
* [linker] Additional FALLTHROUGH decorations. [Moritz Angermann, 2021-03-29, files: 1, lines: -0/+2]
|
* Allocate Adjustors and mark them readable in two steps [Moritz Angermann, 2021-03-29, files: 7, lines: -6/+46]
| This drops allocateExec for darwin, and replaces it with an alloc, write, mark executable strategy instead. This prevents us from trying to allocate an executable range and then write to it, which W^X will prohibit on darwin.
|
| This will *only* work if we can use mmap.
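The alloc, write, mark executable sequence can be sketched with plain POSIX calls. This is a hedged illustration rather than the RTS implementation: `alloc_write_mark_exec` is a hypothetical name, and it assumes an mmap-capable system where a private anonymous mapping may later be made executable with mprotect.

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS on glibc in strict modes */
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

#ifndef MAP_ANONYMOUS
#define MAP_ANONYMOUS MAP_ANON
#endif

/* Copy `len` bytes of code into a fresh mapping that is never writable
 * and executable at the same time. Returns NULL on failure. */
void *alloc_write_mark_exec(const void *code, size_t len)
{
    assert(code != NULL && len > 0);

    /* Step 1: allocate read/write (not executable) memory. */
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) {
        return NULL;
    }

    /* Step 2: write while the mapping is still writable. */
    memcpy(p, code, len);

    /* Step 3: drop write permission and add execute permission. */
    if (mprotect(p, len, PROT_READ | PROT_EXEC) != 0) {
        munmap(p, len);
        return NULL;
    }
    return p;
}
```

The caller releases the mapping with `munmap(p, len)`. Because the mapping is never both writable and executable, the sequence is compatible with a W^X policy, which is the property the commit relies on.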
* [macho] improved linker with proper plt support [Moritz Angermann, 2021-03-29, files: 8, lines: -97/+311]
| This is a pre-requisite for making aarch64-darwin work.
* rts: Fix joinOSThread on Windows [Ben Gamari, 2021-03-27, files: 1, lines: -1/+6]
| Previously we were treating the thread ID as a HANDLE, but it is not. We must first OpenThread.
* rts: Use long-path-aware stat [Ben Gamari, 2021-03-23, files: 1, lines: -2/+5]
| Previously `pathstat` relied on msvcrt's `stat` implementation, which was not long-path-aware. It should rather be defined in terms of the `stat` implementation provided by `utils/fs`.
|
| Fixes #19541.
* [elf/aarch64] Fall Through decoration [Moritz Angermann, 2021-03-21, files: 1, lines: -4/+4]
|
* Add error information to osCommitMemory on failure. [Moritz Angermann, 2021-03-20, files: 1, lines: -1/+1]
|
* Generate GHCi bytecode from STG instead of Core and support unboxed tuples and sums. [Luite Stegeman, 2021-03-20, files: 5, lines: -23/+466]
| fixes #1257
* Make traceHeapEventInfo an init event [Matthew Pickering, 2021-03-14, files: 1, lines: -6/+18]
| This means it will be reposted every time the eventlog is started.
* Ignore breakpoint for a specified number of iterations. (#19157) [Roland Senn, 2021-03-10, files: 1, lines: -5/+8]
| * Implement new debugger command `:ignore` to set an `ignore count` for a specified breakpoint.
| * Allow new optional parameter on `:continue` command to set an `ignore count` for the current breakpoint.
| * In the Interpreter replace the current `Word8` BreakArray with an `Int` array.
| * Change semantics of values in `BreakArray` to:
|     n < 0  : Breakpoint is disabled.
|     n == 0 : Breakpoint is enabled.
|     n > 0  : Breakpoint is enabled, but ignore next `n` iterations.
| * Rewrite `:enable`/`:disable` processing as a special case of `:ignore`.
| * Remove references to `BreakArray` from `ghc/UI.hs`.
* rts: Gradually return retained memory to the OS [Matthew Pickering, 2021-03-10, files: 4, lines: -18/+104]
| Related to #19381 #19359 #14702
|
| After a spike in memory usage we have been conservative about returning allocated blocks to the OS in case we are still allocating a lot and would end up just reallocating them. The result of this was that up to 4 * live_bytes of blocks would be retained once they were allocated even if memory usage ended up a lot lower.
|
| For a heap of size ~1.5G, this would result in OS memory reporting 6G which is both misleading and worrying for users. In long-lived server applications this results in consistent high memory usage when the live data size is much more reasonable (for example ghcide).
|
| Therefore we have a new (2021) strategy which starts by retaining up to 4 * live_bytes of blocks before gradually returning unneeded memory back to the OS on subsequent major GCs which are NOT caused by a heap overflow.
|
| Each major GC which is NOT caused by heap overflow increases the consec_idle_gcs counter, and the amount of memory which is retained is inversely proportional to this number. By default the excess memory retained is
|
|     oldGenFactor (controlled by -F) / 2 ^ (consec_idle_gcs / returnDecayFactor)
|
| On a major GC caused by a heap overflow, the `consec_idle_gcs` variable is reset to 0 (as we could continue to allocate more, so retaining all the memory might make sense).
|
| Therefore setting bigger values for `-Fd` makes the rate at which memory is returned slower. Smaller values make it get returned faster. Setting `-Fd0` disables the memory return completely, which is the behaviour of older GHC versions.
|
| The default is `-Fd4` which results in the following scaling:
|
|     > mapM print [(x, 1/ (2**(x / 4))) | x <- [1 :: Double ..20]]
|     (1.0,0.8408964152537146)
|     (2.0,0.7071067811865475)
|     (3.0,0.5946035575013605)
|     (4.0,0.5)
|     (5.0,0.4204482076268573)
|     (6.0,0.35355339059327373)
|     (7.0,0.29730177875068026)
|     (8.0,0.25)
|     (9.0,0.21022410381342865)
|     (10.0,0.17677669529663687)
|     (11.0,0.14865088937534013)
|     (12.0,0.125)
|     (13.0,0.10511205190671433)
|     (14.0,8.838834764831843e-2)
|     (15.0,7.432544468767006e-2)
|     (16.0,6.25e-2)
|     (17.0,5.255602595335716e-2)
|     (18.0,4.4194173824159216e-2)
|     (19.0,3.716272234383503e-2)
|     (20.0,3.125e-2)
|
| So after 13 consecutive GCs only 0.1 of the maximum memory used will be retained.
|
| Further to this decay factor, the amount of memory we attempt to retain is also influenced by the GC strategy for the oldest generation. If we are using a copying strategy then we will need at least 2 * live_bytes for copying to take place, so we always keep that much. If using compacting or nonmoving then we need a lower number, so we just retain at least `1.2 * live_bytes` for some protection.
|
| In future we might want to make this behaviour more aggressive. Some relevant literature is
|
| > Ulan Degenbaev, Jochen Eisinger, Manfred Ernst, Ross McIlroy, and Hannes Payer. 2016. Idle time garbage collection scheduling. SIGPLAN Not. 51, 6 (June 2016), 570–583. DOI:https://doi.org/10.1145/2980983.2908106
|
| which describes the "memory reducer" in the V8 javascript engine, which on an idle collection immediately returns as much memory as possible.
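The scaling table for `-Fd4` computes 1 / 2^(n / 4) for n consecutive idle GCs, and the same arithmetic can be reproduced in C. This is a sketch under assumptions: the function names are hypothetical, the power of two is computed by hand only to keep the example free of a libm dependency, and the two floor values are taken from the prose (2 for a copying oldest generation, 1.2 otherwise).

```c
#include <assert.h>
#include <stdbool.h>

/* 2^(-y) for y >= 0: integer part by repeated halving, fractional part
 * by the Taylor series of exp(-f * ln 2) with 0 <= f < 1. */
double exp2_neg(double y)
{
    assert(y >= 0.0);
    double scale = 1.0;
    while (y >= 1.0) {
        scale *= 0.5;
        y -= 1.0;
    }
    const double x = -y * 0.6931471805599453;   /* -f * ln 2 */
    double term = 1.0, sum = 1.0;
    for (int n = 1; n <= 30; n++) {
        term *= x / n;
        sum += term;
    }
    return scale * sum;
}

/* Scaling applied after `consec_idle_gcs` idle major GCs.
 * A decay factor of 0 (as with -Fd0) disables the decay entirely. */
double retained_fraction(unsigned consec_idle_gcs, double decay_factor)
{
    if (decay_factor <= 0.0) {
        return 1.0;
    }
    return exp2_neg((double)consec_idle_gcs / decay_factor);
}

/* Lower bound on the retained memory, as a multiple of live_bytes. */
double min_retained_factor(bool oldest_gen_is_copying)
{
    return oldest_gen_is_copying ? 2.0 : 1.2;
}
```

With a decay factor of 4, `retained_fraction(4, 4.0)` is 0.5 and `retained_fraction(8, 4.0)` is 0.25, matching rows 4 and 8 of the table; a larger decay factor makes the fraction fall more slowly.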
* eventlog: Repost initialisation events when eventlog restarts [Matthew Pickering, 2021-03-08, files: 5, lines: -9/+93]
| If startEventlog is called after the program has already started running then quite a few useful events are missing from the eventlog because they are only posted when the program starts.
|
| This patch adds a mechanism to declare that an event should be reposted every time the startEventlog function is called.
|
| Now in EventLog.c there is a global list of functions called `eventlog_header_funcs` which stores the functions that should be called every time the eventlog starts.
|
| When calling `postInitEvent`, the event will not only be immediately posted to the eventlog but also added to the global list.
|
| When startEventLog is called, the list is traversed and the events reposted.
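The register-and-replay mechanism described above can be sketched with a singly linked list of function pointers. This is a simplified model, not the code in EventLog.c: all names below are hypothetical, there is no locking, and `post_demo_event` only stands in for a real event-posting function.

```c
#include <assert.h>
#include <stdlib.h>

typedef void (*InitEventFn)(void);

typedef struct HeaderFunc {
    InitEventFn fn;
    struct HeaderFunc *next;
} HeaderFunc;

static HeaderFunc *header_funcs = NULL;           /* registered events, oldest first */
static HeaderFunc **header_tail = &header_funcs;  /* where the next node is linked */

/* Post an init event now and remember it for later restarts.
 * Returns 0 on success, -1 if the registration could not be allocated. */
int post_init_event(InitEventFn fn)
{
    assert(fn != NULL);
    HeaderFunc *node = malloc(sizeof *node);
    if (node == NULL) {
        return -1;
    }
    node->fn = fn;
    node->next = NULL;
    *header_tail = node;
    header_tail = &node->next;
    fn();                      /* post immediately */
    return 0;
}

/* Called when the eventlog (re)starts: repost every registered event
 * in registration order. */
void repost_init_events(void)
{
    for (HeaderFunc *h = header_funcs; h != NULL; h = h->next) {
        h->fn();
    }
}

/* Example init event used for demonstration: counts how often it ran. */
int demo_events_posted = 0;
void post_demo_event(void)
{
    demo_events_posted++;
}
```

Calling `post_init_event(post_demo_event)` runs the event once, and every later `repost_init_events()` runs it again, so a log started late still receives the initialisation events.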