summaryrefslogtreecommitdiff
path: root/testsuite/tests/rts
Commit message (Collapse)AuthorAgeFilesLines
* Null eventlog writerOleg Grenrus2021-10-152-0/+11
|
* testsuite: remove 'req_smp' from testwsdequeZubin Duggal2021-10-131-1/+0
|
* testsuite: Clean up dynlib support predicatesBen Gamari2021-10-121-2/+5
| | | | | | | | | | | | | | | | | | Previously it was unclear whether req_shared_libs should require: * that the platform supports dynamic library loading, * that GHC supports dynamic linking of Haskell code, or * that the dyn way libraries were built Clarify by splitting the predicate into two: * `req_dynamic_lib_support` demands that the platform support dynamic linking * `req_dynamic_hs` demands that the GHC support dynamic linking of Haskell code on the target platform Naturally `req_dynamic_hs` cannot be true unless `req_dynamic_lib_support` is also true.
* Normalise output of T20199 testMatthew Pickering2021-10-081-1/+2
|
* Nested CPR light unleashed (#18174)Sebastian Graf2021-09-301-14/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch enables worker/wrapper for nested constructed products, as described in `Note [Nested CPR]`. The machinery for expressing Nested CPR was already there, since !5054. Worker/wrapper is equipped to exploit Nested CPR annotations since !5338. CPR analysis already handles applications in batches since !5753. This patch just needs to flip a few more switches: 1. In `cprTransformDataConWork`, we need to look at the field expressions and their `CprType`s to see whether the evaluation of the expressions terminates quickly (= is in HNF) or if they are put in strict fields. If that is the case, then we retain their CPR info and may unbox nestedly later on. More details in `Note [Nested CPR]`. 2. Enable nested `ConCPR` signatures in `GHC.Types.Cpr`. 3. In the `asConCpr` call in `GHC.Core.Opt.WorkWrap.Utils`, pass CPR info of fields to the `Unbox`. 4. Instead of giving CPR signatures to DataCon workers and wrappers, we now have `cprTransformDataConWork` for workers and treat wrappers by analysing their unfolding. As a result, the code from GHC.Types.Id.Make went away completely. 5. I deactivated worker/wrappering for recursive DataCons and wrote a function `isRecDataCon` to detect them. We really don't want to give `repeat` or `replicate` the Nested CPR property. See Note [CPR for recursive data structures] for which kind of recursive DataCons we target. 6. Fix a couple of tests and their outputs. I also documented that CPR can destroy sharing and lead to asymptotic increase in allocations (which is tracked by #13331/#19326) in `Note [CPR for data structures can destroy sharing]`. Nofib results: ``` -------------------------------------------------------------------------------- Program Allocs Instrs -------------------------------------------------------------------------------- ben-raytrace -3.1% -0.4% binary-trees +0.8% -2.9% digits-of-e2 +5.8% +1.2% event +0.8% -2.1% fannkuch-redux +0.0% -1.4% fish 0.0% -1.5% gamteb -1.4% -0.3% mkhprog +1.4% +0.8% multiplier +0.0% -1.9% pic -0.6% -0.1% reptile -20.9% -17.8% wave4main +4.8% +0.4% x2n1 -100.0% -7.6% -------------------------------------------------------------------------------- Min -95.0% -17.8% Max +5.8% +1.2% Geometric Mean -2.9% -0.4% ``` The huge wins in x2n1 (loopy list) and reptile (see #19970) are due to refraining from unboxing (:). Other benchmarks like digits-of-e2 or wave4main regress because of that. Ultimately there are no great improvements due to Nested CPR alone, but at least it's a win. Binary sizes decrease by 0.6%. There are a significant number of metric decreases. The most notable ones (>1%): ``` ManyAlternatives(normal) ghc/alloc 771656002.7 762187472.0 -1.2% ManyConstructors(normal) ghc/alloc 4191073418.7 4114369216.0 -1.8% MultiLayerModules(normal) ghc/alloc 3095678333.3 3128720704.0 +1.1% PmSeriesG(normal) ghc/alloc 50096429.3 51495664.0 +2.8% PmSeriesS(normal) ghc/alloc 63512989.3 64681600.0 +1.8% PmSeriesV(normal) ghc/alloc 62575424.0 63767208.0 +1.9% T10547(normal) ghc/alloc 29347469.3 29944240.0 +2.0% T11303b(normal) ghc/alloc 46018752.0 47367576.0 +2.9% T12150(optasm) ghc/alloc 81660890.7 82547696.0 +1.1% T12234(optasm) ghc/alloc 59451253.3 60357952.0 +1.5% T12545(normal) ghc/alloc 1705216250.7 1751278952.0 +2.7% T12707(normal) ghc/alloc 981000472.0 968489800.0 -1.3% GOOD T13056(optasm) ghc/alloc 389322664.0 372495160.0 -4.3% GOOD T13253(normal) ghc/alloc 337174229.3 341954576.0 +1.4% T13701(normal) ghc/alloc 2381455173.3 2439790328.0 +2.4% BAD T14052(ghci) ghc/alloc 2162530642.7 2139108784.0 -1.1% T14683(normal) ghc/alloc 3049744728.0 2977535064.0 -2.4% GOOD T14697(normal) ghc/alloc 362980213.3 369304512.0 +1.7% T15164(normal) ghc/alloc 1323102752.0 1307480600.0 -1.2% T15304(normal) ghc/alloc 1304607429.3 1291024568.0 -1.0% T16190(normal) ghc/alloc 281450410.7 284878048.0 +1.2% T16577(normal) ghc/alloc 7984960789.3 7811668768.0 -2.2% GOOD T17516(normal) ghc/alloc 1171051192.0 1153649664.0 -1.5% T17836(normal) ghc/alloc 1115569746.7 1098197592.0 -1.6% T17836b(normal) ghc/alloc 54322597.3 55518216.0 +2.2% T17977(normal) ghc/alloc 47071754.7 48403408.0 +2.8% T17977b(normal) ghc/alloc 42579133.3 43977392.0 +3.3% T18923(normal) ghc/alloc 71764237.3 72566240.0 +1.1% T1969(normal) ghc/alloc 784821002.7 773971776.0 -1.4% GOOD T3294(normal) ghc/alloc 1634913973.3 1614323584.0 -1.3% GOOD T4801(normal) ghc/alloc 295619648.0 292776440.0 -1.0% T5321FD(normal) ghc/alloc 278827858.7 276067280.0 -1.0% T5631(normal) ghc/alloc 586618202.7 577579960.0 -1.5% T5642(normal) ghc/alloc 494923048.0 487927208.0 -1.4% T5837(normal) ghc/alloc 37758061.3 39261608.0 +4.0% T9020(optasm) ghc/alloc 257362077.3 254672416.0 -1.0% T9198(normal) ghc/alloc 49313365.3 50603936.0 +2.6% BAD T9233(normal) ghc/alloc 704944258.7 685692712.0 -2.7% GOOD T9630(normal) ghc/alloc 1476621560.0 1455192784.0 -1.5% T9675(optasm) ghc/alloc 443183173.3 433859696.0 -2.1% GOOD T9872a(normal) ghc/alloc 1720926653.3 1693190072.0 -1.6% GOOD T9872b(normal) ghc/alloc 2185618061.3 2162277568.0 -1.1% GOOD T9872c(normal) ghc/alloc 1765842405.3 1733618088.0 -1.8% GOOD TcPlugin_RewritePerf(normal) ghc/alloc 2388882730.7 2365504696.0 -1.0% WWRec(normal) ghc/alloc 607073186.7 597512216.0 -1.6% T9203(normal) run/alloc 107284064.0 102881832.0 -4.1% haddock.Cabal(normal) run/alloc 24025329589.3 23768382560.0 -1.1% haddock.base(normal) run/alloc 25660521653.3 25370321824.0 -1.1% haddock.compiler(normal) run/alloc 74064171706.7 73358712280.0 -1.0% ``` The biggest exception to the rule is T13701 which seems to fluctuate as usual (not unlike T12545). T14697 has a similar quality, being a generated multi-module test. T5837 is small enough that it similarly doesn't measure anything significant besides module loading overhead. T13253 simply does one additional round of Simplification due to Nested CPR. There are also some apparent regressions in T9198, T12234 and PmSeriesG that we (@mpickering and I) were simply unable to reproduce locally. @mpickering tried to run the CI script in a local Docker container and actually found that T9198 and PmSeriesG *improved*. In MRs that were rebased on top this one, like !4229, I did not experience such increases. Let's not get hung up on these regression tests, they were meant to test for asymptotic regressions. The build-cabal test improves by 1.2% in -O0. Metric Increase: T10421 T12234 T12545 T13035 T13056 T13701 T14697 T18923 T5837 T9198 Metric Decrease: ManyConstructors T12545 T12707 T13056 T14683 T16577 T18223 T1969 T3294 T9203 T9233 T9675 T9872a T9872b T9872c T9961 TcPlugin_RewritePerf
* testsuite: Ensure that C++11 is used in T20199Ben Gamari2021-09-231-1/+1
| | | | Otherwise we are dependent upon the C++ compiler's default language.
* testsuite: Don't use cc directly in section_alignment testBen Gamari2021-09-231-1/+1
|
* testsuite: Make unsigned_reloc_macho_x64 and section_alignment makefile_testsBen Gamari2021-09-231-2/+2
|
* testsuite: Fix ipeMapBen Gamari2021-09-231-0/+1
| | | | ipeMap.c failed to #include <string.h>
* Use Info Table Provenances to decode cloned stack (#18163)Sven Tennie2021-09-239-31/+163
| | | | | | | | | | | | | | | | Emit an Info Table Provenance Entry (IPE) for every stack represeted info table if -finfo-table-map is turned on. To decode a cloned stack, lookupIPE() is used. It provides a mapping between info tables and their source location. Please see these notes for details: - [Stacktraces from Info Table Provenance Entries (IPE based stack unwinding)] - [Mapping Info Tables to Source Positions] Metric Increase: T12545
* Introduce stack snapshotting / cloning (#18741)Sven Tennie2021-09-236-0/+267
| | | | | | | | | | | | | | Add `StackSnapshot#` primitive type that represents a cloned stack (StgStack). The cloning interface consists of two functions, that clone either the treads own stack (cloneMyStack) or another threads stack (cloneThreadStack). The stack snapshot is offline/cold, i.e. it isn't evaluated any further. This is useful for analyses as it prevents concurrent modifications. For technical details, please see Note [Stack Cloning]. Co-authored-by: Ben Gamari <bgamari.foss@gmail.com> Co-authored-by: Matthew Pickering <matthewtpickering@gmail.com>
* Testsuite: Mark T12903 as fragile on i386Matthew Pickering2021-09-171-0/+1
| | | | Closes #20377
* Optimize Info Table Provenance Entries (IPEs) Map creation and lookupSven Tennie2021-08-118-0/+782
| | | | | | | | | | | | | | | Using a hash map reduces the complexity of lookupIPE(), making it non linear. On registration each IPE list is added to a temporary IPE lists buffer, reducing registration time. The hash map is built lazily on first lookup. IPE event output to stderr is added with tests. For details, please see Note [The Info Table Provenance Entry (IPE) Map]. A performance test for IPE registration and lookup can be found here: https://gitlab.haskell.org/ghc/ghc/-/merge_requests/5724#note_370806
* testsuite: Add test for #20199Ben Gamari2021-08-094-0/+17
| | | | Ensures that Rts.h can be parsed as C++.
* RTS: Fix flag parsing for --eventlog-flush-intervalMatthew Pickering2021-06-192-0/+8
| | | | Fixes #20006
* Adds AArch64 Native Code GeneratorMoritz Angermann2021-06-051-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In which we add a new code generator to the Glasgow Haskell Compiler. This codegen supports ELF and Mach-O targets, thus covering Linux, macOS, and BSDs in principle. It was tested only on macOS and Linux. The NCG follows a similar structure as the other native code generators we already have, and should therfore be realtively easy to follow. It supports most of the features required for a proper native code generator, but does not claim to be perfect or fully optimised. There are still opportunities for optimisations. Metric Decrease: ManyAlternatives ManyConstructors MultiLayerModules PmSeriesG PmSeriesS PmSeriesT PmSeriesV T10421 T10421a T10858 T11195 T11276 T11303b T11374 T11822 T12227 T12545 T12707 T13035 T13253 T13253-spj T13379 T13701 T13719 T14683 T14697 T15164 T15630 T16577 T17096 T17516 T17836 T17836b T17977 T17977b T18140 T18282 T18304 T18478 T18698a T18698b T18923 T1969 T3064 T5030 T5321FD T5321Fun T5631 T5642 T5837 T783 T9198 T9233 T9630 T9872d T9961 WWRec Metric Increase: T4801
* [testsuite/arm64] fix section_alignmentMoritz Angermann2021-05-071-1/+1
| | | | (cherry picked from commit f7062e1b0c91e8aa78e245a3dab9571206fce16d)
* [Aarch64] No div-by-zero; disable test.Moritz Angermann2021-05-071-0/+5
| | | | (cherry picked from commit 3592d1104c47b006fd9f4127d93916f477a6e010)
* Fix typoPeter Trommler2021-04-091-1/+1
|
* testsuite: Skip T18623 on powerpc64lePeter Trommler2021-04-091-1/+2
| | | | | | | In commit f3c23939 T18623 is disabled for aarch64. The limit seems to be too low for powerpc64le, too. This could be because tables next to code is not supported and our code generator produces larger code on PowerPC.
* Revert "[ci/arm/darwin/testsuite] Forwards ports from GHC-8.10"Ben Gamari2021-04-052-6/+1
| | | | This reverts commit 0cbdba2768d84a0f6832ae5cf9ea1e98efd739da.
* [testsuite/aarch64] disable T18623Moritz Angermann2021-03-291-1/+6
|
* Move loader state into InterpSylvain Henry2021-03-231-1/+2
| | | | | | | | | | | | | | | | | | The loader state was stored into HscEnv. As we need to have two interpreters and one loader state per interpreter in #14335, it's natural to make the loader state a field of the Interp type. As a side effect, many functions now only require a Interp parameter instead of HscEnv. Sadly we can't fully free GHC.Linker.Loader of HscEnv yet because the loader is initialised lazily from the HscEnv the first time it is used. This is left as future work. HscEnv may not contain an Interp value (i.e. hsc_interp :: Maybe Interp). So a side effect of the previous side effect is that callers of the modified functions now have to provide an Interp. It is satisfying as it pushes upstream the handling of the case where HscEnv doesn't contain an Interpreter. It is better than raising a panic (less partial functions, "parse, don't validate", etc.).
* [ci/arm/darwin/testsuite] Forwards ports from GHC-8.10Moritz Angermann2021-03-212-1/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a set of forward ports (cherry-picks) from 8.10 - a7d22795ed [ci] Add support for building on aarch64-darwin - 5109e87e13 [testlib/driver] denoise - 307d34945b [ci] default value for CONFIGURE_ARGS - 10a18cb4e0 [testsuite] mark ghci056 as fragile - 16c13d5acf [ci] Default value for MAKE_ARGS - ab571457b9 [ci/build] Copy config.sub around - 251892b98f [ci/darwin] bump nixpkgs rev - 5a6c36ecb4 [testsuite/darwin] fix conc059 - aae95ef0c9 [ci] add timing info - 3592d1104c [Aarch64] No div-by-zero; disable test. - 57671071ad [Darwin] mark stdc++ tests as broken - 33c4d49754 [testsuite] filter out superfluous dylib warnings - 4bea83afec [ci/nix-shell] Add Foundation and Security - 6345530062 [testsuite/json2] Fix failure with LLVM backends - c3944bc89d [ci/nix-shell] [Darwin] Stop the ld warnings about libiconv. - b821fcc714 [testsuite] static001 is not broken anymore. - f7062e1b0c [testsuite/arm64] fix section_alignment - 820b076698 [darwin] stop the DYLD_LIBRARY_PATH madness - 07b1af0362 [ci/nix-shell] uniquify NIX_LDFLAGS{_FOR_TARGET} As well as a few additional fixups needed to make this block compile: - Fixup all.T - Set CROSS_TARGET, BROKEN_TESTS, XZ, RUNTEST_ARGS, default value. - [ci] shell.nix bump happy
* [ci] Skip test's on windows that often fail in CI.wip/angerman/stable-windowsMoritz Angermann2021-03-161-1/+8
|
* rts: Gradually return retained memory to the OSMatthew Pickering2021-03-102-0/+41
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Related to #19381 #19359 #14702 After a spike in memory usage we have been conservative about returning allocated blocks to the OS in case we are still allocating a lot and would end up just reallocating them. The result of this was that up to 4 * live_bytes of blocks would be retained once they were allocated even if memory usage ended up a lot lower. For a heap of size ~1.5G, this would result in OS memory reporting 6G which is both misleading and worrying for users. In long-lived server applications this results in consistent high memory usage when the live data size is much more reasonable (for example ghcide) Therefore we have a new (2021) strategy which starts by retaining up to 4 * live_bytes of blocks before gradually returning uneeded memory back to the OS on subsequent major GCs which are NOT caused by a heap overflow. Each major GC which is NOT caused by heap overflow increases the consec_idle_gcs counter and the amount of memory which is retained is inversely proportional to this number. By default the excess memory retained is oldGenFactor (controlled by -F) / 2 ^ (consec_idle_gcs * returnDecayFactor) On a major GC caused by a heap overflow, the `consec_idle_gcs` variable is reset to 0 (as we could continue to allocate more, so retaining all the memory might make sense). Therefore setting bigger values for `-Fd` makes the rate at which memory is returned slower. Smaller values make it get returned faster. Setting `-Fd0` disables the memory return completely, which is the behaviour of older GHC versions. The default is `-Fd4` which results in the following scaling: > mapM print [(x, 1/ (2**(x / 4))) | x <- [1 :: Double ..20]] (1.0,0.8408964152537146) (2.0,0.7071067811865475) (3.0,0.5946035575013605) (4.0,0.5) (5.0,0.4204482076268573) (6.0,0.35355339059327373) (7.0,0.29730177875068026) (8.0,0.25) (9.0,0.21022410381342865) (10.0,0.17677669529663687) (11.0,0.14865088937534013) (12.0,0.125) (13.0,0.10511205190671433) (14.0,8.838834764831843e-2) (15.0,7.432544468767006e-2) (16.0,6.25e-2) (17.0,5.255602595335716e-2) (18.0,4.4194173824159216e-2) (19.0,3.716272234383503e-2) (20.0,3.125e-2) So after 13 consecutive GCs only 0.1 of the maximum memory used will be retained. Further to this decay factor, the amount of memory we attempt to retain is also influenced by the GC strategy for the oldest generation. If we are using a copying strategy then we will need at least 2 * live_bytes for copying to take place, so we always keep that much. If using compacting or nonmoving then we need a lower number, so we just retain at least `1.2 * live_bytes` for some protection. In future we might want to make this behaviour more aggressive, some relevant literature is > Ulan Degenbaev, Jochen Eisinger, Manfred Ernst, Ross McIlroy, and Hannes Payer. 2016. Idle time garbage collection scheduling. SIGPLAN Not. 51, 6 (June 2016), 570–583. DOI:https://doi.org/10.1145/2980983.2908106 which describes the "memory reducer" in the V8 javascript engine which on an idle collection immediately returns as much memory as possible.
* rts: Use a separate free block list for allocatePinnedMatthew Pickering2021-03-082-0/+58
| | | | | | | | | | | The way in which allocatePinned took blocks out of the nursery was leading to horrible fragmentation in some workloads. The strategy now is that a separate free block list is reserved for each capability and blocks are taken from there. When it's empty the global SM lock is taken and a fresh block of size PINNED_EMPTY_SIZE is allocated. Fixes #19481
* Fix array and cleanup conversion primops (#19026)Sylvain Henry2021-03-031-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The first change makes the array ones use the proper fixed-size types, which also means that just like before, they can be used without explicit conversions with the boxed sized types. (Before, it was Int# / Word# on both sides, now it is fixed sized on both sides). For the second change, don't use "extend" or "narrow" in some of the user-facing primops names for conversions. - Names like `narrowInt32#` are misleading when `Int` is 32-bits. - Names like `extendInt64#` are flat-out wrong when `Int is 32-bits. - `narrow{Int,Word}<N>#` however map a type to itself, and so don't suffer from this problem. They are left as-is. These changes are batched together because Alex happend to use the array ops. We can only use released versions of Alex at this time, sadly, and I don't want to have to have a release thatwon't work for the final GHC 9.2. So by combining these we get all the changes for Alex done at once. Bump hackage state in a few places, and also make that workflow slightly easier for the future. Bump minimum Alex version Bump Cabal, array, bytestring, containers, text, and binary submodules
* Test start/endEventlogging: first header must be EVENT_HEADER_BEGINDavid Eichmann2021-03-024-0/+207
|
* eventlog: Fix various racesBen Gamari2021-03-021-1/+0
| | | | | | | | | | | | | | | Previously the eventlog infrastructure had a couple of races that could pop up when using the startEventLog/endEventLog interfaces. In particular, stopping and then later restarting logging could result in data preceding the eventlog header, breaking the integrity of the stream. To fix this we rework the invariants regarding the eventlog and generally tighten up the concurrency control surrounding starting and stopping of logging. We also fix an unrelated bug, wherein log events from disabled capabilities could end up never flushed.
* Remove the -xt heap profiling optionMatthew Pickering2021-02-273-10/+0
| | | | | | | It should be left to tooling to perform the filtering to remove these specific closure types from the profile if desired. Fixes #16795
* rts/PEi386: Fix reentrant lock usageBen Gamari2021-01-093-3/+0
| | | | | | | | | | | | Previously lookupSymbol_PEi386 would call lookupSymbol while holding linker_mutex. Fix this by rather calling `lookupDependentSymbol`. This is safe because lookupSymbol_PEi386 unconditionally holds linker_mutex. Happily, this un-breaks `T12771`, `T13082_good`, and `T14611`, which were previously marked as broken due to #18718. Closes #19155.
* Increase -A default to 4MB.Andreas Klebinger2020-12-221-8/+8
| | | | | | | | | | | This gives a small increase in performance under most circumstances. For single threaded GC the improvement is on the order of 1-2%. For multi threaded GC the results are quite noisy but seem to fall into the same ballpark. Fixes #16499
* testsuite: Mark divbyzero, derefnull as fragileBen Gamari2020-12-151-0/+2
| | | Due to #18548.
* Move Unit related fields from DynFlags to HscEnvSylvain Henry2020-12-141-1/+2
| | | | | | | | | | | | | The unit database cache, the home unit and the unit state were stored in DynFlags while they ought to be stored in the compiler session state (HscEnv). This patch fixes this. It introduces a new UnitEnv type that should be used in the future to handle separate unit environments (especially host vs target units). Related to #17957 Bump haddock submodule
* testsuite: Mark T14702 as fragile on WindowsBen Gamari2020-11-281-0/+1
| | | | Due to #18953.
* Add rts_listThreads and rts_listMiscRoots to RtsAPI.hDavid Eichmann2020-11-134-0/+70
| | | | | | | | These are used to find the current roots of the garbage collector. Co-authored-by: Sven Tennie's avatarSven Tennie <sven.tennie@gmail.com> Co-authored-by: Matthew Pickering's avatarMatthew Pickering <matthewtpickering@gmail.com> Co-authored-by: default avatarBen Gamari <bgamari.foss@gmail.com>
* Introduce test for dynamic library unloadingGHC GitLab CI2020-11-114-0/+117
| | | | | This uses the highMemDynamic flag introduced earlier to verify that dynamic objects are properly unloaded.
* Fix and enable object unloading in GHCiÖmer Sinan Ağacan2020-11-111-0/+3
| | | | | | | Fixes #16525 by tracking dependencies between object file symbols and marking symbol liveness during garbage collection See Note [Object unloading] in CheckUnload.c for details.
* Merge remote-tracking branch 'origin/wip/tsan/all'Ben Gamari2020-11-081-0/+4
|\
| * testsuite: Skip divbyzero and derefnull under TSANGHC GitLab CI2020-10-241-0/+4
| | | | | | | | ThreadSanitizer changes the output of these tests.
* | Linker: reorganize linker related codeSylvain Henry2020-11-031-2/+2
| | | | | | | | | | | | | | Move linker related code into GHC.Linker. Previously it was scattered into GHC.Unit.State, GHC.Driver.Pipeline, GHC.Runtime.Linker, etc. Add documentation in GHC.Linker
* | RtsAPI: pause and resume the RTSDavid Eichmann2020-11-0223-0/+674
| | | | | | | | | | | | | | | | | | The `rts_pause` and `rts_resume` functions have been added to `RtsAPI.h` and allow an external process to completely pause and resume the RTS. Co-authored-by: Sven Tennie <sven.tennie@gmail.com> Co-authored-by: Matthew Pickering <matthewtpickering@gmail.com> Co-authored-by: Ben Gamari <bgamari.foss@gmail.com>
* | Document that ccall convention doesn't support varargsBen Gamari2020-11-024-3/+32
|/ | | | | | | | | | | We do not support foreign "C" imports of varargs functions. While this works on amd64, in general the platform's calling convention may need more type information that our Cmm representation can currently provide. For instance, this is the case with Darwin's AArch64 calling convention. Document this fact in the users guide and fix T5423 which makes use of a disallowed foreign import. Closes #18854.
* Workaround for #18623: GHC crashes bc. under rlimit for vmem it will reserveBenjamin Maurer2020-09-291-0/+6
| | | | | | _all_ of it, leaving nothing for, e.g., thread stacks. Fix will only allocate 2/3rds and check whether remainder is at least large enough for minimum amount of thread stacks.
* testsuite: Mark some GHCi/Makefile tests as broken on WindowsBen Gamari2020-09-203-0/+3
| | | | See #18718.
* testsuite: Update expected output for outofmem on WindowsBen Gamari2020-09-201-1/+1
| | | | The error originates from osCommitMemory rather than getMBlocks.
* [macOS] improved runpath handlingMoritz Angermann2020-09-071-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In b592bd98ff25730bbe3c13d6f62a427df8c78e28 we started using -dead_strip_dylib on macOS when lining dynamic libraries and binaries. The underlying reason being the Load Command Size Limit in macOS Sierra (10.14) and later. GHC will produce @rpath/libHS... dependency entries together with a corresponding RPATH entry pointing to the location of the libHS... library. Thus for every library we produce two Load Commands. One to specify the dependent library, and one with the path where to find it. This makes relocating libraries and binaries easier, as we just need to update the RPATH entry with the install_name_tool. The dynamic linker will then subsitute each @rpath with the RPATH entries it finds in the libraries load commands or the environement, when looking up @rpath relative libraries. -dead_strip_dylibs intructs the linker to drop unused libraries. This in turn help us reduce the number of referenced libraries, and subsequently the size of the load commands. This however does not remove the RPATH entries. Subsequently we can end up (in extreme cases) with only a single @rpath/libHS... entry, but 100s or more RPATH entries in the Load Commands. This patch rectifies this (slighly unorthodox) by passing *no* -rpath arguments to the linker at link time, but -headerpad 8000. The headerpad argument is in hexadecimal and the maxium 32k of the load command size. This tells the linker to pad the load command section enough for us to inject the RPATHs later. We then proceed to link the library or binary with -dead_strip_dylibs, and *after* the linking inspect the library to find the left over (non-dead-stripped) dependencies (using otool). We find the corresponding RPATHs for each @rpath relative dependency, and inject them into the library or binary using the install_name_tool. Thus achieving a deadstripped dylib (and rpaths) build product. We can not do this in GHC, without starting to reimplement a dynamic linker as we do not know which symbols and subsequently libraries are necessary. Commissioned-by: Mercury Technologies, Inc. (mercury.com)
* Replace HscTarget with BackendSylvain Henry2020-07-221-2/+3
| | | | | | | | | They both have the same role and Backend name is more explicit. Metric Decrease: T3064 Update Haddock submodule
* rts linker: teach the linker about GLIBC's special handling of *stat, mknod ↵Adam Sandberg Ericsson2020-07-075-0/+71
| | | | and atexit functions #7072