path: root/includes/rts
Commit message | Author | Age | Files | Lines
* Fix Work Balance computation in RTS stats | Douglas Wilson | 2017-07-11 | 1 | -5/+6

  An additional stat is tracked per GC: par_balanced_copied. This is the number of bytes copied by each GC thread under the balanced limit, which is simply (copied_bytes / num_gc_threads). The stat is added to all the appropriate GC structures, so it is visible in the eventlog and in GHC.Stats.

  A note is added explaining how work balance is computed.

  Remove some end-of-line whitespace.

  Test Plan: ./validate; experiment with the program attached to the ticket; examine code changes carefully
  Reviewers: simonmar, austin, hvr, bgamari, erikd
  Reviewed By: simonmar
  Subscribers: Phyx, rwbarton, thomie
  GHC Trac Issues: #13830
  Differential Revision: https://phabricator.haskell.org/D3658
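  The "balanced limit" described above can be illustrated with a small sketch. This is not the RTS code: `balanced_copied` is a hypothetical helper that only follows the wording of the commit message, capping each thread's contribution at copied_bytes / num_gc_threads.

  ```c
  #include <assert.h>
  #include <stddef.h>
  #include <stdint.h>

  /* Hypothetical helper (not part of the RTS): sums the bytes each GC thread
   * copied, counting at most `limit` bytes per thread, where
   * limit = total bytes copied / number of GC threads. */
  uint64_t balanced_copied(const uint64_t *copied, size_t n_threads)
  {
      assert(n_threads > 0);

      uint64_t total = 0;
      for (size_t i = 0; i < n_threads; i++) {
          total += copied[i];
      }

      uint64_t limit = total / n_threads;   /* the balanced limit */
      uint64_t balanced = 0;
      for (size_t i = 0; i < n_threads; i++) {
          balanced += copied[i] < limit ? copied[i] : limit;
      }
      return balanced;
  }
  ```

  When every thread copies the same amount, the result equals the total; the more skewed the distribution, the further it falls below the total.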
* Prefer #if defined to #ifdef | Ben Gamari | 2017-04-28 | 11 | -40/+40

  Our new CPP linter enforces this.
* Enable new warning for fragile/incorrect CPP #if usage | Erik de Castro Lopo | 2017-04-28 | 1 | -2/+2

  The C code in the RTS now gets built with `-Wundef` and the Haskell code (stages 1 and 2 only) with `-Wcpp-undef`. We now get warnings wherever `#if` is used on undefined identifiers.

  Test Plan: Validate on Linux and Windows
  Reviewers: austin, angerman, simonmar, bgamari, Phyx
  Reviewed By: bgamari
  Subscribers: thomie, snowleopard
  Differential Revision: https://phabricator.haskell.org/D3278
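  As a rough illustration of what `-Wundef` catches, the sketch below uses a made-up identifier, `HYPOTHETICAL_FEATURE`, that is deliberately never defined. Testing it with a bare `#if` would silently evaluate to 0 (and warn under `-Wundef`), whereas `#if defined(...)` states the intent explicitly.

  ```c
  #include <assert.h>

  /* HYPOTHETICAL_FEATURE is an illustrative name and is intentionally
   * left undefined in this file.
   *
   * Fragile form (warns under -Wundef, because the undefined identifier
   * is silently replaced by 0):
   *
   *     #if HYPOTHETICAL_FEATURE
   *
   * Robust form: ask explicitly whether the identifier is defined. */
  #if defined(HYPOTHETICAL_FEATURE)
  #define FEATURE_ENABLED 1
  #else
  #define FEATURE_ENABLED 0
  #endif

  /* Returns 1 when the feature macro was defined at compile time. */
  int feature_enabled(void)
  {
      return FEATURE_ENABLED;
  }
  ```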
* compiler/cmm/PprC.hs: constify labels in .rodata | Sergei Trofimovich | 2017-04-24 | 1 | -1/+1

  Consider a one-line module

      module B (v) where v = "hello"

  In -fvia-C mode it generates code like

      static char gibberish_str[] = "hello";

  It resides in the data section (a precious resource on ia64!). The patch switches the generator to emit:

      static const char gibberish_str[] = "hello";

  Other types of symbols that gained the 'const' qualifier are:
  - info tables (from Haskell and CMM)
  - static reference tables (from Haskell and CMM)

  Cleanups along the way:
  - fixed info tables defined in .cmm to reside in .rodata
  - split out closure declaration into 'IC_' / 'EC_'
  - added label declaration (based on label type) right before each label definition (based on section type) so that the C compiler can check whether declaration and definition match at the definition site.

  Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org>

  Test Plan: ran testsuite on unregisterised x86_64 compiler
  Reviewers: simonmar, ezyang, austin, bgamari, erikd
  Reviewed By: bgamari, erikd
  Subscribers: rwbarton, thomie
  GHC Trac Issues: #8996
  Differential Revision: https://phabricator.haskell.org/D3481
* cpp: Use #pragma once instead of #ifndef guards | Ben Gamari | 2017-04-23 | 42 | -172/+43

  This both says what we mean and silences a bunch of spurious CPP linting warnings. This pragma is supported by all CPP implementations which we support.

  Reviewers: austin, erikd, simonmar, hvr
  Reviewed By: simonmar
  Subscribers: rwbarton, thomie
  Differential Revision: https://phabricator.haskell.org/D3482
* Revert "Enable new warning for fragile/incorrect CPP #if usage" | Ben Gamari | 2017-04-05 | 1 | -2/+2

  This is causing too much platform dependent breakage at the moment. We will need a more rigorous testing strategy before this can be merged again.

  This reverts commit 7e340c2bbf4a56959bd1e95cdd1cfdb2b7e537c2.
* rts: Fix lingering #ifs | Ben Gamari | 2017-04-04 | 1 | -1/+1

  These were missed in D3278.
* Enable new warning for fragile/incorrect CPP #if usage | Erik de Castro Lopo | 2017-04-05 | 1 | -2/+2

  The C code in the RTS now gets built with `-Wundef` and the Haskell code (stages 1 and 2 only) with `-Wcpp-undef`. We now get warnings wherever `#if` is used on undefined identifiers.

  Test Plan: Validate on Linux and Windows
  Reviewers: austin, angerman, simonmar, bgamari, Phyx
  Reviewed By: bgamari
  Subscribers: thomie, snowleopard
  Differential Revision: https://phabricator.haskell.org/D3278
* rts: Fix build | Ben Gamari | 2017-02-28 | 1 | -1/+1

  I evidently neglected to consider that validate doesn't build profiled ways. Arg.
* rts: Allow profile output path to be specified on RTS command line | Ben Gamari | 2017-02-28 | 1 | -0/+1

  This introduces an RTS option, -po, which allows the user to override the stem used to form the output file names of the heap profile and cost center summary.

  It's a bit unclear to me whether this is really the interface we want. Alternatively we could just allow the user to specify the `.hp` and `.prof` file names separately. This would arguably be a bit more straightforward and would allow the user to name JSON output with an appropriate `.json` suffix if they so desired. However, this would come at the cost of taking more of the option space, which is a somewhat precious commodity.

  Test Plan: Validate, try using `-po` RTS option
  Reviewers: simonmar, austin, erikd
  Reviewed By: simonmar
  Subscribers: thomie
  Differential Revision: https://phabricator.haskell.org/D3182
* JSON profiler reports | Ben Gamari | 2017-02-23 | 1 | -2/+2

  This introduces a JSON output format for cost-centre profiler reports. It's not clear whether this is really something we want to introduce given that we may also move to a more Haskell-driven output pipeline in the future, but I nevertheless found this helpful, so I thought I would put it up.

  Test Plan: Compile a program with `-prof -fprof-auto`; run with `+RTS -pj`
  Reviewers: austin, erikd, simonmar
  Reviewed By: simonmar
  Subscribers: duncan, maoe, thomie, simonmar
  Differential Revision: https://phabricator.haskell.org/D3132
* Fix stop_thread unwinding information | Ben Gamari | 2017-02-08 | 1 | -0/+14

  This corrects the unwind information for `stg_stop_thread`, which allows us to unwind back to the C stack after reaching the end of the STG stack.

  Test Plan: Validate
  Reviewers: simonmar, austin, erikd
  Reviewed By: simonmar
  Subscribers: thomie
  Differential Revision: https://phabricator.haskell.org/D2746
* Fix comment (old file names) in includes/ | Takenobu Tani | 2017-02-04 | 3 | -4/+4

  [skip ci]

  There were some old file names (.lhs, ...) in comments.

  * includes/rts/Bytecodes.h
    - ghc/compiler/ghci/ByteCodeGen.lhs -> ByteCodeAsm.hs
  * includes/rts/Constants.h
    - libraries/base/GHC/Conc.lhs -> libraries/base/GHC/Conc/Sync.hs
  * includes/rts/storage/FunTypes.h
    - utils/genapply/GenApply.hs -> utils/genappl/Main.hs
    - compiler/codeGen/CgCallConv.lhs -> compiler/codeGen/StgCmmLayout.hs
  * includes/stg/MiscClosures.h
    - compiler/codeGen/CgStackery.lhs -> compiler/codeGen/StgCmmArgRep.hs
    - HeapStackCheck.hc -> HeapStackCheck.cmm

  Reviewers: bgamari, austin, simonmar, erikd
  Reviewed By: erikd
  Subscribers: thomie
  Differential Revision: https://phabricator.haskell.org/D3074
* Add support for StaticPointers in GHCi | Ben Gamari | 2017-02-02 | 1 | -0/+8

  Here we add support to GHCi for StaticPointers. This process begins by adding remote GHCi messages for adding entries to the static pointer table. We then collect binders needing SPT entries after linking and send the interpreter a message adding entries with the appropriate fingerprints.

  Test Plan: `make test TEST=StaticPtr`
  Reviewers: facundominguez, mboes, simonpj, simonmar, goldfire, austin, hvr, erikd
  Reviewed By: simonpj, simonmar
  Subscribers: RyanGlScott, simonpj, thomie
  Differential Revision: https://phabricator.haskell.org/D2504
  GHC Trac Issues: #12356
* Abstract over the way eventlogs are flushed | alexbiehl | 2017-01-31 | 1 | -0/+40

  Currently eventlog data is always written to a file `progname.eventlog`. This patch introduces the `flushEventLog` field in `RtsConfig`, which allows the writing of eventlog data to be customized. One possible scenario is the ongoing live-profile-monitor effort by @NCrashed, which slurps all eventlog data through `flushEventLog`.

  `flushEventLog` takes a buffer with eventlog data and its size, and returns `false` (0) if the eventlog data could not be processed.

  Reviewers: simonmar, austin, erikd, bgamari
  Reviewed By: simonmar, bgamari
  Subscribers: qnikst, thomie, NCrashed
  Differential Revision: https://phabricator.haskell.org/D2934
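  A callback of the kind described above might look roughly like the following. The exact prototype of the `RtsConfig` field is not given in the commit message, so the `flush_event_log_fn` typedef, the in-memory sink, and all names here are assumptions made for illustration only.

  ```c
  #include <assert.h>
  #include <stdbool.h>
  #include <stddef.h>
  #include <string.h>

  /* Assumed callback shape: receives a buffer and its size, returns false
   * when the data could not be processed. Not copied from the RTS headers. */
  typedef bool (*flush_event_log_fn)(const void *buf, size_t size);

  #define SINK_CAPACITY 4096

  /* Hypothetical in-memory sink standing in for a socket or monitor. */
  static unsigned char sink[SINK_CAPACITY];
  static size_t sink_used = 0;

  /* Appends the eventlog chunk to the sink; refuses NULL buffers and
   * chunks that do not fit in the remaining space. */
  bool flush_to_memory(const void *buf, size_t size)
  {
      if (buf == NULL || size > SINK_CAPACITY - sink_used) {
          return false;
      }
      memcpy(sink + sink_used, buf, size);
      sink_used += size;
      return true;
  }

  /* Number of bytes accepted so far. */
  size_t sink_size(void)
  {
      return sink_used;
  }
  ```

  A function such as `flush_to_memory` would then be stored in the configuration in place of the default file writer.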
* Throw an exception on heap overflow | Demi Obenour | 2017-01-10 | 1 | -0/+10

  This changes heap overflow to throw a HeapOverflow exception instead of killing the process.

  Test Plan: GHC CI
  Reviewers: simonmar, austin, hvr, erikd, bgamari
  Reviewed By: simonmar, bgamari
  Subscribers: thomie
  Differential Revision: https://phabricator.haskell.org/D2790
  GHC Trac Issues: #1791
* Don't have CPP macros expanding to 'defined'. | Shea Levy | 2016-12-13 | 1 | -2/+11

  Reviewers: austin, simonmar, erikd, bgamari
  Reviewed By: erikd, bgamari
  Subscribers: angerman, thomie
  Differential Revision: https://phabricator.haskell.org/D2823
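  The commit carries no description, so the following is only a guess at the general pattern involved. In C, a macro whose expansion produces the token `defined` inside an `#if` has undefined behaviour, so the usual fix is to evaluate `defined(...)` directly and have the macro expand to a plain 0 or 1. The names `HYP_A`, `HYP_B`, and `HAVE_BOTH` are invented for this sketch.

  ```c
  #include <assert.h>

  /* HYP_A is defined, HYP_B is intentionally not (both names hypothetical). */
  #define HYP_A 1

  /* Problematic form: the expansion of HAVE_BOTH would contain `defined`,
   * which is undefined behaviour when used in an #if:
   *
   *     #define HAVE_BOTH (defined(HYP_A) && defined(HYP_B))
   *     #if HAVE_BOTH
   *
   * Portable form: test with `defined` directly, then define a constant. */
  #if defined(HYP_A) && defined(HYP_B)
  #define HAVE_BOTH 1
  #else
  #define HAVE_BOTH 0
  #endif

  /* Returns 1 only when both macros were defined at compile time. */
  int have_both(void)
  {
      return HAVE_BOTH;
  }
  ```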
* Make globals use sharedCAF | Moritz Angermann | 2016-12-11 | 1 | -9/+18

  Summary: The use of globals is quite painful when multiple RTSs are loaded, e.g. when plugins are loaded, which bring in a second RTS. The sharedCAF approach was employed for the FastStringTable; I've taken the liberty to extend this to the other globals I could find.

  This is a reboot of D2575 that should hopefully not exhibit the same Windows build issues.

  Reviewers: Phyx, simonmar, goldfire, bgamari, austin, hvr, erikd
  Reviewed By: Phyx, simonmar, bgamari
  Subscribers: mpickering, thomie
  Differential Revision: https://phabricator.haskell.org/D2773
* Overhaul of Compact Regions (#12455) | Simon Marlow | 2016-12-07 | 3 | -43/+50

  Summary: This commit makes various improvements and addresses some issues with Compact Regions (aka Compact Normal Forms).

  This was the most important thing I wanted to fix. Compaction previously prevented GC from running until it was complete, which would be a problem in a multicore setting. Now, we compact using a hand-written Cmm routine that can be interrupted at any point. When a GC is triggered during a sharing-enabled compaction, the GC has to traverse and update the hash table, so this hash table is now stored in the StgCompactNFData object.

  Previously, compaction consisted of a deepseq using the NFData class, followed by a traversal in C code to copy the data. This is now done in a single pass with hand-written Cmm (see rts/Compact.cmm). We no longer use the NFData instances; instead the Cmm routine evaluates components directly as it compacts.

  The new compaction is about 50% faster than the old one with no sharing, and a little faster on average with sharing (the cost of the hash table dominates when we're doing sharing).

  Static objects that don't (transitively) refer to any CAFs don't need to be copied into the compact region. In particular this means we often avoid copying Char values and small Int values, because these are static closures in the runtime.

  Each Compact# object can support a single compactAdd# operation at any given time, so the Data.Compact library now enforces mutual exclusion using an MVar stored in the Compact object.

  We now get exceptions rather than killing everything with a barf() when we encounter an object that cannot be compacted (a function, or a mutable object). We now also detect pinned objects, which can't be compacted either.

  The Data.Compact API has been refactored and cleaned up. A new compactSize operation returns the size (in bytes) of the compact object.

  Most of the documentation is in the Haddock docs for the compact library, which I've expanded and improved here.

  Various comments in the code have been improved, especially the main Note [Compact Normal Forms] in rts/sm/CNF.c.

  I've added a few tests, and expanded a few of the tests that were there. We now also run the tests with GHCi, and in a new test way that enables sanity checking (+RTS -DS).

  There's a benchmark in libraries/compact/tests/compact_bench.hs for measuring compaction speed and comparing sharing vs. no sharing.

  The field totalDataW in StgCompactNFData was unnecessary.

  Test Plan:
  * new unit tests
  * validate
  * tested manually that we can compact Data.Aeson data

  Reviewers: gcampax, bgamari, ezyang, austin, niteria, hvr, erikd
  Subscribers: thomie, simonpj
  Differential Revision: https://phabricator.haskell.org/D2751
  GHC Trac Issues: #12455
* Overhaul GC stats | Simon Marlow | 2016-12-06 | 2 | -55/+43

  Summary: Visible API changes:

  * The C struct `GCDetails` gives the stats about a single GC. This is passed to the `gcDone()` callback if one is set via the RtsConfig. (Previously we just passed a collection of values, so this is more extensible, at the expense of breaking the existing API.)
  * `RTSStats` gives cumulative stats since the start of the program, and includes the `GCDetails` for the most recent GC. This struct can be obtained via `getRTSStats()` (the old `getGCStats()` has been removed, and `getGCStatsEnabled()` has been renamed to `getRTSStatsEnabled()`).

  Improvements:

  * The per-GC stats and cumulative stats are now cleanly separated.
  * Inside the RTS we have a top-level `RTSStats` struct to keep all our stats in; previously this was just a collection of strangely-named variables. This struct is mostly just copied in `getRTSStats()`, so the implementation of that function is a lot shorter.
  * Types are more consistent. We use a uint64_t byte count for all memory values, and Time for all time values.
  * Names are more consistent. We use a suffix `_bytes` for all byte counts and `_ns` for all time values.
  * We now collect information about the amount of memory in large objects and compact objects in `GCDetails`. (The latter was the reason I started doing this patch but it seems to have ballooned a bit!)
  * I fixed a bug in the calculation of the elapsed MUT time, and added an ASSERT to stop the calculations going wrong in the future.

  For now I kept the Haskell API in `GHC.Stats` the same, by impedance-matching with the new API. We could either break that API and make it match the C API more closely, or we could add a new API and deprecate the old one. Opinions welcome.

  This stuff is very easy to get wrong, and it's hard to test. Reviews welcome!

  Test Plan: manual testing; validate
  Reviewers: bgamari, niteria, austin, ezyang, hvr, erikd, rwbarton, Phyx
  Subscribers: thomie
  Differential Revision: https://phabricator.haskell.org/D2756
* Revert "Make globals use sharedCAF" | Ben Gamari | 2016-11-30 | 1 | -18/+9

  This reverts commit 6f7ed1e51bf360621a3c2a447045ab3012f68575 due to breakage of the build on Windows.
* Use C99's bool | Ben Gamari | 2016-11-29 | 7 | -58/+54

  Test Plan: Validate on lots of platforms
  Reviewers: erikd, simonmar, austin
  Reviewed By: erikd, simonmar
  Subscribers: michalt, thomie
  Differential Revision: https://phabricator.haskell.org/D2699
* Make globals use sharedCAF | Moritz Angermann | 2016-11-29 | 1 | -9/+18

  The use of globals is quite painful when multiple RTSs are loaded, e.g. when plugins are loaded, which bring in a second RTS. The sharedCAF approach was employed for the FastStringTable; I've taken the liberty to extend this to the other globals I could find.

  Reviewers: rwbarton, simonmar, austin, hvr, erikd, bgamari
  Reviewed By: simonmar, bgamari
  Subscribers: thomie
  Differential Revision: https://phabricator.haskell.org/D2575
* Define thread primitives if they're supported. | Shea Levy | 2016-11-29 | 1 | -21/+25

  On iOS, we use the pthread-based implementation of Itimer.c even for a non-threaded RTS. Since 999c464, this relies on synchronization primitives like Mutex, so ensure those primitives are defined whenever they are supported, even if !THREADED_RTS.

  Fixes #12799.

  Reviewers: erikd, austin, simonmar, bgamari
  Reviewed By: simonmar, bgamari
  Subscribers: thomie
  Differential Revision: https://phabricator.haskell.org/D2712
  GHC Trac Issues: #12799
* Remove CONSTR_STATIC | Simon Marlow | 2016-11-14 | 2 | -76/+71

  Summary: We currently have two info tables for a constructor:

  * XXX_con_info: the info table for a heap-resident instance of the constructor. It has type CONSTR, or one of the specialised types like CONSTR_1_0.
  * XXX_static_info: the info table for a static instance of this constructor, which has type CONSTR_STATIC or CONSTR_STATIC_NOCAF.

  I'm getting rid of the latter, and using the `con_info` info table for both static and dynamic constructors. For rationale and more details see Note [static constructors] in SMRep.hs.

  I also removed these macros: `isSTATIC()`, `ip_STATIC()`, `closure_STATIC()`, since they relied on the CONSTR/CONSTR_STATIC distinction, and anyway HEAP_ALLOCED() does the same job.

  Test Plan: validate
  Reviewers: bgamari, simonpj, austin, gcampax, hvr, niteria, erikd
  Subscribers: thomie
  Differential Revision: https://phabricator.haskell.org/D2690
  GHC Trac Issues: #12455
* Add notes describing SRT concepts | Ben Gamari | 2016-11-02 | 1 | -1/+12

  Test Plan: Read it
  Reviewers: austin, erikd, simonmar
  Reviewed By: simonmar
  Subscribers: thomie
  Differential Revision: https://phabricator.haskell.org/D2663
* Support more than 64 logical processors on Windows | Tamar Christina | 2016-10-01 | 1 | -0/+5

  Windows support for more than 64 logical processors is implemented using processor groups.

  Essentially what it does is keep the existing maximum of 64 processors per group and keep the affinity mask a 64-bit value, but add a hierarchy above that.

  This support was added in Windows 7, so we need to detect at runtime whether the APIs are available, because our minimum supported version is Windows Vista.

  The maximum number of groups supported at this time is 4, so 256 logical cores. The group indices are 0-based. One thread can have affinity with multiple groups.

  See https://msdn.microsoft.com/en-us/library/windows/desktop/ms684251.aspx; particularly helpful is the whitepaper 'Supporting Systems that have more than 64 processors' at https://msdn.microsoft.com/en-us/library/windows/hardware/dn653313.aspx

  Processor groups are not guaranteed to be uniformly distributed, nor guaranteed to be filled before a next group is needed. The OS will assign processors to groups based on physical proximity and will never partially assign cores from one physical CPU to more than one group. If one has two 48-core CPUs then one ends up with two groups of 48 logical CPUs. Now add a third CPU with 10 cores, and the group it is assigned to depends on where the socket is on the board.

  Test Plan: ./validate or make test -c . in the rts test folder. This tests for regressions; to test this particular functionality itself: <program> +RTS -N -qa -RTS. The test is detailed in the description.
  Reviewers: bgamari, simonmar, austin, erikd
  Reviewed By: simonmar
  Subscribers: thomie, #ghc_windows_task_force
  Differential Revision: https://phabricator.haskell.org/D2533
  GHC Trac Issues: #11054
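  Because groups need not be filled uniformly, turning a flat logical-processor index into a (group, affinity bit) pair requires knowing each group's size. The sketch below shows one way to do that arithmetic; `cpu_slot` and `locate_cpu` are hypothetical names, and no Windows API is called here.

  ```c
  #include <assert.h>
  #include <stdbool.h>
  #include <stddef.h>
  #include <stdint.h>

  /* Hypothetical result type: a 0-based group index plus a 64-bit
   * affinity mask with exactly one bit set. */
  typedef struct {
      uint32_t group;
      uint64_t mask;
  } cpu_slot;

  /* Walks the groups in order, subtracting each group's size until the
   * flat index falls inside one. Returns false if `cpu` is out of range. */
  bool locate_cpu(const uint8_t *group_sizes, size_t n_groups,
                  uint32_t cpu, cpu_slot *out)
  {
      for (size_t g = 0; g < n_groups; g++) {
          assert(group_sizes[g] <= 64);   /* a group holds at most 64 CPUs */
          if (cpu < group_sizes[g]) {
              out->group = (uint32_t)g;
              out->mask = UINT64_C(1) << cpu;
              return true;
          }
          cpu -= group_sizes[g];
      }
      return false;
  }
  ```

  With the two 48-core CPUs from the example above, flat index 50 lands in group 1 at bit 2.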
* Revert "codeGen: Remove binutils<2.17 hack, fixes T11758" | Simon Peyton Jones | 2016-08-19 | 1 | -0/+11

  This reverts commit e3e2e49a8f6952e1c8a19321c729c17b294d8c92.

  I'm reverting because it makes ghc-stage2 seg-fault on 64-bit Windows machines. Even ghc-stage2 --version seg-faults.
* codeGen: Remove binutils<2.17 hack, fixes T11758 | Alex Dzyoba | 2016-08-05 | 1 | -11/+0

  There was a complication on the x86_64 platform, where pointers were 64 bits, but the tools didn't support 64-bit relative relocations. This was true before binutils 2.17, which nowadays is quite standard (even CentOS 5 ships with 2.17).

  Hacks were removed from x86 genSwitch and the asm pretty printer. Also the [x86-64-relative] note was dropped from includes/rts/storage/InfoTables.h as it's not referenced anywhere now.

  Reviewers: austin, simonmar, rwbarton, erikd, bgamari
  Reviewed By: simonmar, erikd, bgamari
  Subscribers: thomie
  Differential Revision: https://phabricator.haskell.org/D2426
* Add mblocks_allocated to GC stats API | Bartosz Nitka | 2016-07-27 | 1 | -0/+1

  This exposes mblocks_allocated in the GCStats struct.

  Test Plan: it builds
  Reviewers: bgamari, simonmar, austin, hvr, erikd
  Reviewed By: erikd
  Subscribers: thomie
  Differential Revision: https://phabricator.haskell.org/D2429
* Compact Regions | Giovanni Campagna | 2016-07-20 | 5 | -1/+81

  This brings in initial support for compact regions, as described in the ICFP 2015 paper "Efficient Communication and Collection with Compact Normal Forms" (Edward Z. Yang et al.) and implemented by Giovanni Campagna.

  Some things may change before the 8.2 release, but I (Simon M.) wanted to get the main patch committed so that we can iterate.

  What documentation there is is in the Data.Compact module in the new compact package. We'll need to extend and polish the documentation before the release.

  Test Plan: validate (new test cases included)
  Reviewers: ezyang, simonmar, hvr, bgamari, austin
  Subscribers: vikraman, Yuras, RyanGlScott, qnikst, mboes, facundominguez, rrnewton, thomie, erikd
  Differential Revision: https://phabricator.haskell.org/D1264
  GHC Trac Issues: #11493
* Log heap profiler samples to event log | Ben Gamari | 2016-07-16 | 2 | -5/+14

  Test Plan: Try it
  Reviewers: hvr, simonmar, austin, erikd
  Subscribers: thomie
  Differential Revision: https://phabricator.haskell.org/D1722
  GHC Trac Issues: #11094
* NUMA cleanups | Simon Marlow | 2016-06-17 | 1 | -3/+1

  - Move the numaMap and nNumaNodes out of RtsFlags to Capability.c
  - Add a test to tests/rts
* Rts flags cleanup | Simon Marlow | 2016-06-10 | 2 | -30/+17

  * Remove unused/old flags from the structs
  * Update old comments
  * Add missing flags to GHC.RTS
  * Simplify GHC.RTS, remove C code and use hsc2hs instead
  * Make ParFlags unconditional, and add support to GHC.RTS
* NUMA support | Simon Marlow | 2016-06-10 | 7 | -130/+39

  Summary: The aim here is to reduce the number of remote memory accesses on systems with a NUMA memory architecture, typically multi-socket servers.

  Linux provides a NUMA API for doing two things:
  * Allocating memory local to a particular node
  * Binding a thread to a particular node

  When given the +RTS --numa flag, the runtime will
  * Determine the number of NUMA nodes (N) by querying the OS
  * Assign capabilities to nodes, so cap C is on node C%N
  * Bind worker threads on a capability to the correct node
  * Keep a separate free list in the block layer for each node
  * Allocate the nursery for a capability from node-local memory
  * Allocate blocks in the GC from node-local memory

  For example, using nofib/parallel/queens on a 24-core 2-socket machine:

  ```
  $ ./Main 15 +RTS -N24 -s -A64m
    Total time 173.960s ( 7.467s elapsed)

  $ ./Main 15 +RTS -N24 -s -A64m --numa
    Total time 150.836s ( 6.423s elapsed)
  ```

  The biggest win here is expected to be allocating from node-local memory, so that means programs using a large -A value (as here).

  According to perf, on this program the number of remote memory accesses was reduced by more than 50% by using `--numa`.

  Test Plan:
  * validate
  * There's a new flag --debug-numa=<n> that pretends to do NUMA without actually making the OS calls, which is useful for testing the code on non-NUMA systems.
  * TODO: I need to add some unit tests

  Reviewers: erikd, austin, rwbarton, ezyang, bgamari, hvr, niteria
  Subscribers: thomie
  Differential Revision: https://phabricator.haskell.org/D2199
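  The capability-to-node rule quoted above ("cap C is on node C%N") amounts to a round-robin assignment. A minimal sketch, using a hypothetical helper name rather than anything from the RTS sources:

  ```c
  #include <assert.h>
  #include <stdint.h>

  /* Hypothetical helper: maps capability number `cap` to a NUMA node
   * using the C % N rule, where N is the number of nodes. */
  uint32_t numa_node_for_cap(uint32_t cap, uint32_t n_nodes)
  {
      assert(n_nodes > 0);
      return cap % n_nodes;
  }
  ```

  On the 2-socket machine from the example, even-numbered capabilities would land on node 0 and odd-numbered ones on node 1.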
* rts: More const correct-ness fixes | Erik de Castro Lopo | 2016-05-18 | 2 | -16/+42

  In addition to more const-correctness fixes this patch fixes an infelicity of the previous const-correctness patch (995cf0f356) which left `UNTAG_CLOSURE` taking a `const StgClosure` pointer parameter but returning a non-const pointer. Here we restore the original type signature of `UNTAG_CLOSURE` and add a new function `UNTAG_CONST_CLOSURE` which takes and returns a const `StgClosure` pointer and uses that wherever possible.

  Test Plan: Validate on Linux, OS X and Windows
  Reviewers: Phyx, hsyl20, bgamari, austin, simonmar, trofi
  Reviewed By: simonmar, trofi
  Subscribers: thomie
  Differential Revision: https://phabricator.haskell.org/D2231
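  The const/non-const pairing described above can be sketched with a stand-in closure type. The `Closure` struct, the lowercase function names, and the 3-bit `TAG_MASK` are assumptions for illustration (a 64-bit layout is assumed); they are not the RTS definitions.

  ```c
  #include <assert.h>
  #include <stdint.h>

  /* Stand-in for StgClosure; the real type lives in the RTS headers. */
  typedef struct {
      int payload;
  } Closure;

  /* Assumption: the tag occupies the low 3 bits of an aligned pointer. */
  #define TAG_MASK ((uintptr_t)7)

  /* Non-const variant: takes and returns a mutable pointer. */
  static inline Closure *untag_closure(Closure *p)
  {
      return (Closure *)((uintptr_t)p & ~TAG_MASK);
  }

  /* Const variant: constness is preserved from argument to result, so
   * callers holding a const pointer never need to cast it away. */
  static inline const Closure *untag_const_closure(const Closure *p)
  {
      return (const Closure *)((uintptr_t)p & ~TAG_MASK);
  }
  ```

  Keeping two functions avoids the earlier mismatch where a const pointer went in and a non-const pointer came out.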
* rts: Make function pointer parameters `const` where possible | Erik de Castro Lopo | 2016-05-12 | 1 | -8/+8

  If a function takes a pointer parameter and doesn't update what the pointer points to, we can add `const` to the parameter declaration to document that no updates occur.

  Test Plan: Validate on Linux, OS X and Windows
  Reviewers: austin, Phyx, bgamari, simonmar, hsyl20
  Reviewed By: bgamari, simonmar, hsyl20
  Subscribers: thomie
  Differential Revision: https://phabricator.haskell.org/D2200
* rts: Replace `nat` with `uint32_t` | Erik de Castro Lopo | 2016-05-05 | 12 | -65/+66

  The `nat` type was an alias for `unsigned int` with a comment saying it was at least 32 bits. We keep the typedef in case client code is using it but mark it as deprecated.

  Test Plan: Validated on Linux, OS X and Windows
  Reviewers: simonmar, austin, thomie, hvr, bgamari, hsyl20
  Differential Revision: https://phabricator.haskell.org/D2166
* Add +RTS -AL<size> | Simon Marlow | 2016-05-04 | 1 | -0/+1

  +RTS -AL<size> controls the total size of large objects that can be allocated before a GC is triggered. Previously this was always just the value of -A, and the limit mainly existed to prevent runaway allocation in pathological programs that allocate a lot of large objects. However, since the limit is shared between all cores, on a large multicore the default becomes more restrictive, and can end up triggering GC well before it would normally have been. Arguably a better default would be A*N, but this is probably excessive. Adding a flag lets you choose, and I've left the default as it was.

  See docs for usage.
* Allow limiting the number of GC threads (+RTS -qn<n>) | Simon Marlow | 2016-05-04 | 1 | -0/+4

  This allows the GC to use fewer threads than the number of capabilities. At each GC, we choose some of the capabilities to be "idle", which means that the thread running on that capability (if any) will sleep for the duration of the GC, and the other threads will do its work. We choose capabilities that are already idle (if any) to be the idle capabilities.

  The idea is that this helps in the following situation:
  * We want to use a large -N value so as to make use of hyperthreaded cores
  * We use a large heap size, so GC is infrequent
  * But we don't want to use all -N threads in the GC, because that thrashes the memory too much.

  See docs for usage.
* rts: Remove deprecated C type `lnat` | Erik de Castro Lopo | 2016-05-02 | 1 | -3/+0

  Summary: The `lnat` type was deprecated in 2012 in commit 41737f12f9 with a note to use `StgWord` instead.

  Test Plan: Validate on Linux and OS X
  Reviewers: simonmar, austin, Phyx, hvr, bgamari
  Reviewed By: simonmar, Phyx, bgamari
  Subscribers: thomie
  Differential Revision: https://phabricator.haskell.org/D2164
* RTS: delete BlockedOnGA* + dead code | Thomas Miedema | 2016-04-29 | 4 | -23/+3

  Some old stuff related to the PAR way.

  Reviewed by: austin, simonmar
  Differential Revision: https://phabricator.haskell.org/D2137
* Remove obsolete/redundant FLEXIBLE_ARRAY macro | Herbert Valerio Riedel | 2016-04-18 | 3 | -13/+13

  This macro is doubly redundant: first of all, ancient GCCs prior to version 3.0 are not supported anymore, but more importantly, we require an ISO C99 compliant compiler, so we can use the proper ISO C syntax without worrying about compatibility.

  Reviewers: austin, bgamari
  Reviewed By: bgamari
  Subscribers: carter, thomie
  Differential Revision: https://phabricator.haskell.org/D2121
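  The "proper ISO C syntax" referred to above is the C99 flexible array member: a trailing array declared with empty brackets. A small sketch follows; the `WordArray` type and `word_array_new` are hypothetical and only illustrate the syntax the macro used to hide.

  ```c
  #include <assert.h>
  #include <stdint.h>
  #include <stdlib.h>

  /* Hypothetical header-plus-payload object. */
  typedef struct {
      uint32_t n_words;
      uint64_t payload[];   /* C99 flexible array member, no macro needed */
  } WordArray;

  /* Allocates the header and `n` zeroed payload words in one block.
   * Returns NULL on allocation failure; the caller frees with free(). */
  WordArray *word_array_new(uint32_t n)
  {
      WordArray *a = malloc(sizeof *a + (size_t)n * sizeof a->payload[0]);
      if (a == NULL) {
          return NULL;
      }
      a->n_words = n;
      for (uint32_t i = 0; i < n; i++) {
          a->payload[i] = 0;
      }
      return a;
  }
  ```

  Note that `sizeof *a` covers only the fixed header, which is why the payload size is added explicitly at allocation time.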
* Allocate blocks in the GC in batches | Simon Marlow | 2016-04-12 | 1 | -1/+2

  Avoids contention for the block allocator lock in the GC; this can be seen in the gc_alloc_block_sync counter emitted by +RTS -s.

  I experimented with this a while ago, and there was already commented-out code for it in GCUtils.c, but I've now improved it so that it doesn't result in significantly worse memory usage.

  * The old method of putting spare blocks on ws->part_list was wasteful; the spare blocks are now shared between all generations and retained between GCs.
  * Repeated allocGroup() results in fragmentation, so I switched to using allocLargeChunk() instead, which is fragmentation-friendly; we already use it for the same reason in nursery allocation.
* rts: drop unused global 'blackhole_queue' | Sergei Trofimovich | 2016-02-27 | 1 | -1/+1

  Commit 5d52d9b64c21dcf77849866584744722f8121389 removed the global 'blackhole_queue' in favour of a new mechanism: when a TSO hits a blackhole, the TSO blocks waiting for 'MessageBlackhole' delivery.

  This patch removes the unused global and updates stale comments.

  Noticed by Yuras Shumovich.

  Signed-off-by: Sergei Trofimovich <siarheit@google.com>

  Test Plan: build test
  Reviewers: simonmar, austin, Yuras, bgamari
  Reviewed By: Yuras, bgamari
  Subscribers: thomie
  Differential Revision: https://phabricator.haskell.org/D1953
* Remove unused IND_PERM | Joachim Breitner | 2016-01-23 | 2 | -38/+36

  It seems that this closure type has not been in use since 5d52d9, so all this is dead and untested code. This removes it.

  Some of the code might be useful for a counting indirection as described in #10613, so when implementing that, have a look at what this commit removes.

  Test Plan: validate on harbormaster
  Reviewers: austin, bgamari, simonmar
  Reviewed By: simonmar
  Subscribers: thomie
  Differential Revision: https://phabricator.haskell.org/D1821
* Maintain cost-centre stacks in the interpreter | Simon Marlow | 2015-12-21 | 1 | -0/+1

  Summary: Breakpoints become SCCs, so we have detailed call-stack info for interpreted code. Currently this only works when GHC is compiled with -prof, but D1562 (Remote GHCi) removes this constraint so that in the future call stacks will be available without building your own GHCi.

  How can you get a stack trace?
  * programmatically: GHC.Stack.currentCallStack
  * I've added an experimental :where command that shows the stack when stopped at a breakpoint
  * `error` attaches a call stack automatically, although since calls to `error` are often lifted out to the top level, this is less useful than it might be (ImplicitParams still works though).
  * Later we might attach call stacks to all exceptions

  Other related changes in this diff:
  * I reduced the number of places that get ticks attached for breakpoints. In particular there was a breakpoint around the whole declaration, which was often redundant because it bound no variables. This reduces clutter in the stack traces and speeds up compilation.
  * I tidied up some RealSrcSpan stuff in InteractiveUI, and made a few other small cleanups

  Test Plan: validate
  Reviewers: ezyang, bgamari, austin, hvr
  Subscribers: thomie
  Differential Revision: https://phabricator.haskell.org/D1595
  GHC Trac Issues: #11047
* Random typo fixes | Herbert Valerio Riedel | 2015-12-17 | 1 | -1/+1

  [skip ci]
* base: Add Haskell interface to ExecutionStack | Ben Gamari | 2015-11-23 | 1 | -0/+2

  Differential Revision: https://phabricator.haskell.org/D1198#40948
* rts: Add LibdwPool, a pool for libdw sessions | Ben Gamari | 2015-11-23 | 1 | -0/+22

  Differential Revision: https://phabricator.haskell.org/D1198#40948