path: root/includes/mkDerivedConstants.c
Commit message (Author, Age, Files, Lines)
...
* Fix #650: use a card table to mark dirty sections of mutable arrays (Simon Marlow, 2009-12-17, 1 file, -0/+1)
  The card table is an array of bytes, placed directly following the actual array data. This means that array reading is unaffected, but array writing needs to read the array size from the header in order to find the card table.

  We use a bytemap rather than a bitmap, because updating the card table must be multi-thread safe. Each byte refers to 128 entries of the array, but this is tunable by changing the constant MUT_ARR_PTRS_CARD_BITS in includes/Constants.h.
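The card-table layout described above can be sketched in C as follows. This is a minimal illustration, not the RTS code: the `MutArr` struct, the helper names, and the `CARD_BITS` constant (standing in for MUT_ARR_PTRS_CARD_BITS, assumed to be 7 so that one card covers 128 entries) are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed value: 2^7 = 128 array entries per card byte. */
#define CARD_BITS 7

/* Hypothetical simplified array: size header, payload, then the card bytes. */
typedef struct {
    size_t n_elems;    /* read on every write in order to locate the cards */
    void  *payload[];  /* n_elems pointers, followed by the card table */
} MutArr;

/* Number of card bytes needed to cover n_elems entries (rounded up). */
static size_t n_cards(size_t n_elems) {
    return (n_elems + ((size_t)1 << CARD_BITS) - 1) >> CARD_BITS;
}

/* The card table starts directly after the last payload slot. */
static uint8_t *card_table(MutArr *a) {
    return (uint8_t *)&a->payload[a->n_elems];
}

MutArr *mut_arr_new(size_t n_elems) {
    size_t bytes = sizeof(MutArr) + n_elems * sizeof(void *) + n_cards(n_elems);
    MutArr *a = calloc(1, bytes);   /* zeroed: all cards start clean */
    assert(a != NULL);
    a->n_elems = n_elems;
    return a;
}

/* A write stores the element and marks its card with a whole-byte store,
 * so no read-modify-write of shared bits is involved. */
void mut_arr_write(MutArr *a, size_t i, void *v) {
    assert(i < a->n_elems);
    a->payload[i] = v;
    card_table(a)[i >> CARD_BITS] = 1;
}

int mut_arr_card_dirty(MutArr *a, size_t card) {
    assert(card < n_cards(a->n_elems));
    return card_table(a)[card];
}
```

Reads never touch the card table, which matches the claim that array reading is unaffected; only writes pay for the extra header load and byte store.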
* Correction to the allocation stats following earlier refactoring (Simon Marlow, 2009-12-04, 1 file, -1/+1)
* GC refactoring, remove "steps" (Simon Marlow, 2009-12-03, 1 file, -2/+1)
  The GC had a two-level structure, G generations each of T steps. Steps are for aging within a generation, mostly to avoid premature promotion.

  Measurements show that more than 2 steps is almost never worthwhile, and 1 step is usually worse than 2. In theory fractional steps are possible, so the ideal number of steps is somewhere between 1 and 3. GHC's default has always been 2.

  We can implement 2 steps quite straightforwardly by having each block point to the generation to which objects in that block should be promoted, so blocks in the nursery point to generation 0, and blocks in gen 0 point to gen 1, and so on.

  This commit removes the explicit step structures, merging generations with steps, thus simplifying a lot of code. Performance is unaffected. The tunable number of steps is now gone, although it may be replaced in the future by a way to tune the aging in generation 0.
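The "each block points to its promotion target" idea can be sketched in C like this. The struct and function names are hypothetical, and the rule that the oldest generation promotes into itself is an assumption of the sketch; the commit message only says "and so on".

```c
#include <assert.h>
#include <stddef.h>

typedef struct Generation { int no; } Generation;

/* Hypothetical block descriptor. */
typedef struct Block {
    Generation *gen;   /* generation the block belongs to; NULL for a nursery block */
    Generation *dest;  /* generation that live objects in this block are promoted to */
} Block;

/* gen_no == -1 means "nursery". Nursery blocks point to gen 0, blocks in
 * gen N point to gen N+1, and (assumed here) the oldest gen points to itself. */
void block_init(Block *b, Generation *gens, int n_gens, int gen_no) {
    assert(n_gens > 0 && gen_no >= -1 && gen_no < n_gens);
    int dest_no = (gen_no + 1 < n_gens) ? gen_no + 1 : n_gens - 1;
    b->gen  = (gen_no < 0) ? NULL : &gens[gen_no];
    b->dest = &gens[dest_no];
}
```

With this layout the collector needs no separate step structure: an object survives one collection in the nursery, lands in gen 0, and is promoted to gen 1 the next time, which gives the two-step aging described above.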
* Make allocatePinned use local storage, and other refactorings (Simon Marlow, 2009-12-01, 1 file, -0/+3)
  This is a batch of refactoring to remove some of the GC's global state, as we move towards CPU-local GC.

  - allocateLocal() now allocates large objects into the local nursery, rather than taking a global lock and allocating them in gen 0 step 0.
  - allocatePinned() was still allocating from global storage and taking a lock each time; now it uses local storage. (mallocForeignPtrBytes should be faster with -threaded).
  - We had a gen 0 step 0, distinct from the nurseries, which are stored in a separate nurseries[] array. This is slightly strange. I removed the g0s0 global that pointed to gen 0 step 0, and removed all uses of it. I think now we don't use gen 0 step 0 at all, except possibly when there is only one generation. Possibly more tidying up is needed here.
  - I removed the global allocate() function, and renamed allocateLocal() to allocate().
  - The alloc_blocks global is gone. MAYBE_GC() and doYouWantToGC() now check the local nursery only.
* micro-opt: replace stmGetEnclosingTRec() with a field access (Simon Marlow, 2009-10-14, 1 file, -0/+2)
  While fixing #3578 I noticed that this function was just a field access to StgTRecHeader, so I inlined it manually.
* Fix #3429: a tricky race condition (Simon Marlow, 2009-08-18, 1 file, -0/+1)
  There were two bugs, and had it not been for the first one we would not have noticed the second one, so this is quite fortunate.

  The first bug is in stg_unblockAsyncExceptionszh_ret: when we find a pending exception to raise but don't end up raising it, there was a missing adjustment to the stack pointer.

  The second bug was that this case was actually happening at all: it ought to be incredibly rare, because the pending exception thread would have to be killed between us finding it and attempting to raise the exception. This made me suspicious. It turned out that there was a race condition on the tso->flags field; multiple threads were updating this bitmask field non-atomically (one of the bits is the dirty-bit for the generational GC). The fix is to move the dirty bit into its own field of the TSO, making the TSO one word larger (sadly).
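The layout change behind this fix can be illustrated with a small C sketch. The `Tso` struct, the `FLAG_BLOCKEX` bit, and the helper names are hypothetical; only the idea (a dirty marker in its own word instead of a bit in a shared bitmask) comes from the commit message. The sketch shows the layout only; portable concurrent code would additionally need atomic accesses or platform guarantees for the store.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FLAG_BLOCKEX 0x1u  /* hypothetical flag bit */

/* Hypothetical, heavily simplified thread state object. */
typedef struct {
    uint32_t flags;  /* bitmask: every update is a read-modify-write */
    uint32_t dirty;  /* separate word: marking dirty is a single store */
} Tso;

/* Setting a flag reads the old mask and writes it back. If another thread
 * updated a different bit of the same word in between, that update is lost. */
void tso_set_flag(Tso *t, uint32_t f) {
    assert(t != NULL);
    t->flags |= f;
}

/* With its own field, marking the TSO dirty never rewrites the flag bits,
 * so it cannot clobber a concurrent flag update. */
void tso_set_dirty(Tso *t) {
    assert(t != NULL);
    t->dirty = 1;
}
```

The cost is the one mentioned above: the object grows by a word in exchange for removing the shared read-modify-write.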
* RTS tidyup sweep, first phase (Simon Marlow, 2009-08-02, 1 file, -3/+1)
  The first phase of this tidyup is focussed on the header files, and in particular making sure we are exposing publicly exactly what we need to, and no more.

  - Rts.h now includes everything that the RTS exposes publicly, rather than a random subset of it.
  - Most of the public header files have moved into subdirectories, and many of them have been renamed. But clients should not need to include any of the other headers directly, just #include the main public headers: Rts.h, HsFFI.h, RtsAPI.h.
  - All the headers needed for via-C compilation have moved into the stg subdirectory, which is self-contained. Most of the headers for the rest of the RTS APIs have moved into the rts subdirectory.
  - I left MachDeps.h where it is, because it is so widely used in Haskell code.
  - I left a deprecated stub for RtsFlags.h in place. The flag structures are now exposed by Rts.h.
  - Various internal APIs are no longer exposed by public header files.
  - Various bits of dead code and declarations have been removed.
  - More gcc warnings are turned on, and the RTS code is more warning-clean.
  - More source files #include "PosixSource.h", and hence only use standard POSIX (1003.1c-1995) interfaces.

  There is a lot more tidying up still to do; this is just the first pass. I also intend to standardise the names for external RTS APIs (e.g. use the rts_ prefix consistently), and declare the internal APIs as hidden for shared libraries.
* propagate the result of atomically properly (fixes #3049) (Simon Marlow, 2009-06-24, 1 file, -0/+1)
* Remove the implementation of gmp primops from the rts (Duncan Coutts, 2009-06-13, 1 file, -6/+0)
* Remove the various mp registers from the StgRegTable (Duncan Coutts, 2009-06-10, 1 file, -7/+0)
  No longer need them as temp vars in the cmm primop implementations.
* Remove old GUM/GranSim code (Simon Marlow, 2009-06-02, 1 file, -8/+1)
* Use the more portable %lu rather than %zu (Ian Lynagh, 2009-05-24, 1 file, -8/+8)
  We now also need to cast the values to (unsigned long), as on some platforms sizeof returns (unsigned int).
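The cast-then-%lu pattern can be sketched as below. The helper `format_define` and the "#define NAME VALUE" output shape are assumptions made for illustration; the commit message only states that values are printed with %lu after a cast to (unsigned long).

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: writes "#define NAME VALUE\n" into buf and returns
 * the number of characters snprintf would have written.
 *
 * %zu is the C99 conversion for size_t but was not supported by every C
 * library at the time; %lu is available everywhere, provided the argument
 * really is an unsigned long, hence the explicit cast. */
int format_define(char *buf, size_t len, const char *name, size_t value) {
    assert(buf != NULL && name != NULL);
    return snprintf(buf, len, "#define %s %lu\n", name, (unsigned long)value);
}
```

Without the cast, passing a value whose type is narrower or wider than unsigned long to a %lu conversion is undefined behaviour, which is why the cast is needed on platforms where the result of sizeof is not unsigned long.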
* Fix warnings in mkDerivedConstants (Ian Lynagh, 2009-05-23, 1 file, -10/+10)
* FIX #1364: added support for C finalizers that run as soon as the value is no longer reachable (Simon Marlow, 2008-12-10, 1 file, -0/+1)
  Patch originally by Ivan Tomac <tomac@pacific.net.au>, amended by Simon Marlow:

  - mkWeakFinalizer# commoned up with mkWeakFinalizerEnv#
  - GC parameters to ALLOC_PRIM fixed
* Merging in the new codegen branch (dias@eecs.harvard.edu, 2008-08-14, 1 file, -10/+25)
  This merge does not turn on the new codegen (which only compiles a select few programs at this point), but it does introduce some changes to the old code generator.

  The high bits:
  1. The Rep Swamp patch is finally here. The highlight is that the representation of types at the machine level has changed. Consequently, this patch contains updates across several back ends.
  2. The new Stg -> Cmm path is here, although it appears to have a fair number of bugs lurking.
  3. Many improvements along the CmmCPSZ path, including:
     o stack layout
     o some code for infotables, half of which is right and half wrong
     o proc-point splitting
* Add optional eager black-holing, with new flag -feager-blackholing (Simon Marlow, 2008-11-18, 1 file, -0/+1)
  Eager blackholing can improve parallel performance by reducing the chances that two threads perform the same computation. However, it has a cost: one extra memory write per thunk entry.

  To get the best results, any code which may be executed in parallel should be compiled with eager blackholing turned on. But since there's a cost for sequential code, we make it optional and turn it on for the parallel package only. It might be a good idea to compile applications (or modules) with parallel code in with -feager-blackholing.

  ToDo: document -feager-blackholing.
* add readTVarIO :: TVar a -> IO a (Simon Marlow, 2008-10-10, 1 file, -0/+2)
* Move the context_switch flag into the Capability (Simon Marlow, 2008-09-19, 1 file, -0/+1)
  Fixes a long-standing bug that could in some cases cause sub-optimal scheduling behaviour.
* Add a write barrier to the TSO link field (#1589) (Simon Marlow, 2008-04-16, 1 file, -1/+1)
* Fix warnings in main/Constants (Ian Lynagh, 2008-03-25, 1 file, -4/+6)
* FIX recent PPC crashes introduced by the pointer-tagging patch (I hope) (Simon Marlow, 2007-08-01, 1 file, -4/+0)
  There was an accidental endian-dependency in changes related to RET_FUN. The changes in question weren't strictly necessary - they were left over from the original workaround for the compacting GC problems, so I've just reverted those changes in this patch, which should hopefully fix the PPC problems.
* Pointer Tagging (Simon Marlow, 2007-07-27, 1 file, -0/+4)
  This patch implements pointer tagging as per our ICFP'07 paper "Faster laziness using dynamic pointer tagging". It improves performance by 10-15% for most workloads, including GHC itself.

  The original patches were by Alexey Rodriguez Yakushev <mrchebas@gmail.com>, with additions and improvements by me. I've re-recorded the development as a single patch.

  The basic idea is this: we use the low 2 bits of a pointer to a heap object (3 bits on a 64-bit architecture) to encode some information about the object pointed to. For a constructor, we encode the "tag" of the constructor (e.g. True vs. False), for a function closure its arity. This enables some decisions to be made without dereferencing the pointer, which speeds up some common operations. In particular it enables us to avoid costly indirect jumps in many cases.

  More information in the commentary: http://hackage.haskell.org/trac/ghc/wiki/Commentary/Rts/HaskellExecution/PointerTagging
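The low-bits encoding described above can be sketched in C as follows. The names `tag_ptr`, `get_tag`, and `untag_ptr` are hypothetical helpers for illustration, not the RTS macros; the sketch assumes heap objects are word-aligned so that the low 2 (or 3) bits of an untagged pointer are always zero.

```c
#include <assert.h>
#include <stdint.h>

/* 2 tag bits on a 32-bit architecture, 3 on a 64-bit one. */
#define TAG_BITS (sizeof(void *) == 8 ? 3 : 2)
#define TAG_MASK (((uintptr_t)1 << TAG_BITS) - 1)

/* Store a small tag (e.g. a constructor number or a function arity)
 * in the unused low bits of an aligned pointer. */
static inline void *tag_ptr(void *p, uintptr_t tag) {
    assert(((uintptr_t)p & TAG_MASK) == 0 && tag <= TAG_MASK);
    return (void *)((uintptr_t)p | tag);
}

/* Read the tag without dereferencing the pointer. */
static inline uintptr_t get_tag(const void *p) {
    return (uintptr_t)p & TAG_MASK;
}

/* Strip the tag to recover the real address before dereferencing. */
static inline void *untag_ptr(const void *p) {
    return (void *)((uintptr_t)p & ~TAG_MASK);
}
```

A case analysis on a two-constructor type can then branch on `get_tag(p)` alone, which is the "decision without dereferencing" that the message credits for avoiding indirect jumps.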
* Use %d rather than %zd on Windows (Ian Lynagh, 2007-06-16, 1 file, -2/+8)
* Fix size mismatch errors in mkDerivedConstants.c (Ian Lynagh, 2007-06-15, 1 file, -3/+3)
* remove the ITBL_SIZE constants which were wrong, but fortunately unused (Simon Marlow, 2007-04-17, 1 file, -7/+0)
* Remove the itbls field of BCO, put itbls in with the literals (Simon Marlow, 2007-02-27, 1 file, -1/+0)
  This is a simplification & minor optimisation for GHCi
* Lightweight ticky-ticky profiling (Kirsten Chevalier, 2007-02-07, 1 file, -1/+2)
  The following changes restore ticky-ticky profiling to functionality from its formerly bit-rotted state. Sort of. (It got bit-rotted as part of the switch to the C-- back-end.)

  The way that ticky-ticky is supposed to work is documented in Section 5.7 of the GHC manual (though the manual doesn't mention that it hasn't worked since sometime around 6.0, alas). Changes from this are as follows (which I'll document on the wiki):

  * In the past, you had to build all of the libraries with way=t in order to use ticky-ticky, because it entailed a different closure layout. No longer. You still need to do make way=t in rts/ in order to build the ticky RTS, but you should now be able to mix ticky and non-ticky modules.
  * Some of the counters that worked in the past aren't implemented yet. I was originally just trying to get entry counts to work, so those should be correct. The list of counters was never documented in the first place, so I hope it's not too much of a disaster that some don't appear anymore. Someday, someone (perhaps me) should document all the counters and what they do. For now, all of the counters are either accurate (or at least as accurate as they always were), zero, or missing from the ticky profiling report altogether.

  This hasn't been particularly well-tested, but these changes shouldn't affect anything except when compiling with -fticky-ticky (famous last words...)

  Implementation details: I got rid of StgTicky.h, which in the past had the macros and declarations for all of the ticky counters. Now, those macros are defined in Cmm.h. StgTicky.h was still there for inclusion in C code. Now, any remaining C code simply cannot call the ticky macros -- or rather, they do call those macros, but from the perspective of C code, they're defined as no-ops. (This shouldn't be too big a problem.)

  I added a new file TickyCounter.h that has all the declarations for ticky counters, as well as dummy macros for use in C code. Someday, these declarations should really be automatically generated, since they need to be kept consistent with the macros defined in Cmm.h.

  Other changes include getting rid of the header that was getting added to closures before, and getting rid of various code having to do with eager blackholing and permanent indirections (the changes under compiler/ and rts/Updates.*).
* Split GC.c, and move storage manager into sm/ directory (Simon Marlow, 2006-10-24, 1 file, -0/+1)
  In preparation for parallel GC, split up the monolithic GC.c file into smaller parts. Also in this patch (and difficult to separate, unfortunately):

  - Don't include Stable.h in Rts.h, instead just include it where necessary.
  - Consistently use STATIC_INLINE in source files, and INLINE_HEADER in header files. STATIC_INLINE is now turned off when DEBUG is on, to make debugging easier.
  - The GC no longer takes the get_roots function as an argument. We weren't making use of this generalisation.
* STM invariants (tharris@microsoft.com, 2006-10-07, 1 file, -1/+12)
* new RTS flag: -V to modify the resolution of the RTS timer (Ian Lynagh, 2006-09-05, 1 file, -0/+2)
  Fixed version of an old patch by Simon Marlow. His description read:

  Also, now an arbitrarily short context switch interval may now be specified, as we increase the RTS ticker's resolution to match the requested context switch interval. This also applies to +RTS -i (heap profiling) and +RTS -I (the idle GC timer). +RTS -V is actually only required for increasing the resolution of the profile timer.
* Replace inline C functions with C-- macros in .cmm code (Simon Marlow, 2006-06-29, 1 file, -0/+1)
  So that we can build the RTS with the NCG.
* fix up slop-overwriting for THUNK_SELECTORS in DEBUG mode (Simon Marlow, 2006-06-27, 1 file, -0/+2)
* Asynchronous exception support for SMP (Simon Marlow, 2006-06-16, 1 file, -0/+4)
  This patch makes throwTo work with -threaded, and also refactors large parts of the concurrency support in the RTS to clean things up. We have some new files:

    RaiseAsync.{c,h}  asynchronous exception support
    Threads.{c,h}     general threading-related utils

  Some of the contents of these new files used to be in Schedule.c, which is smaller and cleaner as a result of the split.

  Asynchronous exception support in the presence of multiple running Haskell threads is rather tricky. In fact, to my annoyance there are still one or two bugs to track down, but the majority of the tests run now.
* Reorganisation of the source tree (Simon Marlow, 2006-04-07, 1 file, -0/+404)
  Most of the other users of the fptools build system have migrated to Cabal, and with the move to darcs we can now flatten the source tree without losing history, so here goes.

  The main change is that the ghc/ subdir is gone, and most of what it contained is now at the top level. The build system now makes no pretense at being multi-project, it is just the GHC build system.

  No doubt this will break many things, and there will be a period of instability while we fix the dependencies. A straightforward build should work, but I haven't yet fixed binary/source distributions. Changes to the Building Guide will follow, too.