path: root/includes/mkDerivedConstants.c
Commit message  [Author, Age, Files, Lines]
* Replace mkDerivedConstants.c with DeriveConstants.hs  [Ian Lynagh, 2012-11-12, files: 1, lines: -829/+0]
| DeriveConstants.hs works in a cross-compilation-friendly way. Rather than running a C program that prints out the constants, we just compile a C file in which the constants are encoded in symbol sizes. We then parse the output of 'nm' to find out what the constants are.
|
| Based on work by Gabor Greif <ggreif@gmail.com>.
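The symbol-size encoding described above can be sketched roughly as follows. This is an illustration only, not the code DeriveConstants.hs generates: the struct, the `derived_constant_` prefix, the macro names and the `+1` bias (used here so that a constant of 0 still yields a non-empty array) are all assumptions of the sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an RTS structure; not a real GHC type. */
struct demo_closure {
    void *info;
    long  payload[2];
};

/* Encode a constant as the size of a global char array. The +1 keeps the
 * array non-empty when the constant is 0 (an assumption of this sketch). */
#define ENCODE_CONSTANT(name, value) \
    char derived_constant_##name[1 + (value)]

ENCODE_CONSTANT(OFFSET_demo_payload, offsetof(struct demo_closure, payload));
ENCODE_CONSTANT(SIZEOF_demo_closure, sizeof(struct demo_closure));

/* Inverse of ENCODE_CONSTANT. An external tool reading symbol sizes from
 * the object file would apply the same "- 1" to each size it finds. */
#define DECODE_CONSTANT(name) (sizeof(derived_constant_##name) - 1)

/* Checks that each encoded constant decodes to the value it was built from. */
void check_demo_constants(void)
{
    assert(DECODE_CONSTANT(OFFSET_demo_payload)
           == offsetof(struct demo_closure, payload));
    assert(DECODE_CONSTANT(SIZEOF_demo_closure)
           == sizeof(struct demo_closure));
}
```

Because the values live in the object file's symbol table, the file only has to be compiled for the target, never executed. With GNU binutils, the sizes could then be listed with something like `nm --print-size` on the resulting object file.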
* Give an error if we can't find a suitable value for PRIdPTR  [Ian Lynagh, 2012-11-08, files: 1, lines: -1/+3]
|
* define own version of PRIdPTR on platforms where it's not available  [Karel Gardas, 2012-11-08, files: 1, lines: -0/+10]
| Note that PRIdPTR is considered a Linux-ism, so it is not available on platforms like Solaris, although some other free Unix(-like) OSes apparently support it too.
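A fallback of the kind these two commits describe might look like the sketch below. The selection logic (comparing INTPTR_MAX against the limits of int, long and long long) is an assumption made for illustration; the actual patch may have chosen the format differently. The helper `intptr_formats_as` is hypothetical.

```c
#include <assert.h>
#include <inttypes.h>
#include <limits.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Provide PRIdPTR only where <inttypes.h> did not; fail loudly if no
 * standard integer type matches the width of intptr_t. */
#ifndef PRIdPTR
#  if INTPTR_MAX == INT_MAX
#    define PRIdPTR "d"
#  elif INTPTR_MAX == LONG_MAX
#    define PRIdPTR "ld"
#  elif INTPTR_MAX == LLONG_MAX
#    define PRIdPTR "lld"
#  else
#    error "Cannot find a suitable value for PRIdPTR"
#  endif
#endif

/* Returns 1 if `value` prints as `expected` when formatted with PRIdPTR. */
int intptr_formats_as(intptr_t value, const char *expected)
{
    char buf[32];
    assert(expected != NULL);
    snprintf(buf, sizeof buf, "%" PRIdPTR, value);
    return strcmp(buf, expected) == 0;
}
```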
* small optimisation: inline stmNewTVar()  [Simon Marlow, 2012-11-05, files: 1, lines: -0/+3]
|
* Draw STG F and D registers from the same pool of available SSE registers on x86-64.  [Geoffrey Mainland, 2012-10-30, files: 1, lines: -0/+8]
| On x86-64 F and D registers are both drawn from SSE registers, so there is no reason not to draw them from the same pool of available SSE registers. This means that whereas previously a function could only receive two Double arguments in registers even if it did not have any Float arguments, now it can receive up to 6 arguments that are any mix of Float and Double in registers.
|
| This patch breaks the LLVM back end. The next patch will fix this breakage.
* Add some missing parentheses to mkDerivedConstants.c  [Ian Lynagh, 2012-10-26, files: 1, lines: -2/+2]
| This was breaking the build on s390. Not sure why it didn't bite on any other platforms.
* Build the dynamic way by default on Linux/amd64  [Ian Lynagh, 2012-10-03, files: 1, lines: -0/+8]
| This required various build system changes to get the build to go through.
|
| In the inplace shell wrappers, we set LD_LIBRARY_PATH to allow programs to find their libraries. In the future, we might change the inplace tree to be the same shape as an installed tree instead. However, this would mean changing the way we do installation, as currently we use cabal's installation methods to install the libraries, but that only works if the libraries are under libraries/foo/dist-install/build/..., rather than in inplace/lib/...
* Don't put unused constants in platformConstants  [Ian Lynagh, 2012-09-20, files: 1, lines: -162/+186]
| This makes compiling DynFlags a lot quicker
* We don't actually need a Show instance for the PlatformConstants type  [Ian Lynagh, 2012-09-20, files: 1, lines: -1/+1]
| and creating one is quite slow
* Add the necessary REP_* constants to platformConstants  [Ian Lynagh, 2012-09-19, files: 1, lines: -14/+28]
|
* Add some LDV_* constants to platformConstants  [Ian Lynagh, 2012-09-19, files: 1, lines: -18/+32]
|
* Remove some uses of the WORDS_BIGENDIAN CPP symbol  [Ian Lynagh, 2012-09-18, files: 1, lines: -0/+29]
|
* Move tAG_BITS into platformConstants  [Ian Lynagh, 2012-09-16, files: 1, lines: -0/+3]
|
* Move more constants to platformConstants  [Ian Lynagh, 2012-09-16, files: 1, lines: -0/+11]
|
* Move wORD_SIZE into platformConstants  [Ian Lynagh, 2012-09-16, files: 1, lines: -0/+3]
|
* Move some more constants into platformConstants  [Ian Lynagh, 2012-09-14, files: 1, lines: -0/+10]
|
* Move more constants to platformConstants  [Ian Lynagh, 2012-09-14, files: 1, lines: -16/+27]
|
* MAX_REAL_LONG_REG is always defined, so no need to test it  [Ian Lynagh, 2012-09-14, files: 1, lines: -7/+1]
|
* Move more constants into platformConstants  [Ian Lynagh, 2012-09-14, files: 1, lines: -0/+16]
|
* Move some more constants to platformConstants  [Ian Lynagh, 2012-09-14, files: 1, lines: -0/+12]
|
* Check for Int constants that are too large in mkDerivedConstants  [Ian Lynagh, 2012-09-14, files: 1, lines: -0/+14]
|
* Start moving other constants from (Haskell)Constants to platformConstants  [Ian Lynagh, 2012-09-14, files: 1, lines: -0/+24]
|
* Fix build on OS X  [Ian Lynagh, 2012-09-14, files: 1, lines: -1/+1]
|
* Use intptr_t for offset values in mkDerivedConstants  [Ian Lynagh, 2012-09-13, files: 1, lines: -2/+3]
| This means that we get e.g.
|     pc_OFFSET_stgEagerBlackholeInfo = -24
| rather than
|     pc_OFFSET_stgEagerBlackholeInfo = 18446744073709551592
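The difference between the two printed values comes down to whether the same bit pattern is interpreted as signed or unsigned. A minimal sketch, with hypothetical helper names and assuming a 64-bit target for the unsigned figure quoted above:

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Format an offset as an unsigned word: negative offsets wrap around. */
int format_offset_unsigned(char *buf, size_t len, intptr_t offset)
{
    return snprintf(buf, len, "%" PRIuPTR, (uintptr_t)offset);
}

/* Format an offset as a signed word: negative offsets stay negative. */
int format_offset_signed(char *buf, size_t len, intptr_t offset)
{
    return snprintf(buf, len, "%" PRIdPTR, offset);
}

/* Reproduces the two values from the commit message for an offset of -24. */
void check_offset_formatting(void)
{
    char buf[32];

    format_offset_signed(buf, sizeof buf, -24);
    assert(strcmp(buf, "-24") == 0);

    format_offset_unsigned(buf, sizeof buf, -24);
    if (sizeof(uintptr_t) == 8) {
        /* 2^64 - 24, only meaningful when pointers are 64 bits wide. */
        assert(strcmp(buf, "18446744073709551592") == 0);
    }
}
```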
* Remove the --gen-haskell mode of mkDerivedConstants  [Ian Lynagh, 2012-09-13, files: 1, lines: -23/+2]
| It no longer generates anything
* Use oFFSET_* from platformConstants rather than Constants  [Ian Lynagh, 2012-09-13, files: 1, lines: -5/+3]
|
* Use sIZEOF_* from platformConstants rather than Constants  [Ian Lynagh, 2012-09-13, files: 1, lines: -5/+3]
|
* Add a couple more mkDerivedConstants modes  [Ian Lynagh, 2012-09-13, files: 1, lines: -1/+46]
| We now also generate nice wrappers for the platformConstants methods. For now it's all commented out as the definitions conflict with those in Constants.
* Make the Windows-specific part of mkDerivedConstants.c conditional  [Ian Lynagh, 2012-09-13, files: 1, lines: -4/+9]
| It is only generated when mode is Gen_Header; i.e. it's not used in the compiler, only the RTS.
* Add more modes to mkDerivedConstants  [Ian Lynagh, 2012-09-13, files: 1, lines: -1/+62]
| We now generate a platformConstants file that we can read at runtime.
* Use conditionals rather than CPP in mkDerivedConstants  [Ian Lynagh, 2012-09-13, files: 1, lines: -94/+147]
| This means we only need to build one copy of the program, which will make life simpler as I plan to add more variants.
* Deprecate lnat, and use StgWord instead  [Simon Marlow, 2012-09-07, files: 1, lines: -1/+1]
| lnat was originally "long unsigned int" but we were using it when we wanted a 64-bit type on a 64-bit machine. This broke on Windows x64, where long == int == 32 bits. Using types of unspecified size is bad, but what we really wanted was a type with N bits on an N-bit machine. StgWord is exactly that.
|
| lnat was mentioned in some APIs that clients might be using (e.g. StackOverflowHook()), so we leave it defined but with a comment to say that it's deprecated.
* GHCConstants.h should not contain preprocessor definitions  [Gabor Greif, 2012-07-29, files: 1, lines: -0/+25]
|
* Fix warnings on Win64  [Ian Lynagh, 2012-04-26, files: 1, lines: -1/+1]
| Mostly this meant getting pointer<->int conversions to use the right sizes. lnat is now size_t, rather than unsigned long, as that seems a better match for how it's used.
* Win64 warning fix  [Ian Lynagh, 2012-04-24, files: 1, lines: -0/+1]
|
* Fix mkDerivedConstants on Win64  [Ian Lynagh, 2012-03-19, files: 1, lines: -10/+10]
| It was assuming that long's are word-sized, which is not the case on Win64.
* abstract away from the 'build-toolchain'-dependent sizeof(...) operator  [Gabor Greif, 2012-01-06, files: 1, lines: -10/+14]
| The sizes obtained this way do not work on a target system in general. So in a future cross-compilable setup we need another way of obtaining expansions for the macros OFFSET, FIELD_SIZE and TYPE_SIZE.
|
| Guarded against accidental use of 'sizeof' by poisoning.
|
| Verified that the generated *Constants.h/hs files are unchanged.
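One way to route every size query through the three macros and then forbid direct use of `sizeof` is sketched below. The macro names OFFSET, FIELD_SIZE and TYPE_SIZE come from the commit message, but their definitions here are guesses, the struct is invented, and the `DEMO_POISON_SIZEOF` switch is a hypothetical guard added so the sketch can also be built with poisoning turned off. `#pragma GCC poison` is a GCC extension (also accepted by Clang); macros defined before the pragma may still expand to the poisoned identifier.

```c
#include <assert.h>
#include <stddef.h>

/* All size and offset queries go through these macros, so a future
 * cross-compiling setup could swap in different expansions. */
#define TYPE_SIZE(type)            sizeof(type)
#define FIELD_SIZE(s_type, field)  sizeof(((s_type *)0)->field)
#define OFFSET(s_type, field)      offsetof(s_type, field)

/* After this point any direct use of sizeof is a compile-time error.
 * DEMO_POISON_SIZEOF is a hypothetical switch for this sketch only. */
#ifdef DEMO_POISON_SIZEOF
#pragma GCC poison sizeof
#endif

/* Hypothetical structure standing in for an RTS type. */
struct demo_tso {
    void          *link;
    unsigned short flags;
};

/* Basic consistency checks expressed only through the macros. */
void check_demo_layout(void)
{
    assert(OFFSET(struct demo_tso, link) == 0);
    assert(FIELD_SIZE(struct demo_tso, flags) == TYPE_SIZE(unsigned short));
    assert(TYPE_SIZE(struct demo_tso)
           >= TYPE_SIZE(void *) + TYPE_SIZE(unsigned short));
}
```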
* Rename the CCCS field of StgTSO so as not to conflict with the CCCS pseudo-register  [Simon Marlow, 2012-01-05, files: 1, lines: -1/+1]
| Needed by #5357
* Fix a scheduling bug in the threaded RTS  [Simon Marlow, 2011-12-01, files: 1, lines: -0/+1]
| The parallel GC was using setContextSwitches() to stop all the other threads, which sets the context_switch flag on every Capability. That had the side effect of causing every Capability to also switch threads, and since GCs can be much more frequent than context switches, this increased the context switch frequency. When context switches are expensive (because the switch is between two bound threads or a bound and unbound thread), the difference is quite noticeable.
|
| The fix is to have a separate flag to indicate that a Capability should stop and return to the scheduler, but not switch threads. I've called this the "interrupt" flag.
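The two-flag idea can be illustrated with a deliberately simplified, single-threaded model. Everything here (the struct, the enum, the function names) is hypothetical and ignores the atomicity and memory-ordering concerns a real threaded RTS has to deal with; only the flag names `context_switch` and `interrupt` are taken from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, much-reduced model of a Capability. */
struct demo_capability {
    bool context_switch;  /* stop and hand over to another thread */
    bool interrupt;       /* stop and return to the scheduler only */
};

enum demo_action {
    DEMO_CONTINUE,
    DEMO_RETURN_TO_SCHEDULER,
    DEMO_SWITCH_THREAD
};

/* What a GC request would do in this model: ask every capability to
 * stop, without also forcing a thread switch. */
void demo_request_stop(struct demo_capability *caps, size_t n)
{
    assert(caps != NULL);
    for (size_t i = 0; i < n; i++) {
        caps[i].interrupt = true;
    }
}

/* Polled by the running thread; clears the flags it acts on. */
enum demo_action demo_poll(struct demo_capability *cap)
{
    assert(cap != NULL);
    if (cap->context_switch) {
        cap->context_switch = false;
        cap->interrupt = false;
        return DEMO_SWITCH_THREAD;
    }
    if (cap->interrupt) {
        cap->interrupt = false;
        return DEMO_RETURN_TO_SCHEDULER;
    }
    return DEMO_CONTINUE;
}
```

In this model a stop request leaves `context_switch` untouched, so frequent GCs no longer raise the thread-switch rate.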
* Make profiling work with multiple capabilities (+RTS -N)  [Simon Marlow, 2011-11-29, files: 1, lines: -0/+1]
| This means that both time and heap profiling work for parallel programs. Main internal changes:
|
|  - CCCS is no longer a global variable; it is now another pseudo-register in the StgRegTable struct. Thus every Capability has its own CCCS.
|
|  - There is a new built-in CCS called "IDLE", which records ticks for Capabilities in the idle state. If you profile a single-threaded program with +RTS -N2, you'll see about 50% of time in "IDLE".
|
|  - There is appropriate locking in rts/Profiling.c to protect the shared cost-centre-stack data structures.
|
| This patch does enough to get it working, but I have cut one big corner: the cost-centre-stack data structure is still shared amongst all Capabilities, which means that multiple Capabilities will race when updating the "allocations" and "entries" fields of a CCS. Not only does this give unpredictable results, but it runs very slowly due to cache line bouncing.
|
| It is strongly recommended that you use -fno-prof-count-entries to disable the "entries" count when profiling parallel programs. (I shall add a note to this effect to the docs).
* GHC.Prim.threadStatus# now returns the cap number, and the value of TSO_LOCKED  [Simon Marlow, 2011-03-01, files: 1, lines: -0/+1]
|
* Remove the per-generation mutable lists  [Simon Marlow, 2011-02-02, files: 1, lines: -1/+0]
| Now that we use the per-capability mutable lists exclusively.
* Count allocations more accurately  [Simon Marlow, 2010-12-21, files: 1, lines: -1/+1]
| The allocation stats (+RTS -s etc.) used to count the slop at the end of each nursery block (except the last) as allocated space; now we count the allocated words accurately. This should make allocation figures more predictable, too.
|
| This has the side effect of reducing the apparent allocations by a small amount (~1%), so remember to take this into account when looking at nofib results.
* Implement stack chunks and separate TSO/STACK objects  [Simon Marlow, 2010-12-15, files: 1, lines: -3/+6]
| This patch makes two changes to the way stacks are managed:
|
| 1. The stack is now stored in a separate object from the TSO.
|
| This means that it is easier to replace the stack object for a thread when the stack overflows or underflows; we don't have to leave behind the old TSO as an indirection any more. Consequently, we can remove ThreadRelocated and deRefTSO(), which were a pain.
|
| This is obviously the right thing, but the last time I tried to do it it made performance worse. This time I seem to have cracked it.
|
| 2. Stacks are now represented as a chain of chunks, rather than a single monolithic object.
|
| The big advantage here is that individual chunks are marked clean or dirty according to whether they contain pointers to the young generation, and the GC can avoid traversing clean stack chunks during a young-generation collection. This means that programs with deep stacks will see a big saving in GC overhead when using the default GC settings.
|
| A secondary advantage is that there is much less copying involved as the stack grows. Programs that quickly grow a deep stack will see big improvements.
|
| In some ways the implementation is simpler, as nothing special needs to be done to reclaim stack as the stack shrinks (the GC just recovers the dead stack chunks). On the other hand, we have to manage stack underflow between chunks, so there's a new stack frame (UNDERFLOW_FRAME), and we now have separate TSO and STACK objects. The total amount of code is probably about the same as before.
|
| There are new RTS flags:
|
|    -ki<size>  Sets the initial thread stack size (default 1k)  Egs: -ki4k -ki2m
|    -kc<size>  Sets the stack chunk size (default 32k)
|    -kb<size>  Sets the stack chunk buffer size (default 1k)
|
| -ki was previously called just -k, and the old name is still accepted for backwards compatibility. These new options are documented.
* Catch too-large allocations and emit an error message (#4505)  [Simon Marlow, 2010-12-09, files: 1, lines: -0/+2]
| This is a temporary measure until we fix the bug properly (which is somewhat tricky, and we think might be easier in the new code generator).
|
| For now we get:
|
|    ghc-stage2: sorry! (unimplemented feature or known bug)
|      (GHC version 7.1 for i386-unknown-linux):
|        Trying to allocate more than 1040384 bytes.
|
|    See: http://hackage.haskell.org/trac/ghc/ticket/4550
|
|    Suggestion: read data from a file instead of having large static data structures in the code.
* add numSparks# primop (#4167)  [Simon Marlow, 2010-07-20, files: 1, lines: -0/+1]
|
* FIX #38000 Store StgArrWords payload size in bytes  [Antoine Latter, 2010-01-01, files: 1, lines: -1/+1]
|
* Change the representation of the MVar blocked queue  [Simon Marlow, 2010-04-01, files: 1, lines: -0/+4]
| The list of threads blocked on an MVar is now represented as a list of separately allocated objects rather than being linked through the TSOs themselves. This lets us remove a TSO from the list in O(1) time rather than O(n) time, by marking the list object. Removing this linear component fixes some pathological performance cases where many threads were blocked on an MVar and became unreachable simultaneously (nofib/smp/threads007), or when sending an asynchronous exception to a TSO in a long list of threads blocked on an MVar.
|
| MVar performance has actually improved by a few percent as a result of this change, slightly to my surprise.
|
| This is the final cleanup in the sequence, which let me remove the old way of waking up threads (unblockOne(), MSG_WAKEUP) in favour of the new way (tryWakeupThread and MSG_TRY_WAKEUP, which is idempotent). It is now the case that only the Capability that owns a TSO may modify its state (well, almost), and this simplifies various things. More of the RTS is based on message-passing between Capabilities now.
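The O(1) removal by marking can be sketched with a small queue of separately allocated nodes. This is an illustrative model rather than the RTS data structure: all type and function names are hypothetical, and a NULL `tso` field is used here as the "removed" mark.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical thread object; the queue links no longer live inside it. */
struct demo_tso { int id; };

/* Separately allocated queue entry. A NULL tso marks a cancelled entry. */
struct demo_queue_node {
    struct demo_queue_node *next;
    struct demo_tso        *tso;
};

struct demo_queue {
    struct demo_queue_node *head;
    struct demo_queue_node *tail;
};

/* Append a caller-provided node for `tso` to the end of the queue. */
void demo_enqueue(struct demo_queue *q, struct demo_queue_node *node,
                  struct demo_tso *tso)
{
    assert(q != NULL && node != NULL && tso != NULL);
    node->next = NULL;
    node->tso = tso;
    if (q->tail != NULL) {
        q->tail->next = node;
    } else {
        q->head = node;
    }
    q->tail = node;
}

/* O(1) removal: mark the node instead of searching and unlinking it. */
void demo_cancel(struct demo_queue_node *node)
{
    assert(node != NULL);
    node->tso = NULL;
}

/* Pop the first live entry, discarding marked nodes along the way. */
struct demo_tso *demo_dequeue(struct demo_queue *q)
{
    assert(q != NULL);
    while (q->head != NULL) {
        struct demo_queue_node *node = q->head;
        q->head = node->next;
        if (q->head == NULL) {
            q->tail = NULL;
        }
        if (node->tso != NULL) {
            return node->tso;
        }
    }
    return NULL;
}
```

The cost of skipping a marked node is paid once, at dequeue time, so a thread in the middle of a long queue can be removed without walking the list.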
* New implementation of BLACKHOLEs  [Simon Marlow, 2010-03-29, files: 1, lines: -0/+12]
| This replaces the global blackhole_queue with a clever scheme that enables us to queue up blocked threads on the closure that they are blocked on, while still avoiding atomic instructions in the common case.
|
| Advantages:
|
|  - gets rid of a locked global data structure and some tricky GC code (replacing it with some per-thread data structures and different tricky GC code :)
|
|  - wakeups are more prompt: parallel/concurrent performance should benefit. I haven't seen anything dramatic in the parallel benchmarks so far, but a couple of threading benchmarks do improve a bit.
|
|  - waking up a thread blocked on a blackhole is now O(1) (e.g. if it is the target of throwTo).
|
|  - less sharing and better separation of Capabilities: communication is done with messages, the data structures are strictly owned by a Capability and cannot be modified except by sending messages.
|
|  - this change will ultimately enable us to do more intelligent scheduling when threads block on each other. This is what started off the whole thing, but it isn't done yet (#3838).
|
| I'll be documenting all this on the wiki in due course.
* avoid using non-standard %zd format specifier (#3804)  [Simon Marlow, 2010-01-26, files: 1, lines: -8/+2]
|