path: root/includes/rts
Commit message [Author, Age, Files, Lines]
...
* Implement atomicReadMVar, fixing #4001. [Edward Z. Yang, 2013-07-09, 1 file, -12/+13]
|   We add the invariant to the MVar blocked threads queue that threads blocked on an atomic read are always at the front of the queue. This invariant is easy to maintain, since takers are only ever added to the end of the queue.
|
|   Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
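The queue invariant described in this commit can be illustrated with a minimal sketch. This is not the RTS code: `Waiter`, `WaitQueue`, `enqueue_reader`, `enqueue_taker` and `readers_first` are hypothetical names, and a plain singly linked list stands in for the real blocked-threads queue.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical queue node: one thread blocked on an MVar. */
typedef struct Waiter {
    int id;
    bool is_reader;           /* true: blocked on an atomic read; false: blocked on a take */
    struct Waiter *next;
} Waiter;

typedef struct {
    Waiter *head;
    Waiter *tail;
} WaitQueue;

/* Returns true if no reader appears after a taker in the queue. */
static bool readers_first(const WaitQueue *q) {
    bool seen_taker = false;
    for (const Waiter *w = q->head; w != NULL; w = w->next) {
        if (!w->is_reader) {
            seen_taker = true;
        } else if (seen_taker) {
            return false;
        }
    }
    return true;
}

/* Readers are pushed onto the front, so they precede every taker. */
static void enqueue_reader(WaitQueue *q, Waiter *w) {
    w->is_reader = true;
    w->next = q->head;
    q->head = w;
    if (q->tail == NULL) {
        q->tail = w;
    }
    assert(readers_first(q));
}

/* Takers are only ever appended to the end, which cannot break the invariant. */
static void enqueue_taker(WaitQueue *q, Waiter *w) {
    w->is_reader = false;
    w->next = NULL;
    if (q->tail == NULL) {
        q->head = w;
    } else {
        q->tail->next = w;
    }
    q->tail = w;
    assert(readers_first(q));
}
```

Because the two insertion points are fixed, the invariant needs no reordering pass: a write to the MVar can serve every reader at the head before it reaches the first taker.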
* Fix build on OS X [Ian Lynagh, 2013-06-22, 1 file, -0/+4]
|
* Optimise lockClosure when n_capabilities == 1; fixes #693 [Ian Lynagh, 2013-06-15, 1 file, -6/+22]
|   Based on a patch from Yuras Shumovich.
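The idea behind this optimisation can be sketched as follows, assuming a spin lock implemented by swapping a sentinel into the closure header. All names here (`n_caps`, `LOCKED`, `Closure`, `lock_closure`, `unlock_closure`) are hypothetical stand-ins, and the single-capability branch is an assumption about the shape of the fast path, not a copy of the patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the number of capabilities. */
static unsigned n_caps = 1;

/* Hypothetical sentinel meaning "this header is currently locked". */
#define LOCKED ((uintptr_t)0)

typedef struct {
    _Atomic uintptr_t header;
} Closure;

/* Returns the header value the closure had before it was locked. */
static uintptr_t lock_closure(Closure *c) {
    if (n_caps == 1) {
        /* Only one capability: nothing can race with us, so skip the
           atomic exchange and just read the header. */
        return atomic_load_explicit(&c->header, memory_order_relaxed);
    }
    for (;;) {
        uintptr_t old = atomic_exchange(&c->header, LOCKED);
        if (old != LOCKED) {
            return old;        /* we now hold the lock */
        }
        /* Another thread holds the lock; spin and retry. */
    }
}

/* Releases the lock by writing a real header value back. */
static void unlock_closure(Closure *c, uintptr_t header) {
    assert(header != LOCKED);
    atomic_store(&c->header, header);
}
```

The saving comes from avoiding the atomic read-modify-write instruction on the common single-capability path.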
* Maintain per-generation lists of weak pointers (#7847) [Takano Akio, 2013-06-15, 1 file, -0/+3]
|
* Allow multiple C finalizers to be attached to a Weak# [Takano Akio, 2013-06-15, 1 file, -4/+8]
|   The commit replaces mkWeakForeignEnv# with addCFinalizerToWeak#. This new primop mutates an existing Weak# object and adds a new C finalizer to it.
|
|   This change removes an invariant in MarkWeak.c, namely that the relative order of Weak# objects in the list needs to be preserved across GC. This makes it easier to split the list into per-generation structures.
|
|   The patch also removes a race condition between two threads calling finalizeWeak# on the same WEAK object at the same time.
* Whitespace only in rts/storage/SMPClosureOps.h [Ian Lynagh, 2013-06-14, 1 file, -7/+7]
|
* use libffi for iOS adjustors; fixes #7718 [Ian Lynagh, 2013-06-08, 1 file, -2/+5]
|   Based on a patch from Stephen Blackheath.
* fix comment (#7907) [Simon Marlow, 2013-05-21, 1 file, -1/+1]
|
* Expose __word_encode{Float,Double}; fixes integer-simple build [Ian Lynagh, 2013-05-19, 1 file, -0/+2]
|
* Move the genSym stuff from rts into compiler [Ian Lynagh, 2013-05-17, 1 file, -5/+0]
|   It's no longer used by Data.Unique, so there's no need to have it in rts any more.
* ticky enhancements [Nicolas Frisby, 2013-03-29, 1 file, -1/+1]
|   * the new StgCmmArgRep module breaks a dependency cycle; I also untabified it, but made no real changes
|   * updated the documentation in the wiki and changed the user guide to point there
|   * moved the allocation enters for ticky and CCS to after the heap check
|     * I left LDV where it was, which was before the heap check at least once, since I have no idea what it is
|   * standardized all (active?) ticky alloc totals to bytes
|   * in order to avoid double counting, StgCmmLayout.adjustHpBackwards no longer bumps ALLOC_HEAP_ctr
|   * I resurrected the SLOW_CALL counters
|     * the new module StgCmmArgRep breaks cyclic dependency between Layout and Ticky (which the SLOW_CALL counters cause)
|     * renamed them SLOW_CALL_fast_<pattern> and VERY_SLOW_CALL
|   * added ALLOC_RTS_ctr and _tot ticky counters
|     * eg allocation by Storage.c:allocate or a BUILD_PAP in stg_ap_*_info
|   * resurrected ticky counters for ALLOC_THK, ALLOC_PAP, and ALLOC_PRIM
|   * added -ticky and -DTICKY_TICKY in ways.mk for debug ways
|   * added a ticky counter for total LNE entries
|   * new flags for ticky: -ticky-allocd -ticky-dyn-thunk -ticky-LNE
|     * all off by default
|     * -ticky-allocd: tracks allocation *of* closure in addition to allocation *by* that closure
|     * -ticky-dyn-thunk tracks dynamic thunks as if they were functions
|     * -ticky-LNE tracks LNEs as if they were functions
|   * updated the ticky report format, including making the argument categories (more?) accurate again
|   * the printed name for things in the report includes the unique of their ticky parent as well as if they are not top-level
* Closures must be zeroed even without LDV-profiling. Partially fixes #7747 [Edward Z. Yang, 2013-03-07, 1 file, -4/+0]
|   Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
* Update source pointer. [Edward Z. Yang, 2013-03-02, 1 file, -1/+1]
|   Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
* Only emit %write_barrier primitive for THREADED_RTS [Gabor Greif, 2013-02-26, 1 file, -2/+2]
|
* Separate StablePtr and StableName tables (#7674) [Simon Marlow, 2013-02-14, 1 file, -7/+10]
|   To improve performance of StablePtr.
* Simplify the allocation stats accounting [Simon Marlow, 2013-02-14, 1 file, -1/+1]
|   We were doing it in two different ways and asserting that the results were the same. In most cases they were, but I found one case where they weren't: the GC itself allocates some memory for running finalizers, and this memory was accounted for one way but not the other.
|
|   It was simpler to remove the old way of counting allocation than to try to fix it up, so I did that.
* Added RTS hooks for the timer manager. [Andreas Voellmy, 2013-02-11, 1 file, -0/+2]
|
* Always pass vector values on the stack. [Geoffrey Mainland, 2013-02-01, 1 file, -17/+18]
|   Vector values are now always passed on the stack. This isn't particularly efficient, but it will have to do for now.
* typos [Gabor Greif, 2013-01-30, 1 file, -1/+1]
|
* fix warnings [Simon Marlow, 2013-01-30, 1 file, -1/+0]
|
* STM: Only wake up once [Ben Gamari, 2013-01-30, 1 file, -1/+3]
|   Previously, threads blocked on an STM retry would be sent a wakeup message each time an unpark was requested. This could result in the accumulation of a large number of wake-up messages, which would slow wake-up once the sleeping thread is finally scheduled.
|
|   Here, we introduce a new closure type, STM_AWOKEN, which marks a TSO which has been sent a wake-up message, allowing us to send only one wakeup.
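The effect of the marker can be sketched with a small state machine. This is only an illustration under assumptions: `WaitState`, `Thread`, `block_on_retry` and `request_unpark` are hypothetical names, and a counter stands in for the real message queue.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical thread states; AWOKEN plays the role of the STM_AWOKEN marker. */
typedef enum { BLOCKED_ON_STM, AWOKEN, RUNNING } WaitState;

typedef struct {
    WaitState state;
    int wakeups_sent;     /* number of wake-up messages queued for this thread */
} Thread;

/* Parks the thread on an STM retry and clears its message count. */
static void block_on_retry(Thread *t) {
    t->state = BLOCKED_ON_STM;
    t->wakeups_sent = 0;
}

/* Requests a wake-up. Returns true only if a message was actually sent. */
static bool request_unpark(Thread *t) {
    if (t->state != BLOCKED_ON_STM) {
        return false;     /* already marked as awoken, or not blocked at all */
    }
    t->state = AWOKEN;    /* mark first, so later requests become no-ops */
    t->wakeups_sent++;
    assert(t->wakeups_sent == 1);
    return true;
}
```

However many unpark requests arrive while the thread sleeps, it has exactly one message to process once it is scheduled.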
* Expose genericRaise; fixes signals004(dyn) on OS X 32 [Ian Lynagh, 2013-01-17, 1 file, -0/+3]
|
* Expose the prototype for getMonotonicNSec [Ian Lynagh, 2013-01-17, 1 file, -0/+19]
|   Fixes T3807 on OS X 32.
* typo [Gabor Greif, 2012-12-19, 1 file, -2/+2]
|
* Make enabled_capabilities visible (fixes dynamic linking) [Simon Marlow, 2012-12-13, 1 file, -0/+3]
|
* Add a write barrier for TVAR closures [Simon Marlow, 2012-11-16, 1 file, -18/+19]
|   This improves GC performance when there are a lot of TVars in the heap. For instance, a TChan with a lot of elements causes a massive GC drag without this patch.
|
|   There's more to do - several other STM closure types don't have write barriers, so GC performance when there are a lot of threads blocked on STM isn't great. But fixing the problem for TVar is a good start.
* Don't include a (void *) cast in BLOCK_ROUND_UP [Ian Lynagh, 2012-11-13, 1 file, -1/+1]
|   All uses of it cast the result anyway. However, DeriveConstants needs it to not include the cast, as (void *) casts can't be used in constant expressions.
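The constraint mentioned here can be shown with a small sketch. The names and the block size (`BLK_SIZE`, `BLK_MASK`, `ROUND_UP_INT`, `round_up_ptr`) are hypothetical and chosen for illustration; the real macro and its constants may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical block size, assumed to be a power of two. */
#define BLK_SIZE 4096u
#define BLK_MASK (BLK_SIZE - 1u)

/* No pointer cast inside the macro: the result is an integer, so the macro
   can appear in an integer constant expression. */
#define ROUND_UP_INT(n) (((uintptr_t)(n) + BLK_SIZE - 1u) & ~(uintptr_t)BLK_MASK)

/* Works where the compiler needs a constant. With a (void *) cast inside
   the macro, this enum would be rejected. */
enum { ROUNDED = ROUND_UP_INT(5000) };
static_assert(ROUNDED == 8192, "5000 rounds up to two 4096-byte blocks");

/* Callers that want a pointer apply the cast at the use site. */
static void *round_up_ptr(void *p) {
    return (void *)ROUND_UP_INT(p);
}
```

Moving the cast to the callers costs nothing at run time and lets build-time tools evaluate the macro.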
* The shape of StgTVar should not depend on THREADED_RTS [Simon Marlow, 2012-11-01, 1 file, -2/+0]
|   By sheer luck I think this didn't lead to any actual runtime crashes, but it did cause some problems for debugging.
* Draw STG F and D registers from the same pool of available SSE registers on x86-64. [Geoffrey Mainland, 2012-10-30, 1 file, -2/+3]
|   On x86-64 F and D registers are both drawn from SSE registers, so there is no reason not to draw them from the same pool of available SSE registers. This means that whereas previously a function could only receive two Double arguments in registers even if it did not have any Float arguments, now it can receive up to 6 arguments that are any mix of Float and Double in registers.
|
|   This patch breaks the LLVM back end. The next patch will fix this breakage.
* Add a new traceMarker# primop for use in profiling output [Duncan Coutts, 2012-10-15, 1 file, -3/+3]
|   In time-based profiling visualisations (e.g. heap profiles and ThreadScope) it would be useful to be able to mark particular points in the execution and have those points in time marked in the visualisation.
|
|   The traceMarker# primop currently emits an event into the eventlog. In principle it could be extended to do something in the heap profiling too.
* Produce new-style Cmm from the Cmm parser [Simon Marlow, 2012-10-08, 6 files, -150/+29]
|   The main change here is that the Cmm parser now allows high-level cmm code with argument-passing and function calls. For example:
|
|       foo ( gcptr a, bits32 b )
|       {
|         if (b > 0) {
|            // we can make tail calls passing arguments:
|            jump stg_ap_0_fast(a);
|         }
|
|         return (x,y);
|       }
|
|   More details on the new cmm syntax are in Note [Syntax of .cmm files] in CmmParse.y.
|
|   The old syntax is still more-or-less supported for those occasional code fragments that really need to explicitly manipulate the stack. However there are a couple of differences: it is now obligatory to give a list of live GlobalRegs on every jump, e.g.
|
|       jump %ENTRY_CODE(Sp(0)) [R1];
|
|   Again, more details in Note [Syntax of .cmm files].
|
|   I have rewritten most of the .cmm files in the RTS into the new syntax, except for AutoApply.cmm which is generated by the genapply program: this file could be generated in the new syntax instead and would probably be better off for it, but I ran out of enthusiasm.
|
|   Some other changes in this batch:
|
|    - The PrimOp calling convention is gone, primops now use the ordinary NativeNodeCall convention. This means that primops and "foreign import prim" code must be written in high-level cmm, but they can now take more than 10 arguments.
|    - CmmSink now does constant-folding (should fix #7219)
|    - .cmm files now go through the cmmPipeline, and as a result we generate better code in many cases. All the object files generated for the RTS .cmm files are now smaller. Performance should be better too, but I haven't measured it yet.
|    - RET_DYN frames are removed from the RTS, lots of code goes away
|    - we now have some more canned GC points to cover unboxed-tuples with 2-4 pointers, which will reduce code size a little.
* Another overhaul of the recent_activity / idle GC handling (#5991) [Simon Marlow, 2012-09-24, 1 file, -0/+1]
|   Improvements:
|
|    - we now turn off the timer signal in the non-threaded RTS after idleGCDelay. This should make the xmonad users on #5991 happy.
|    - we now turn off the timer signal after idleGCDelay even if the idle GC is disabled with +RTS -I0.
|    - we now do *not* turn off the timer when profiling.
|    - more comments to explain the meaning of the various ACTIVITY_* values
* Remove a redundant cast [Ian Lynagh, 2012-09-21, 1 file, -1/+1]
|
* Convert more RTS macros to functions [Ian Lynagh, 2012-09-21, 1 file, -5/+11]
|   Object sizes still unchanged.
* Convert more RTS macros to functions [Ian Lynagh, 2012-09-21, 1 file, -5/+12]
|   No size changes in the non-debug object files
* Cache the result of countOccupied(gen->large_objects) as gen->n_large_words (#7257) [Simon Marlow, 2012-09-21, 1 file, -0/+1]
|   The program in #7257 was spending 90% of its time counting the live data in gen->large_objects. We already avoid doing this for small objects, but in this example the old generation was full of large objects (actually pinned ByteStrings).
* Lots of nat -> StgWord changes [Simon Marlow, 2012-09-07, 3 files, -10/+10]
|
* Deprecate lnat, and use StgWord instead [Simon Marlow, 2012-09-07, 6 files, -15/+17]
|   lnat was originally "long unsigned int" but we were using it when we wanted a 64-bit type on a 64-bit machine. This broke on Windows x64, where long == int == 32 bits. Using types of unspecified size is bad, but what we really wanted was a type with N bits on an N-bit machine. StgWord is exactly that.
|
|   lnat was mentioned in some APIs that clients might be using (e.g. StackOverflowHook()), so we leave it defined but with a comment to say that it's deprecated.
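The "N bits on an N-bit machine" requirement can be sketched in standard C. This is an illustration, not the StgWord definition: `word_t`, `word_bits` and `roundtrip` are hypothetical names, and `uintptr_t` is assumed to be exactly pointer-sized, which holds on mainstream platforms but is not guaranteed by the C standard.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical word type: follows the pointer width, unlike
   "unsigned long", which stays at 32 bits on Windows x64. */
typedef uintptr_t word_t;

/* Fails at compile time on any platform where the assumption breaks. */
static_assert(sizeof(word_t) == sizeof(void *), "word_t must be pointer-sized");

/* Number of bits in a machine word on this platform. */
static unsigned word_bits(void) {
    return (unsigned)(sizeof(word_t) * CHAR_BIT);
}

/* Stores a pointer in a word and converts it back without loss. */
static void *roundtrip(void *p) {
    word_t w = (word_t)p;
    return (void *)w;
}
```

Checking the size with a static assertion turns a silent truncation bug into a build failure.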
* Fix return type of FUN_INFO_PTR_TO_STRUCT. [Erik de Castro Lopo, 2012-08-28, 1 file, -1/+1]
|   Return type was correct when TABLES_NEXT_TO_CODE was defined.
* More CPP macros -> inline functions [Ian Lynagh, 2012-08-25, 1 file, -11/+15]
|   All the wibbles seem to have cancelled out, and (non-debug) object sizes are back to where they started.
|
|   I'm not 100% sure that the types are optimal, but at least now the functions have types and we can fix them if necessary.
* More CPP macros -> inline functions [Ian Lynagh, 2012-08-25, 1 file, -17/+15]
|
* More CPP macro -> inline function [Ian Lynagh, 2012-08-25, 1 file, -2/+4]
|
* Convert a couple more macros to inline functions [Ian Lynagh, 2012-08-25, 1 file, -2/+7]
|   This caused a couple of .o files to change size. I had a look at one, and it seems to be caused by the difference in size of these two instructions:
|
|       49 8b 5d 08       mov 0x8(%r13),%rbx
|       49 8b 5c 24 08    mov 0x8(%r12),%rbx
|
|   (with a few nops being added or removed later in the file, presumably for alignment reasons).
* Make a function for get_itbl, rather than using a CPP macro [Ian Lynagh, 2012-08-25, 1 file, -6/+7]
|   This has several advantages:
|   * It can be called from gdb
|   * There is more type information for the user, and type checking for the compiler
|   * Less opportunity for things to go wrong, e.g. due to missing parentheses or repeated execution
|
|   The sizes of the non-debug .o files haven't changed (other than Inlines.o), so I'm pretty sure the compiled code is identical.
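The "repeated execution" hazard named in this commit can be reproduced in a few lines. The sketch below uses made-up names (`next_value`, `DOUBLE_MACRO`, `double_fn`, `compare_forms`) and has nothing to do with get_itbl itself; it only contrasts a function-like macro with an inline function.

```c
#include <assert.h>

/* Counts how often next_value() has run. */
static int calls = 0;

static int next_value(void) {
    calls++;
    return calls;
}

/* Macro: the argument text is pasted twice, so its side effects run twice. */
#define DOUBLE_MACRO(x) ((x) + (x))

/* Inline function: the argument is evaluated exactly once and is type-checked. */
static inline int double_fn(int x) {
    return x + x;
}

/* Small self-check that exercises both forms. */
static void compare_forms(void) {
    calls = 0;
    (void)DOUBLE_MACRO(next_value());
    assert(calls == 2);      /* the macro ran the call twice */

    calls = 0;
    (void)double_fn(next_value());
    assert(calls == 1);      /* the function ran it once */
}
```

An inline function also has an address and a symbol, which is what makes it callable from a debugger.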
* move startProfTimer() and stopProfTimer() to the public headers [Simon Marlow, 2012-08-21, 1 file, -0/+9]
|
* Merge branch 'master' of darcs.haskell.org:/srv/darcs//ghc [Ian Lynagh, 2012-07-19, 1 file, -2/+2]
|\
| * use idiomatic type [Gabor Greif, 2012-07-18, 1 file, -2/+2]
| |
* | Define the task-tracking events [Duncan Coutts, 2012-07-10, 1 file, -3/+10]
| |   Based on initial patches by Mikolaj Konarski <mikolaj@well-typed.com>
| |
| |   These new eventlog events are to let profiling tools keep track of all the OS threads that belong to an RTS capability at any moment in time. In the RTS, OS threads correspond to the Task abstraction, so that is what we track. There are events for tasks being created, migrated between capabilities and deleted. In particular the task creation event also records the kernel thread id which lets us match up the OS thread with data collected by other tools (in the initial use case with Linux's perf tool, but in principle also with DTrace).
* | New functions to get kernel thread Id + serialisable task Id [Duncan Coutts, 2012-07-07, 1 file, -2/+25]
|/
|   On most platforms the userspace thread type (e.g. pthread_t) and kernel thread id are different. Normally we don't care about kernel thread Ids, but some system tools for tracing/profiling etc report kernel ids. For example Solaris and OSX's DTrace and Linux's perf tool report kernel thread ids. To be able to match these up with RTS's OSThread we need a way to get at the kernel thread, so we add a new function to do just that (the implementation is system-dependent).
|
|   Additionally, strictly speaking the OSThreadId type, used as task ids, is not a serialisable representation. On unix OSThreadId is a typedef for pthread_t, but pthread_t is not guaranteed to be a numeric type. Indeed on some systems pthread_t is a pointer and in principle it could be a structure type. So we add another new function to get a serialisable representation of an OSThreadId. This is only for use in log files. We use the function to serialise an id of a task, with the extra feature that it works in non-threaded builds by always returning 1.
* Add getGCStatsEnabled function. [Paolo Capriotti, 2012-06-19, 1 file, -0/+1]
|