summaryrefslogtreecommitdiff
path: root/compiler/codeGen/StgCmmForeign.hs
Commit message (Collapse)AuthorAgeFilesLines
* Per-thread allocation counters and limitsSimon Marlow2014-11-121-72/+202
| | | | | | | | This reverts commit f0fcc41d755876a1b02d1c7c79f57515059f6417. New changes: now works on 32-bit platforms too. I added some basic support for 64-bit subtraction and comparison operations to the x86 NCG.
* Make Applicative a superclass of MonadAustin Seipp2014-09-091-0/+5
| | | | | | | | | | | | | | | | | | | | | Summary: This includes pretty much all the changes needed to make `Applicative` a superclass of `Monad` finally. There's mostly reshuffling in the interests of avoid orphans and boot files, but luckily we can resolve all of them, pretty much. The only catch was that Alternative/MonadPlus also had to go into Prelude to avoid this. As a result, we must update the hsc2hs and haddock submodules. Signed-off-by: Austin Seipp <austin@well-typed.com> Test Plan: Build things, they might not explode horribly. Reviewers: hvr, simonmar Subscribers: simonmar Differential Revision: https://phabricator.haskell.org/D13
* Add LANGUAGE pragmas to compiler/ source filesHerbert Valerio Riedel2014-05-151-0/+2
| | | | | | | | | | | | | | | | | | In some cases, the layout of the LANGUAGE/OPTIONS_GHC lines has been reorganized, while following the convention, to - place `{-# LANGUAGE #-}` pragmas at the top of the source file, before any `{-# OPTIONS_GHC #-}`-lines. - Moreover, if the list of language extensions fit into a single `{-# LANGUAGE ... -#}`-line (shorter than 80 characters), keep it on one line. Otherwise split into `{-# LANGUAGE ... -#}`-lines for each individual language extension. In both cases, try to keep the enumeration alphabetically ordered. (The latter layout is preferable as it's more diff-friendly) While at it, this also replaces obsolete `{-# OPTIONS ... #-}` pragma occurences by `{-# OPTIONS_GHC ... #-}` pragmas.
* Revert "Per-thread allocation counters and limits"Simon Marlow2014-05-041-196/+72
| | | | | | | | Problems were found on 32-bit platforms, I'll commit again when I have a fix. This reverts the following commits: 54b31f744848da872c7c6366dea840748e01b5cf b0534f78a73f972e279eed4447a5687bd6a8308e
* Per-thread allocation counters and limitsSimon Marlow2014-05-021-72/+196
| | | | | | | | | | | | | | | | | | | | | | | This tracks the amount of memory allocation by each thread in a counter stored in the TSO. Optionally, when the counter drops below zero (it counts down), the thread can be sent an asynchronous exception: AllocationLimitExceeded. When this happens, given a small additional limit so that it can handle the exception. See documentation in GHC.Conc for more details. Allocation limits are similar to timeouts, but - timeouts use real time, not CPU time. Allocation limits do not count anything while the thread is blocked or in foreign code. - timeouts don't re-trigger if the thread catches the exception, allocation limits do. - timeouts can catch non-allocating loops, if you use -fno-omit-yields. This doesn't work for allocation limits. I couldn't measure any impact on benchmarks with these changes, even for nofib/smp.
* Add SmallArray# and SmallMutableArray# typesJohan Tibell2014-03-291-1/+4
| | | | | | | | | | | | | | | These array types are smaller than Array# and MutableArray# and are faster when the array size is small, as they don't have the overhead of a card table. Having no card table reduces the closure size with 2 words in the typical small array case and leads to less work when updating or GC:ing the array. Reduces both the runtime and memory allocation by 8.8% on my insert benchmark for the HashMap type in the unordered-containers package, which makes use of lots of small arrays. With tuned GC settings (i.e. `+RTS -A6M`) the runtime reduction is 15%. Fixes #8923.
* Explicit import lists for StgCmmProf.Edward Z. Yang2013-09-011-1/+1
| | | | Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
* Fix a bug in stack layout with safe foreign calls (#8083)Simon Marlow2013-07-241-1/+2
| | | | | | | We weren't properly tracking the number of stack arguments in the continuation of a foreign call. It happened to work when the continuation was not a join point, but when it was a join point we were using the wrong amount of stack fixup.
* Satisfy the invariant on CmmUnsafeForeignCall argumentsSimon Marlow2013-03-061-30/+23
| | | | | | | There was potentially a bug here, but no actual failures were identified in the wild. See Note [Register Parameter Passing]
* loadThreadState should set HpAlloc=0Simon Marlow2012-11-051-1/+7
|
* Attach global register liveness info to Cmm procedures.Geoffrey Mainland2012-10-301-1/+1
| | | | | | | All Cmm procedures now include the set of global registers that are live on procedure entry, i.e., the global registers used to pass arguments to the procedure. Only global registers that are use to pass arguments are included in this list.
* Some alpha renamingIan Lynagh2012-10-161-2/+2
| | | | | Mostly d -> g (matching DynFlag -> GeneralFlag). Also renamed if* to when*, matching the Haskell if/when names
* Produce new-style Cmm from the Cmm parserSimon Marlow2012-10-081-32/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The main change here is that the Cmm parser now allows high-level cmm code with argument-passing and function calls. For example: foo ( gcptr a, bits32 b ) { if (b > 0) { // we can make tail calls passing arguments: jump stg_ap_0_fast(a); } return (x,y); } More details on the new cmm syntax are in Note [Syntax of .cmm files] in CmmParse.y. The old syntax is still more-or-less supported for those occasional code fragments that really need to explicitly manipulate the stack. However there are a couple of differences: it is now obligatory to give a list of live GlobalRegs on every jump, e.g. jump %ENTRY_CODE(Sp(0)) [R1]; Again, more details in Note [Syntax of .cmm files]. I have rewritten most of the .cmm files in the RTS into the new syntax, except for AutoApply.cmm which is generated by the genapply program: this file could be generated in the new syntax instead and would probably be better off for it, but I ran out of enthusiasm. Some other changes in this batch: - The PrimOp calling convention is gone, primops now use the ordinary NativeNodeCall convention. This means that primops and "foreign import prim" code must be written in high-level cmm, but they can now take more than 10 arguments. - CmmSink now does constant-folding (should fix #7219) - .cmm files now go through the cmmPipeline, and as a result we generate better code in many cases. All the object files generated for the RTS .cmm files are now smaller. Performance should be better too, but I haven't measured it yet. - RET_DYN frames are removed from the RTS, lots of code goes away - we now have some more canned GC points to cover unboxed-tuples with 2-4 pointers, which will reduce code size a little.
* Move wORD_SIZE into platformConstantsIan Lynagh2012-09-161-3/+2
|
* Move some more constants into platformConstantsIan Lynagh2012-09-141-1/+1
|
* Move more constants to platformConstantsIan Lynagh2012-09-141-1/+1
|
* Use oFFSET_* from platformConstants rather than ConstantsIan Lynagh2012-09-131-7/+7
|
* Pass DynFlags down to wordWidthIan Lynagh2012-09-121-4/+4
|
* Pass DynFlags down to gcWordIan Lynagh2012-09-121-3/+3
|
* Pass DynFlags down to bWordIan Lynagh2012-09-121-41/+43
| | | | | | I've switched to passing DynFlags rather than Platform, as (a) it's simpler to not have to extract targetPlatform in so many places, and (b) it may be useful to have DynFlags around in future.
* Cleanup: add mkIntExpr and zeroExpr utilsSimon Marlow2012-08-311-1/+1
|
* Define callerSaves for all platformsIan Lynagh2012-08-071-1/+2
| | | | | | | | This means that we now generate the same code whatever platform we are on, which should help avoid changes on one platform breaking the build on another. It's also another step towards full cross-compilation.
* Add "Unregisterised" as a field in the settings fileIan Lynagh2012-08-071-1/+2
| | | | | | To explicitly choose whether you want an unregisterised build you now need to use the "--enable-unregisterised"/"--disable-unregisterised" configure flags.
* Use "ReturnedTo" when generating safe foreign callsSimon Marlow2012-08-061-10/+23
|
* Explicitly share some return continuationsSimon Marlow2012-08-021-1/+2
| | | | | | | Instead of relying on common-block-elimination to share return continuations in the common case (case-alternative heap checks) we do it explicitly. This isn't hard to do, is more robust, and saves some compilation time. Full commentary in Note [sharing continuations].
* Make -fscc-profiling a dynamic flagIan Lynagh2012-07-241-29/+33
| | | | All the flags that 'ways' imply are now dynamic
* Merge remote-tracking branch 'origin/master' into newcgSimon Marlow2012-07-041-2/+5
|\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * origin/master: (756 commits) don't crash if argv[0] == NULL (#7037) -package P was loading all versions of P in GHCi (#7030) Add a Note, copying text from #2437 improve the --help docs a bit (#7008) Copy Data.HashTable's hashString into our Util module Build fix Build fixes Parse error: suggest brackets and indentation. Don't build the ghc DLL on Windows; works around trac #5987 On Windows, detect if DLLs have too many symbols; trac #5987 Add some more Integer rules; fixes #6111 Fix PA dfun construction with silent superclass args Add silent superclass parameters to the vectoriser Add silent superclass parameters (again) Mention Generic1 in the user's guide Make the GHC API a little more powerful. tweak llvm version warning message New version of the patch for #5461. Fix Word64ToInteger conversion rule. Implemented feature request on reconfigurable pretty-printing in GHCi (#5461) ... Conflicts: compiler/basicTypes/UniqSupply.lhs compiler/cmm/CmmBuildInfoTables.hs compiler/cmm/CmmLint.hs compiler/cmm/CmmOpt.hs compiler/cmm/CmmPipeline.hs compiler/cmm/CmmStackLayout.hs compiler/cmm/MkGraph.hs compiler/cmm/OldPprCmm.hs compiler/codeGen/CodeGen.lhs compiler/codeGen/StgCmm.hs compiler/codeGen/StgCmmBind.hs compiler/codeGen/StgCmmLayout.hs compiler/codeGen/StgCmmUtils.hs compiler/main/CodeOutput.lhs compiler/main/HscMain.hs compiler/nativeGen/AsmCodeGen.lhs compiler/simplStg/SimplStg.lhs
| * Support code generation for unboxed-tuple function argumentsunboxed-tuple-arguments2Max Bolingbroke2012-05-151-1/+2
| | | | | | | | | | | | | | | | | | | | | | This is done by a 'unarisation' pre-pass at the STG level which translates away all (live) binders binding something of unboxed tuple type. This has the following knock-on effects: * The subkind hierarchy is vastly simplified (no UbxTupleKind or ArgKind) * Various relaxed type checks in typechecker, 'foreign import prim' etc * All case binders may be live at the Core level
| * Implement "value" imports with the CAPIIan Lynagh2012-02-261-1/+3
| | | | | | | | | | | | This allows us to import values (i.e. non-functions) with the CAPI. This means we can access values even if (on some or all platforms) they are simple #defines.
* | Lower safe foreign calls in the new CmmLayoutStackSimon Marlow2012-03-061-9/+102
| | | | | | | | | | | | | | | | We also generate much better code for safe foreign calls (and maybe also unsafe foreign calls) than previously. See the two new Notes: Note [lower safe foreign calls] Note [safe foreign call convention]
* | Merge remote-tracking branch 'laptop/newcg' into newcgMe at work2012-02-141-1/+1
|\ \
| * \ Merge remote-tracking branch 'origin/master' into newcgSimon Marlow2012-02-131-1/+1
| |\ \ | | |/ | | | | | | | | | | | | | | | | | | Conflicts: compiler/cmm/CmmLint.hs compiler/cmm/OldCmm.hs compiler/codeGen/CgMonad.lhs compiler/main/CodeOutput.lhs
| | * Rename the CCCS field of StgTSO so as not to conflict with the CCCS ↵Simon Marlow2012-01-051-1/+1
| | | | | | | | | | | | | | | | | | pseudo-register Needed by #5357
* | | Fix an SRT-related bugSimon Marlow2012-02-141-10/+4
|/ / | | | | | | | | | | | | | | We were using the SRT information generated by the computeSRTs pass to decide whether to add a static link field to a constructor or not, and this broke when I disabled computeSRTs for the new code generator. So I've hacked it for now to only rely on the SRT information generated by CoreToStg.
* | New stack layout algorithmSimon Marlow2012-02-081-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | Also: - improvements to code generation: push slow-call continuations on the stack instead of generating explicit continuations - remove unused CmmInfo wrapper type (replace with CmmInfoTable) - squash Area and AreaId together, remove now-unused RegSlot - comment out old unused stack-allocation code that no longer compiles after removal of RegSlot
* | Different implementation of MkGraphSimon Marlow2012-01-251-4/+5
|/
* Make profiling work with multiple capabilities (+RTS -N)Simon Marlow2011-11-291-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | This means that both time and heap profiling work for parallel programs. Main internal changes: - CCCS is no longer a global variable; it is now another pseudo-register in the StgRegTable struct. Thus every Capability has its own CCCS. - There is a new built-in CCS called "IDLE", which records ticks for Capabilities in the idle state. If you profile a single-threaded program with +RTS -N2, you'll see about 50% of time in "IDLE". - There is appropriate locking in rts/Profiling.c to protect the shared cost-centre-stack data structures. This patch does enough to get it working, I have cut one big corner: the cost-centre-stack data structure is still shared amongst all Capabilities, which means that multiple Capabilities will race when updating the "allocations" and "entries" fields of a CCS. Not only does this give unpredictable results, but it runs very slowly due to cache line bouncing. It is strongly recommended that you use -fno-prof-count-entries to disable the "entries" count when profiling parallel programs. (I shall add a note to this effect to the docs).
* Whitespace only in codeGen/StgCmmForeign.hsIan Lynagh2011-11-261-99/+92
|
* Use -fwarn-tabs when validatingIan Lynagh2011-11-041-0/+7
| | | | | We only use it for "compiler" sources, i.e. not for libraries. Many modules have a -fno-warn-tabs kludge for now.
* Snapshot of codegen refactoring to share with simonpjSimon Marlow2011-08-251-2/+1
|
* Remove type synonyms for CmmFormals, CmmActuals (and hinted versions).Edward Z. Yang2011-06-131-8/+8
| | | | Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
* Merge in new code generator branch.Simon Marlow2011-01-241-32/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | This changes the new code generator to make use of the Hoopl package for dataflow analysis. Hoopl is a new boot package, and is maintained in a separate upstream git repository (as usual, GHC has its own lagging darcs mirror in http://darcs.haskell.org/packages/hoopl). During this merge I squashed recent history into one patch. I tried to rebase, but the history had some internal conflicts of its own which made rebase extremely confusing, so I gave up. The history I squashed was: - Update new codegen to work with latest Hoopl - Add some notes on new code gen to cmm-notes - Enable Hoopl lag package. - Add SPJ note to cmm-notes - Improve GC calls on new code generator. Work in this branch was done by: - Milan Straka <fox@ucw.cz> - John Dias <dias@cs.tufts.edu> - David Terei <davidterei@gmail.com> Edward Z. Yang <ezyang@mit.edu> merged in further changes from GHC HEAD and fixed a few bugs.
* Implement stack chunks and separate TSO/STACK objectsSimon Marlow2010-12-151-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch makes two changes to the way stacks are managed: 1. The stack is now stored in a separate object from the TSO. This means that it is easier to replace the stack object for a thread when the stack overflows or underflows; we don't have to leave behind the old TSO as an indirection any more. Consequently, we can remove ThreadRelocated and deRefTSO(), which were a pain. This is obviously the right thing, but the last time I tried to do it it made performance worse. This time I seem to have cracked it. 2. Stacks are now represented as a chain of chunks, rather than a single monolithic object. The big advantage here is that individual chunks are marked clean or dirty according to whether they contain pointers to the young generation, and the GC can avoid traversing clean stack chunks during a young-generation collection. This means that programs with deep stacks will see a big saving in GC overhead when using the default GC settings. A secondary advantage is that there is much less copying involved as the stack grows. Programs that quickly grow a deep stack will see big improvements. In some ways the implementation is simpler, as nothing special needs to be done to reclaim stack as the stack shrinks (the GC just recovers the dead stack chunks). On the other hand, we have to manage stack underflow between chunks, so there's a new stack frame (UNDERFLOW_FRAME), and we now have separate TSO and STACK objects. The total amount of code is probably about the same as before. There are new RTS flags: -ki<size> Sets the initial thread stack size (default 1k) Egs: -ki4k -ki2m -kc<size> Sets the stack chunk size (default 32k) -kb<size> Sets the stack chunk buffer size (default 1k) -ki was previously called just -k, and the old name is still accepted for backwards compatibility. These new options are documented.
* Interruptible FFI calls with pthread_kill and CancelSynchronousIO. v4Edward Z. Yang2010-09-191-1/+1
| | | | | | | | | | | | | | | | | | | | | | | This is patch that adds support for interruptible FFI calls in the form of a new foreign import keyword 'interruptible', which can be used instead of 'safe' or 'unsafe'. Interruptible FFI calls act like safe FFI calls, except that the worker thread they run on may be interrupted. Internally, it replaces BlockedOnCCall_NoUnblockEx with BlockedOnCCall_Interruptible, and changes the behavior of the RTS to not modify the TSO_ flags on the event of an FFI call from a thread that was interruptible. It also modifies the bytecode format for foreign call, adding an extra Word16 to indicate interruptibility. The semantics of interruption vary from platform to platform, but the intent is that any blocking system calls are aborted with an error code. This is most useful for making function calls to system library functions that support interrupting. There is no support for pre-Vista Windows. There is a partner testsuite patch which adds several tests for this functionality.
* Refactor PackageTarget back into StaticTargetBen.Lippmeier@anu.edu.au2010-01-041-5/+11
|
* Tag ForeignCalls with the package they correspond toBen.Lippmeier@anu.edu.au2010-01-021-1/+1
|
* Keep Touch'd variables live through the back enddias@cs.tufts.edu2009-09-181-2/+2
| | | | | | | When we used derived pointers into the middle of an object, we need to keep the pointer to the start of the object live. We use a "fat machine instruction" with the primitive MO_Touch to propagate this information through the back end.
* Remove old 'foreign import dotnet' codeSimon Marlow2009-07-271-3/+0
| | | | It still lives in darcs, if anyone wants to revive it sometime.
* Calling convention bug and cleanupdias@eecs.tufts.edu2009-03-171-1/+2
| | | | | - yet another wrong calling convention; this one was a special case for returning one value.
* Inconsistent type and arguments in safe foreign calls...dias@eecs.tufts.edu2009-03-161-15/+15
| | | | | - The function argument was stripped from the argument list but not from the type. Now they're both stripped.