path: root/compiler/codeGen
Commit message | Author | Age | Files | Lines
...
* More refactoring (CgRep) | Simon Peyton Jones | 2011-08-25 | 15 | -606/+164
    * Move CgRep (private to old codegen) from SMRep to ClosureInfo
    * Avoid using CgRep in new codegen
    * Move SMRep and Bitmap from codeGen/ to cmm/
* Snapshot of codegen refactoring to share with simonpj | Simon Marlow | 2011-08-25 | 26 | -1172/+559
* Add popCnt# primop | Johan Tibell | 2011-08-16 | 2 | -0/+31
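As a rough illustration of what a population-count primitive computes, the sketch below uses `popCount` from `Data.Bits` in base rather than calling `popCnt#` directly. Whether a given GHC version lowers `popCount` to this primop is an assumption here, not something the commit states, and `naivePopCount` is a hypothetical helper introduced only for comparison.

```haskell
import Data.Bits (finiteBitSize, popCount, testBit)
import Data.Word (Word8)

-- Reference implementation: count the set bits one position at a time.
-- Hypothetical helper for comparison; not GHC's implementation.
naivePopCount :: Word8 -> Int
naivePopCount w = length [i | i <- [0 .. finiteBitSize w - 1], testBit w i]

main :: IO ()
main = do
  -- All eight bits of 0xFF are set.
  print (popCount (0xFF :: Word8))
  -- The library function and the naive count agree on every Word8 value.
  print (all (\w -> popCount w == naivePopCount w) [minBound .. maxBound])
```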
* Tidy up handling of PredTys: remove dead code, move functions deconstructing them to TcType | Max Bolingbroke | 2011-08-03 | 2 | -0/+2
* Add Type.tyConAppTyCon_maybe and tyConAppArgs_maybe, and use them | Simon Peyton Jones | 2011-08-03 | 2 | -6/+6
    These turn out to be a useful special case of splitTyConApp_maybe.

    A refactoring only; no change in behaviour
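The "special case" relationship can be sketched over a toy type representation. The `Type` data type here is a deliberately tiny stand-in (GHC's real `Type` and `TyCon` are much richer), so only the shape of the three functions should be read as reflecting the commit.

```haskell
-- Toy model: a type is a variable or a constructor applied to arguments.
-- Illustrative only; this is not GHC's Type.
data Type
  = TyVar String
  | TyConApp String [Type]
  deriving (Eq, Show)

-- General form: return both the constructor and its arguments.
splitTyConApp_maybe :: Type -> Maybe (String, [Type])
splitTyConApp_maybe (TyConApp tc args) = Just (tc, args)
splitTyConApp_maybe _                  = Nothing

-- Special cases: keep only one component of the split.
tyConAppTyCon_maybe :: Type -> Maybe String
tyConAppTyCon_maybe = fmap fst . splitTyConApp_maybe

tyConAppArgs_maybe :: Type -> Maybe [Type]
tyConAppArgs_maybe = fmap snd . splitTyConApp_maybe

main :: IO ()
main = do
  let t = TyConApp "Maybe" [TyVar "a"]
  print (tyConAppTyCon_maybe t)
  print (tyConAppArgs_maybe t)
  print (tyConAppTyCon_maybe (TyVar "a"))
```

Callers that previously matched on `Just (tc, _)` or `Just (_, args)` can use the narrower function and drop the unused component.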
* Eliminate localiseLabel | Max Bolingbroke | 2011-07-28 | 2 | -10/+8
* Eliminate infoLblToEntryLbl | Max Bolingbroke | 2011-07-28 | 2 | -30/+46
* There is only one flavour of LFBlackHole: make that explicit | Max Bolingbroke | 2011-07-28 | 2 | -14/+12
* Put the info CLabel in CmmInfoTable rather than a localness flag, tidy up some info<->entry conversions | Max Bolingbroke | 2011-07-28 | 4 | -27/+24
* Repair sanity of infoTableLabelFromCI in old code generator | Max Bolingbroke | 2011-07-28 | 4 | -27/+25
* More work towards cross-compilation | Ian Lynagh | 2011-07-15 | 2 | -2/+2
    There's now a variant of the Outputable class that knows what
    platform we're targeting:

        class PlatformOutputable a where
            pprPlatform :: Platform -> a -> SDoc
            pprPlatformPrec :: Platform -> Rational -> a -> SDoc

    and various instances have had to be converted to use that class,
    and we pass Platform around accordingly.
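A self-contained sketch of how such a class might be used follows. `Platform` and `SDoc` are replaced by hypothetical stand-ins (an enumeration and `String`), the default methods are an assumption added so that an instance need define only one method, and `WordLit` is an invented example type.

```haskell
-- Hypothetical stand-ins for GHC's Platform and SDoc types.
data Platform = Linux32 | Linux64 deriving (Eq, Show)
type SDoc = String

-- The class from the commit message, with assumed default methods.
class PlatformOutputable a where
  pprPlatform     :: Platform -> a -> SDoc
  pprPlatformPrec :: Platform -> Rational -> a -> SDoc
  pprPlatform p       = pprPlatformPrec p 0
  pprPlatformPrec p _ = pprPlatform p

-- Invented example: a word literal whose rendering depends on the target.
newtype WordLit = WordLit Integer

instance PlatformOutputable WordLit where
  pprPlatform Linux32 (WordLit n) = show n ++ " :: W32"
  pprPlatform Linux64 (WordLit n) = show n ++ " :: W64"

main :: IO ()
main = do
  putStrLn (pprPlatform Linux32 (WordLit 42))
  putStrLn (pprPlatformPrec Linux64 0 (WordLit 42))
```

The point of threading `Platform` explicitly is that the same value can be printed for a target other than the host, which is what cross-compilation needs.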
* Fix the build | Ian Lynagh | 2011-07-08 | 1 | -4/+5
    The seq# case in the new codegen was being shadowed by a more general case.
* Port 'Add two new primops seq# and spark#' (be54417) to new codegen. | Edward Z. Yang | 2011-07-07 | 2 | -0/+32
    Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
* Don't export the _info symbol for the data constructor worker bindings | Max Bolingbroke | 2011-07-07 | 4 | -16/+37
    This is safe because GHC never generates a fast call to a data constructor
    worker: if the call is seen statically it will be eta-expanded and the
    allocation of the data will be inlined. We still need to export the _closure
    in case the constructor is used in an unapplied fashion.
* Refactoring: use a structured CmmStatics type rather than [CmmStatic] | Max Bolingbroke | 2011-07-05 | 8 | -27/+26
    I observed that the [CmmStatics] within CmmData uses the list in a very
    stylised way. The first item in the list is almost invariably a
    CmmDataLabel. Many parts of the compiler pattern match on this list and
    fail if this is not true.

    This patch makes the invariant explicit by introducing a structured type
    CmmStatics that holds the label and the list of remaining [CmmStatic].

    There is one wrinkle: the x86 backend sometimes wants to output an
    alignment directive just before the label. However, this can be easily
    fixed up by parameterising the native codegen over the type of CmmStatics
    (through the GenCmmTop parameterisation) and using a pair
    (Alignment, CmmStatics) there instead.

    As a result, I think we will be able to remove CmmAlign and CmmDataLabel
    from the CmmStatic data type, thus nuking a lot of code and failing
    pattern matches. This change will come as part of my next patch.
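The before and after shapes can be sketched with simplified types. Everything here is a toy: `CLabel` is a `String`, the `Old*` constructors and the `Statics` constructor name are assumptions chosen for illustration, and GHC's real definitions carry much more information.

```haskell
-- Simplified stand-ins for the real Cmm types.
type CLabel = String
data CmmStatic = CmmStaticLit Integer | CmmUninitialised Int
  deriving (Eq, Show)

-- Old shape (toy): the label is just another list element that is
-- conventionally, but not necessarily, at the head of the list.
data OldStatic = OldDataLabel CLabel | OldStatic CmmStatic
  deriving (Eq, Show)

-- Consumers had to pattern match and cope with the convention being broken.
oldLabel :: [OldStatic] -> Maybe CLabel
oldLabel (OldDataLabel l : _) = Just l
oldLabel _                    = Nothing

-- New shape (toy): the label is held explicitly, so lookup is total.
data CmmStatics = Statics CLabel [CmmStatic]
  deriving (Eq, Show)

newLabel :: CmmStatics -> CLabel
newLabel (Statics l _) = l

-- The x86 wrinkle: a backend that needs alignment pairs it with the statics.
type Alignment = Int
type X86Statics = (Alignment, CmmStatics)

main :: IO ()
main = do
  print (oldLabel [OldStatic (CmmStaticLit 1)])
  print (newLabel (Statics "foo_closure" [CmmStaticLit 1]))
  print ((8, Statics "bar_closure" [CmmUninitialised 16]) :: X86Statics)
```

Moving the invariant into the type removes the partial pattern matches: `newLabel` cannot fail, whereas `oldLabel` has to return a `Maybe`.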
* Fix Trac #5286: getPredTyDescription | Simon Peyton Jones | 2011-06-30 | 2 | -5/+4
* Add two new primops: | Simon Marlow | 2011-06-28 | 3 | -0/+46
        seq#   :: a -> State# s -> (# State# s, a #)
        spark# :: a -> State# s -> (# State# s, a #)

    seq# is a version of seq that can be used in a State#-passing
    context. We will use it to implement Control.Exception.evaluate and
    thus fix #5129. Also we have plans to use it to fix #5262.

    spark# is to seq# as par is to pseq. That is, it creates a spark in
    a State#-passing context. We will use spark# and seq# to implement
    rpar and rseq respectively in an improved implementation of the Eval
    monad.
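The user-visible effect of sequencing evaluation inside `IO` can be seen through `Control.Exception.evaluate`, which the commit says will be built on `seq#`. The sketch below uses only `evaluate` and `try` from base and does not call the primop itself; `tryError` and `raisesOnEvaluate` are hypothetical helpers.

```haskell
import Control.Exception (ErrorCall (..), evaluate, try)

-- `try` specialised to ErrorCall, so no type annotation is needed at use sites.
tryError :: IO a -> IO (Either ErrorCall a)
tryError = try

-- True if forcing the value to weak head normal form inside IO raised an ErrorCall.
raisesOnEvaluate :: a -> IO Bool
raisesOnEvaluate x = do
  r <- tryError (evaluate x)
  return (either (const True) (const False) r)

main :: IO ()
main = do
  -- `return` does not force its argument, so nothing is raised here.
  _ <- return (error "boom" :: Int)
  putStrLn "return: thunk left unevaluated"
  -- `evaluate` forces the thunk at this point in the IO sequence.
  r <- tryError (evaluate (error "boom" :: Int))
  case r of
    Left (ErrorCall msg) -> putStrLn ("evaluate: caught " ++ msg)
    Right n              -> print n
  raisesOnEvaluate (1 + 1 :: Int) >>= print
```

The difference from plain `seq` is that the forcing is tied to a specific position in the state-passing sequence, so the exception surfaces inside the `try` rather than wherever the result happens to be demanded.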
* codeGen: Make emitCopyByteArray less pessimistic | Johan Tibell | 2011-06-17 | 2 | -19/+2
    Assigning the arguments to temporaries was only needed in the case of
    emitCopyArray, where the arguments are alive across the call. That is
    not the case in emitCopyByteArray.

    Signed-off-by: David Terei <davidterei@gmail.com>
* Port "Add byte array copy primops" to the new code gen | Johan Tibell | 2011-06-16 | 1 | -0/+57
    Signed-off-by: David Terei <davidterei@gmail.com>
* Add byte array copy primops | Johan Tibell | 2011-06-16 | 1 | -0/+59
    Signed-off-by: David Terei <davidterei@gmail.com>
* Port "6c7d2a9 Use the new memcpy/memmove/memset MachOps" to new codegen. | Edward Z. Yang | 2011-06-15 | 1 | -37/+23
    Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
* Use the new memcpy/memmove/memset MachOps | Johan Tibell | 2011-06-14 | 1 | -24/+25
    Signed-off-by: David Terei <davidterei@gmail.com>
* Remove type synonyms for CmmFormals, CmmActuals (and hinted versions). | Edward Z. Yang | 2011-06-13 | 6 | -18/+18
    Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
* Port "Make array copy primops inline" and related patches to new codegen. | Edward Z. Yang | 2011-06-13 | 5 | -4/+234
    The following patches were ported:

        d0faaa6 Fix segfault in array copy primops on 32-bit
        18691d4 Make assignTemp_ less pessimistic
        9c23f06 Make array copy primops inline

    Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
* Fix segfault in array copy primops on 32-bit | Johan Tibell | 2011-06-07 | 1 | -4/+4
    The second argument to C's memset was passed as a W8 while memset
    expects an int.

    Signed-off-by: David Terei <davidterei@gmail.com>
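The same width requirement shows up when `memset` is called through the Haskell FFI, which makes for an easy analogy: the fill value is declared as `CInt` even though only one byte of it is used. This is an illustration at the source level, not the Cmm-level call the commit changed, and `fillWith` is a hypothetical helper.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
import Data.Word (Word8)
import Foreign.C.Types (CInt (..), CSize (..))
import Foreign.Marshal.Alloc (allocaBytes)
import Foreign.Marshal.Array (peekArray)
import Foreign.Ptr (Ptr)

-- C prototype: void *memset(void *s, int c, size_t n);
-- The fill value is a CInt to match the C `int` parameter.
foreign import ccall unsafe "string.h memset"
  c_memset :: Ptr Word8 -> CInt -> CSize -> IO (Ptr Word8)

-- Allocate n bytes, fill them with b via memset, and read them back.
fillWith :: Int -> Word8 -> IO [Word8]
fillWith n b = allocaBytes n $ \p -> do
  _ <- c_memset p (fromIntegral b) (fromIntegral n)  -- widen the byte to an int
  peekArray n p

main :: IO ()
main = fillWith 4 0xAB >>= print
```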
* Make assignTemp_ less pessimistic | Johan Tibell | 2011-05-30 | 1 | -6/+10
    assignTemp_ is intended to make sure that the expression gets assigned
    to a temporary in case that's needed in order to avoid a register
    getting trashed due to a function call.
* Make array copy primops inline | Johan Tibell | 2011-05-19 | 2 | -3/+228
* Amend comment per Marlow's comments. | Edward Z. Yang | 2011-05-16 | 1 | -15/+16
    Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
* Work around lack of saving volatile registers from unsafe foreign calls. | Edward Z. Yang | 2011-05-15 | 1 | -0/+61
    Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
* For BC labels, emit empty data section instead of empty proc. | Edward Z. Yang | 2011-04-14 | 2 | -2/+3
    This fixes two bugs:

    - The new code generator doesn't like procedures with empty graphs,
      and panicked in labelAGraph.
    - LLVM optimizes away empty procedures but not empty data sections,
      so now the backwards-compatibility labels actually work with -fllvm.

    Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
* Change the way module initialisation is done (#3252, #4417) | Simon Marlow | 2011-04-12 | 6 | -432/+54
    Previously the code generator generated small code fragments labelled
    with __stginit_M for each module M, and these performed whatever
    initialisation was necessary for that module and recursively invoked
    the initialisation functions for imported modules. This approach had
    drawbacks:

    - FFI users had to call hs_add_root() to ensure the correct
      initialisation routines were called. This is a non-standard,
      and ugly, API.
    - unless we were using -split-objs, the __stginit dependencies would
      entail linking the whole transitive closure of modules imported,
      whether they were actually used or not. In an extreme case (#4387,
      #4417), a module from GHC might be imported for use in Template
      Haskell or an annotation, and that would force the whole of GHC to
      be needlessly linked into the final executable.

    So now instead we do our initialisation with C functions marked with
    __attribute__((constructor)), which are automatically invoked at
    program startup time (or DSO load-time). The C initialisers are
    emitted into the stub.c file. This means that every time we compile
    with -prof or -hpc, we now get a stub file, but thanks to #3687 that
    is now invisible to the user.

    There are some refactorings in the RTS (particularly for HPC) to
    handle the fact that initialisers now get run earlier than they did
    before.

    The __stginit symbols are still generated, and the hs_add_root()
    function still exists (but does nothing), for backwards compatibility.
* Remove debugging CmmComment from old code generator. | Edward Z. Yang | 2011-04-11 | 1 | -1/+0
    Warning: This change seems to tickle a bug in ghc-stage1 compiler
    built with GHC 6.12.1 during validates.

    Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
* Minor documentation improvement about pointer tagging. | Edward Z. Yang | 2011-04-04 | 1 | -3/+5
    Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
* Immediately tag initialization code to prevent untagged spills. | Edward Z. Yang | 2011-03-23 | 3 | -6/+14
    When allocating new objects on the heap, we previously returned a
    CmmExpr containing the heap pointer as well as the tag expression,
    which would be added to the code graph upon first usage. Unfortunately,
    this meant that untagged heap pointers living in registers might be
    spilled to the stack, where they interacted poorly with garbage
    collection (we saw this bug specifically with the compacting garbage
    collector.)

    This fix immediately tags the register containing the heap pointer,
    so that unless we have extremely unfriendly spill code, the new pointer
    will never be spilled to the stack untagged.

    An alternate solution might have been to modify allocDynClosure to
    tag the pointer upon the initial register allocation, but not all
    invocations of allocDynClosure tag the resulting pointer, and
    threading the consequent CgIdInfo for the cases that did would have
    been annoying.
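For readers unfamiliar with pointer tagging, the arithmetic involved can be sketched on plain machine words. This is a toy model that assumes 64-bit words with 8-byte-aligned addresses (so the low three bits are free); it shows the tagging operations only and does not model the register allocation or spilling behaviour the commit is about. The function names are hypothetical.

```haskell
import Data.Bits (complement, (.&.), (.|.))
import Data.Word (Word64)

-- Assumption for this sketch: the low 3 bits of an aligned address are unused.
tagMask :: Word64
tagMask = 7

-- Combine an aligned address with a small tag.
tagPtr :: Word64 -> Word64 -> Word64
tagPtr addr tag = addr .|. (tag .&. tagMask)

-- Recover the address by clearing the tag bits.
untagPtr :: Word64 -> Word64
untagPtr p = p .&. complement tagMask

-- Read the tag back.
getTag :: Word64 -> Word64
getTag p = p .&. tagMask

main :: IO ()
main = do
  let p = tagPtr 0x1000 2
  print p             -- 4098
  print (untagPtr p)  -- 4096
  print (getTag p)    -- 2
```

Tagging at the moment the pointer is created, as the fix does, means any later copy of the value (including a spill) already carries the tag.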
* Fix Array sizeof primops to use the correct offset (which happens to be 0, so it worked before anyway). Makes us more future-proof, at least | Daniel Peebles | 2011-02-01 | 2 | -2/+2
* Add sizeof(Mutable)Array# primitives | Daniel Peebles | 2011-01-26 | 2 | -0/+10
* Merge in new code generator branch. | Simon Marlow | 2011-01-24 | 38 | -492/+541
    This changes the new code generator to make use of the Hoopl package
    for dataflow analysis. Hoopl is a new boot package, and is maintained
    in a separate upstream git repository (as usual, GHC has its own
    lagging darcs mirror in http://darcs.haskell.org/packages/hoopl).

    During this merge I squashed recent history into one patch. I tried
    to rebase, but the history had some internal conflicts of its own
    which made rebase extremely confusing, so I gave up. The history I
    squashed was:

    - Update new codegen to work with latest Hoopl
    - Add some notes on new code gen to cmm-notes
    - Enable Hoopl lag package.
    - Add SPJ note to cmm-notes
    - Improve GC calls on new code generator.

    Work in this branch was done by:

    - Milan Straka <fox@ucw.cz>
    - John Dias <dias@cs.tufts.edu>
    - David Terei <davidterei@gmail.com>

    Edward Z. Yang <ezyang@mit.edu> merged in further changes from GHC HEAD
    and fixed a few bugs.
* Implement stack chunks and separate TSO/STACK objects | Simon Marlow | 2010-12-15 | 2 | -25/+25
    This patch makes two changes to the way stacks are managed:

    1. The stack is now stored in a separate object from the TSO.

       This means that it is easier to replace the stack object for a thread
       when the stack overflows or underflows; we don't have to leave behind
       the old TSO as an indirection any more. Consequently, we can remove
       ThreadRelocated and deRefTSO(), which were a pain.

       This is obviously the right thing, but the last time I tried to do it
       it made performance worse. This time I seem to have cracked it.

    2. Stacks are now represented as a chain of chunks, rather than
       a single monolithic object.

       The big advantage here is that individual chunks are marked clean or
       dirty according to whether they contain pointers to the young
       generation, and the GC can avoid traversing clean stack chunks during
       a young-generation collection. This means that programs with deep
       stacks will see a big saving in GC overhead when using the default GC
       settings.

       A secondary advantage is that there is much less copying involved as
       the stack grows. Programs that quickly grow a deep stack will see big
       improvements.

       In some ways the implementation is simpler, as nothing special needs
       to be done to reclaim stack as the stack shrinks (the GC just recovers
       the dead stack chunks). On the other hand, we have to manage stack
       underflow between chunks, so there's a new stack frame
       (UNDERFLOW_FRAME), and we now have separate TSO and STACK objects.
       The total amount of code is probably about the same as before.

    There are new RTS flags:

        -ki<size>  Sets the initial thread stack size (default 1k)
                   Egs: -ki4k -ki2m
        -kc<size>  Sets the stack chunk size (default 32k)
        -kb<size>  Sets the stack chunk buffer size (default 1k)

    -ki was previously called just -k, and the old name is still accepted
    for backwards compatibility. These new options are documented.
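The clean/dirty idea in point 2 can be modelled in a few lines. This is a toy model only: frames are `Int`s, `chunkSize` is an arbitrary number unrelated to `-kc`, a chunk is treated as dirty whenever it has been written since the last collection, and none of the names correspond to RTS data structures.

```haskell
-- A stack chunk: a dirty flag plus the frames it holds (newest first).
data Chunk = Chunk { chunkDirty :: Bool, chunkFrames :: [Int] }
  deriving (Eq, Show)

-- Frames per chunk; arbitrary for this sketch.
chunkSize :: Int
chunkSize = 4

-- Push onto the top chunk if it has room, otherwise start a new chunk.
-- Older chunks are never copied or modified as the stack grows.
pushFrame :: Int -> [Chunk] -> [Chunk]
pushFrame f (c : cs)
  | length (chunkFrames c) < chunkSize =
      c { chunkDirty = True, chunkFrames = f : chunkFrames c } : cs
pushFrame f cs = Chunk True [f] : cs

-- After a collection every surviving chunk is considered clean.
markClean :: [Chunk] -> [Chunk]
markClean = map (\c -> c { chunkDirty = False })

-- A young-generation collection only needs to traverse dirty chunks.
framesToScan :: [Chunk] -> Int
framesToScan = sum . map (length . chunkFrames) . filter chunkDirty

main :: IO ()
main = do
  let deep  = foldl (flip pushFrame) [] [1 .. 10]  -- chunks of 2, 4 and 4 frames
      after = pushFrame 11 (markClean deep)        -- only the top chunk is touched
  print (sum (map (length . chunkFrames) after))   -- total frames on the stack
  print (framesToScan after)                       -- frames the minor GC must visit
```

With a monolithic stack the collector would have to walk all 11 frames; in this model it walks only the 3 frames in the chunk that was written since the last collection.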
* fix ticket number (#4505) | Simon Marlow | 2010-12-09 | 1 | -1/+1
* Catch too-large allocations and emit an error message (#4505) | Simon Marlow | 2010-12-09 | 1 | -0/+10
    This is a temporary measure until we fix the bug properly (which is
    somewhat tricky, and we think might be easier in the new code
    generator).

    For now we get:

        ghc-stage2: sorry! (unimplemented feature or known bug)
          (GHC version 7.1 for i386-unknown-linux):
            Trying to allocate more than 1040384 bytes.

        See: http://hackage.haskell.org/trac/ghc/ticket/4550

        Suggestion: read data from a file instead of having large static data
        structures in the code.
* make a panic message more informative and suggest -dcore-lint (see #4534) | Simon Marlow | 2010-12-01 | 1 | -4/+4
* Remove unnecessary fromIntegral calls | simonpj@microsoft.com | 2010-11-16 | 4 | -4/+4
* Remove unnecessary imports | Ian Lynagh | 2010-10-26 | 3 | -4/+0
* Follow GHC.Bool/GHC.Types merge | Ian Lynagh | 2010-10-23 | 1 | -2/+2
* Fix some whitespace | Ian Lynagh | 2010-10-21 | 1 | -16/+16
* Use takeUniqFromSupply in emitProcWithConvention | Ian Lynagh | 2010-10-21 | 1 | -2/+3
    We were using the supply's unique, and then passing the same supply to
    initUs_, which sounds like a bug waiting to happen.
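Why reusing the supply is risky can be shown with a toy supply. GHC's real `UniqSupply` is a lazily split tree; here it is modelled as a simple counter, and `drawUniques` is a hypothetical stand-in for whatever later consumer (such as `initUs_`) draws from the supply.

```haskell
-- Toy unique supply: just the next unused number.
newtype UniqSupply = UniqSupply Int

-- Read the supply's own unique without consuming it.
uniqFromSupply :: UniqSupply -> Int
uniqFromSupply (UniqSupply n) = n

-- Take a unique and return the remaining supply.
takeUniqFromSupply :: UniqSupply -> (Int, UniqSupply)
takeUniqFromSupply (UniqSupply n) = (n, UniqSupply (n + 1))

-- Hypothetical downstream consumer that draws k uniques.
drawUniques :: Int -> UniqSupply -> [Int]
drawUniques k = take k . go
  where
    go s = let (u, s') = takeUniqFromSupply s in u : go s'

main :: IO ()
main = do
  let us = UniqSupply 0
      -- Risky: read a unique, then hand the *same* supply on.
      u1 = uniqFromSupply us
      -- Safe: take a unique and hand on only the remainder.
      (u2, us') = takeUniqFromSupply us
  print (u1 `elem` drawUniques 3 us)   -- the consumer can hand out u1 again
  print (u2 `elem` drawUniques 3 us')  -- u2 is no longer available downstream
```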
* Interruptible FFI calls with pthread_kill and CancelSynchronousIO. v4 | Edward Z. Yang | 2010-09-19 | 2 | -2/+3
    This is a patch that adds support for interruptible FFI calls in the form
    of a new foreign import keyword 'interruptible', which can be used
    instead of 'safe' or 'unsafe'. Interruptible FFI calls act like safe
    FFI calls, except that the worker thread they run on may be interrupted.

    Internally, it replaces BlockedOnCCall_NoUnblockEx with
    BlockedOnCCall_Interruptible, and changes the behavior of the RTS
    to not modify the TSO_ flags on the event of an FFI call from
    a thread that was interruptible. It also modifies the bytecode format
    for foreign call, adding an extra Word16 to indicate interruptibility.

    The semantics of interruption vary from platform to platform, but the
    intent is that any blocking system calls are aborted with an error
    code. This is most useful for making function calls to system library
    functions that support interrupting. There is no support for pre-Vista
    Windows.

    There is a partner testsuite patch which adds several tests for this
    functionality.
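A minimal sketch of the import syntax follows, assuming a POSIX system where `sleep` from `unistd.h` is available and a GHC that accepts the `InterruptibleFFI` extension. It only demonstrates the declaration and a trivial call; observing an actual interruption needs the threaded RTS and an asynchronous exception from another thread, which is not shown here.

```haskell
{-# LANGUAGE ForeignFunctionInterface, InterruptibleFFI #-}
import Foreign.C.Types (CUInt (..))

-- POSIX prototype: unsigned int sleep(unsigned int seconds);
-- 'interruptible' is written where 'safe' or 'unsafe' would otherwise go.
foreign import ccall interruptible "unistd.h sleep"
  c_sleep :: CUInt -> IO CUInt

main :: IO ()
main = do
  -- sleep returns the number of seconds left unslept; a zero-second
  -- sleep returns immediately, so this just exercises the call path.
  remaining <- c_sleep 0
  print remaining
```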
* LLVM: Stop llvm saving stg caller-save regs across C calls | David Terei | 2010-07-05 | 1 | -1/+1
    This is already handled by the Cmm code generator so LLVM is simply
    duplicating work. LLVM also doesn't know which ones are actually live
    so saves them all which causes a fair performance overhead for C calls
    on x64. We stop llvm saving them across the call by storing undef to
    them just before the call.
* FIX #38000 Store StgArrWords payload size in bytes | Antoine Latter | 2010-01-01 | 2 | -12/+6
* Add new LLVM code generator to GHC. (Version 2) | David Terei | 2010-06-15 | 2 | -28/+181
    This was done as part of an honours thesis at UNSW, the paper describing
    the work and results can be found at:

        http://www.cse.unsw.edu.au/~pls/thesis/davidt-thesis.pdf

    A Homepage for the backend can be found at:

        http://hackage.haskell.org/trac/ghc/wiki/Commentary/Compiler/Backends/LLVM

    Quick summary of performance is that for the 'nofib' benchmark suite,
    runtimes are within 5% slower than the NCG and generally better than the
    C code generator. For some code though, such as the DPH projects
    benchmark, the LLVM code generator outperforms the NCG and C code
    generator by about a 25% reduction in run times.