path: root/compiler/nativeGen/AsmCodeGen.lhs
Columns: commit message, author, age, files, lines
...
* Add an ArchUnknown constructor to the arch typeIan Lynagh2011-05-311-0/+2
| Fixes build problems on platforms for which we did not have an Arch constructor.
* Remove most of the CPP from AsmCodeGenIan Lynagh2011-05-291-114/+170
| | | | | | | | In particular, the "#error" for platforms without a NCG is gone, which means the module should now build on all platforms again. I'm not sure if this is the nicest way to handle multiple platforms here, but it works for now.
* Fix build: Add missing import and remove unneeded #ifdef.Ben Lippmeier2011-05-151-3/+2
| | | | From Erik de Castro Lopo.
* Fix buildBen Lippmeier2011-05-121-1/+0
|
* Merge branch 'coloured-core' of https://github.com/nominolo/ghc into coloured-coreIan Lynagh2011-05-081-1/+1
|\
| * Start support for coloured SDoc output.Thomas Schilling2011-04-071-1/+1
| | | | | | | | | | | | | | The SDoc type now passes around an abstract SDocContext rather than just a PprStyle which required touching a few more files. This should also make it easier to integrate DynFlags passing, so that we can get rid of global variables.
* | Change more Config tests to Platform testsIan Lynagh2011-05-081-8/+7
| |
* | The fix for #4914 was wrong and broke other things (see #5149). WeSimon Marlow2011-05-041-16/+22
| | | | | | | | | | | | | | | | | | can't emit the ffrees before a conditional jump, because we don't want to ffree the stack registers if the jump isn't taken (d'oh). This commit fixes it properly, by moving the pass that inserts the ffrees to *before* we do the jump-shortcutting which introduces the conditional non-local jumps.
* | Implement dead basic block elimination.Edward Z. Yang2011-04-301-4/+3
| | | | | | | | Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
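The commit above gives no detail on how dead blocks are found. As a rough illustration only, dead basic block elimination can be sketched as a reachability walk from the entry block, keeping just the blocks that walk visits. The `Block` type and both function names below are hypothetical and are not taken from GHC.

```haskell
-- Toy control-flow graph types; every name here is hypothetical, not GHC's.
type Label = Int

data Block = Block
  { blockLabel :: Label    -- label of this block
  , blockSuccs :: [Label]  -- labels this block may jump to
  } deriving (Eq, Show)

-- Labels reachable from the given entry label, found by depth-first search.
reachable :: [Block] -> Label -> [Label]
reachable blocks entry = go [entry] []
  where
    succsOf l = concat [blockSuccs b | b <- blocks, blockLabel b == l]
    go [] seen = seen
    go (l : ls) seen
      | l `elem` seen = go ls seen
      | otherwise     = go (succsOf l ++ ls) (l : seen)

-- Keep only blocks reachable from the first block, which is assumed
-- to be the procedure's entry point. Original block order is preserved.
dropDeadBlocks :: [Block] -> [Block]
dropDeadBlocks [] = []
dropDeadBlocks blocks@(entry : _) =
  filter (\b -> blockLabel b `elem` live) blocks
  where
    live = reachable blocks (blockLabel entry)
```

With blocks 0 and 1 jumping to each other and block 2 jumping to 1, block 2 is never a jump target and is dropped.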
* | Remove dead Alpha native backend.Edward Z. Yang2011-04-301-7/+1
| | | | | | | | Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
* | Implement jump table fix-ups for linear register allocator.Edward Z. Yang2011-04-271-1/+18
|/ | | | | | | | | | | | | We achieve this by splitting up instruction selection for case switches into two parts: the actual code generation, and the generation of the accompanying jump table. With this scheme, the jump fixup code can modify the contents of the jump table stored within the JMP_TBL (or BCTL) instruction, before the actual data section is created. SPARC and PPC patches are untested; they might not work! Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
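The two-part scheme described above can be sketched roughly as follows: the jump instruction carries its own table of targets, a fix-up pass rewrites those targets, and only afterwards is the data section derived from the instruction. The constructor `JmpTbl`, the function names, and the `.long` directive strings are illustrative assumptions, not GHC's actual definitions.

```haskell
-- Hypothetical toy instruction set; GHC's real JMP_TBL carries more fields.
type Label = Int

data Instr
  = JmpTbl [Maybe Label]  -- indirect jump with its table of targets
  | Other String          -- any other instruction
  deriving (Eq, Show)

-- Step 1 (after register allocation): redirect the targets stored
-- inside the instruction, e.g. to point at inserted fix-up blocks.
fixupJumpTable :: (Label -> Label) -> Instr -> Instr
fixupJumpTable redirect (JmpTbl targets) =
  JmpTbl (map (fmap redirect) targets)
fixupJumpTable _ instr = instr

-- Step 2 (later): derive the data-section entries from the
-- (possibly rewritten) instruction. Directive syntax is made up.
generateJumpTable :: Instr -> Maybe [String]
generateJumpTable (JmpTbl targets) = Just (map entry targets)
  where
    entry (Just l) = ".long L" ++ show l
    entry Nothing  = ".long 0"
generateJumpTable _ = Nothing
```

Because the table is generated from the instruction rather than emitted alongside it, any rewrite of the targets is reflected in the data section automatically.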
* Merge in new code generator branch.Simon Marlow2011-01-241-19/+18
| This changes the new code generator to make use of the Hoopl package
| for dataflow analysis. Hoopl is a new boot package, and is maintained
| in a separate upstream git repository (as usual, GHC has its own
| lagging darcs mirror in http://darcs.haskell.org/packages/hoopl).
|
| During this merge I squashed recent history into one patch. I tried
| to rebase, but the history had some internal conflicts of its own
| which made rebase extremely confusing, so I gave up. The history I
| squashed was:
|
|   - Update new codegen to work with latest Hoopl
|   - Add some notes on new code gen to cmm-notes
|   - Enable Hoopl lag package.
|   - Add SPJ note to cmm-notes
|   - Improve GC calls on new code generator.
|
| Work in this branch was done by:
|
|   - Milan Straka <fox@ucw.cz>
|   - John Dias <dias@cs.tufts.edu>
|   - David Terei <davidterei@gmail.com>
|
| Edward Z. Yang <ezyang@mit.edu> merged in further changes from GHC HEAD
| and fixed a few bugs.
* Fix error compiling AsmCodeGen.lhs for PPC Mac (unused makeFar addr)naur@post11.tele.dk2010-12-191-2/+2
|
* Define cTargetArch and start to use it rather than ifdefsIan Lynagh2011-01-041-14/+9
| Using Haskell conditionals means the compiler sees all the code, so there should be less rot of code specific to uncommon arches. Code for other platforms should still be optimised away, although if we want to support targeting other arches then we'll need to compile it for-real anyway.
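A minimal sketch of the idea, assuming a small architecture enumeration and a constant playing the role of cTargetArch. The constructor list, `targetArch`, `ncgSupported` and `backendName` are all hypothetical names chosen for illustration.

```haskell
-- Hypothetical architecture enumeration; not GHC's actual definition.
data Arch = ArchX86 | ArchPPC | ArchSPARC | ArchUnknown
  deriving (Eq, Show)

-- Assumed stand-in for a build-time constant such as cTargetArch.
targetArch :: Arch
targetArch = ArchX86

-- Every branch is type-checked on every platform. With #ifdef, only
-- the branch for the build platform would ever be compiled.
ncgSupported :: Arch -> Bool
ncgSupported arch = case arch of
  ArchX86     -> True
  ArchPPC     -> True
  ArchSPARC   -> True
  ArchUnknown -> False

-- Selecting behaviour from the constant with an ordinary guard.
backendName :: String
backendName
  | ncgSupported targetArch = "native code generator"
  | otherwise               = "fallback backend"
```

Since `targetArch` is a constant, an optimising compiler can in principle discard the branches that are never taken, which matches the remark that code for other platforms should still be optimised away.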
* Fix error compiling AsmCodeGen.lhs for PPC Mac (rtsPackageId)naur@post11.tele.dk2010-12-191-0/+1
|
* Fix unused import warning on OS XIan Lynagh2010-10-221-0/+2
|
* Fix warnings in AsmCodeGenDavid Terei2010-10-071-25/+34
|
* NCG: Refactor representation of code with liveness infoBen.Lippmeier@anu.edu.au2009-09-171-1/+3
| * I've pushed the SPILL and RELOAD instrs down into the LiveInstr type
|   to make them easier to work with.
| * When the graph allocator does a spill cycle it now just re-annotates
|   the LiveCmmTops instead of converting them to NatCmmTops and back.
| * This saves working out the SCCs again, and avoids rewriting the SPILL
|   and RELOAD meta instructions into real machine instructions.
* Add new LLVM code generator to GHC. (Version 2)David Terei2010-06-151-83/+6
| This was done as part of an honours thesis at UNSW; the paper describing
| the work and results can be found at:
|
|   http://www.cse.unsw.edu.au/~pls/thesis/davidt-thesis.pdf
|
| A homepage for the backend can be found at:
|
|   http://hackage.haskell.org/trac/ghc/wiki/Commentary/Compiler/Backends/LLVM
|
| Quick summary of performance: for the 'nofib' benchmark suite, runtimes
| are slower than the NCG but within 5% of it, and generally better than
| the C code generator. For some code though, such as the DPH projects
| benchmark, the LLVM code generator outperforms the NCG and C code
| generator, with about a 25% reduction in run times.
* __stg_EAGER_BLACKHOLE_INFO -> __stg_EAGER_BLACKHOLE_info (#4106)Simon Marlow2010-06-021-1/+1
|
* Fix error compiling AsmCodeGen.lhs for PPC Mac (mkRtsCodeLabel)naur@post11.tele.dk2010-04-031-3/+3
| The error messages eliminated are:
|
| > compiler/nativeGen/AsmCodeGen.lhs:875:31:
| >     Not in scope: `mkRtsCodeLabel'
| > compiler/nativeGen/AsmCodeGen.lhs:879:31:
| >     Not in scope: `mkRtsCodeLabel'
| > compiler/nativeGen/AsmCodeGen.lhs:883:31:
| >     Not in scope: `mkRtsCodeLabel'
* Loop problems in native back ends, update to T3286 fixdias@cs.tufts.edu2009-11-051-4/+12
| | | | | | | | | | | The native back ends had difficulties with loops; in particular the code for branch-chain elimination could run in infinite loops or drop basic blocks. The old codeGen didn't expose these problems. Also, my fix for T3286 in the new codegen was getting applied to too many (some wrong) cases; a better pattern match fixed that.
* Fix a space leak in the native code gen (again)Simon Marlow2009-09-111-5/+5
|
* Remove GHC's haskell98 dependencyIan Lynagh2009-07-241-1/+0
|
* Split Reg into vreg/hreg and add register pairsBen.Lippmeier@anu.edu.au2009-05-181-8/+29
| * The old Reg type is now split into VirtualReg and RealReg.
| * For the graph coloring allocator, the type of the register graph
|   is now (Graph VirtualReg RegClass RealReg), which shows that it colors
|   in nodes representing virtual regs with colors representing real regs.
|   (as was intended)
| * RealReg contains two constructors, RealRegSingle and RealRegPair,
|   where RealRegPair is used to represent a SPARC double reg
|   constructed from two single precision FP regs.
| * On SPARC we can now allocate double regs into an arbitrary register
|   pair, instead of reserving some reg ranges to only hold float/double
|   values.
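The split described above might look roughly like this. `VirtualReg`, `RealReg`, `RealRegSingle` and `RealRegPair` are named in the commit message; the `VirtualRegI`/`VirtualRegD` constructors, the `Reg` wrapper, and the helper functions are assumptions made for the sketch.

```haskell
-- Virtual registers, before allocation. Constructor names are assumed.
data VirtualReg
  = VirtualRegI Int  -- integer virtual register
  | VirtualRegD Int  -- double-precision virtual register
  deriving (Eq, Show)

-- Real registers, as named in the commit message. Register numbers
-- are plain Ints here for simplicity.
data RealReg
  = RealRegSingle Int
  | RealRegPair Int Int  -- e.g. a double built from two single FP regs
  deriving (Eq, Show)

-- Assumed wrapper so instructions can mention either kind.
data Reg = RegVirtual VirtualReg | RegReal RealReg
  deriving (Eq, Show)

-- An allocation result: which real register colours each virtual one.
type Assignment = [(VirtualReg, RealReg)]

-- Replace a virtual register by its assigned real register, if any.
applyAssignment :: Assignment -> Reg -> Reg
applyAssignment asg reg@(RegVirtual v) = maybe reg RegReal (lookup v asg)
applyAssignment _ reg = reg

-- Hardware register numbers occupied by a real register.
hardwareRegs :: RealReg -> [Int]
hardwareRegs (RealRegSingle r)   = [r]
hardwareRegs (RealRegPair r1 r2) = [r1, r2]
```

Keeping the two kinds in separate types means a function such as `hardwareRegs` cannot be called on an unallocated virtual register by mistake.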
* SPARC NCG: Fix word size conversionsBen.Lippmeier@anu.edu.au2009-02-171-1/+0
|
* SPARC NCG: Reorganise Reg and RegInfoBen.Lippmeier@anu.edu.au2009-02-161-1/+1
|
* NCG: validate fixes for ppc-darwinBen.Lippmeier@anu.edu.au2009-02-151-6/+5
|
* NCG: Validate fixes for x86-linuxBen.Lippmeier@anu.edu.au2009-02-151-1/+1
|
* NCG: Split up the native code generator into arch specific modulesBen.Lippmeier@anu.edu.au2009-02-151-35/+104
| - nativeGen/Instruction defines a type class for a generic
|   instruction set. Each of the instruction sets we have,
|   X86, PPC and SPARC are instances of it.
| - The register allocators use this type class when they need
|   info about a certain register or instruction, such as
|   regUsage, mkSpillInstr, mkJumpInstr, patchRegs..
| - nativeGen/Platform defines some data types enumerating
|   the architectures and operating systems supported by the
|   native code generator.
| - DynFlags now keeps track of the current build platform, and
|   the PositionIndependentCode module uses this to decide what
|   to do instead of relying on #ifdefs.
| - It's not totally retargetable yet. Some info about the
|   build target is still hardwired, but I've tried to contain
|   most of it to a single module, TargetRegs.
| - Moved the SPILL and RELOAD instructions into LiveInstr.
| - Reg and RegClass now have their own modules, and are shared
|   across all architectures.
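To make the type-class design concrete, here is a much reduced sketch. The method names `regUsage` and `patchRegs` come from the commit message, but their signatures, the `RegUsage` record, and the `ToyInstr` instruction set are assumptions for illustration.

```haskell
import Data.List (nub)

type Reg = Int

-- Registers read and written by one instruction (assumed shape).
data RegUsage = RegUsage { regReads :: [Reg], regWrites :: [Reg] }
  deriving (Eq, Show)

-- A generic instruction set, in the spirit of nativeGen/Instruction.
class Instruction instr where
  regUsage  :: instr -> RegUsage
  patchRegs :: instr -> (Reg -> Reg) -> instr

-- A made-up instruction set standing in for X86, PPC or SPARC.
data ToyInstr
  = Mov Reg Reg      -- source, destination
  | Add Reg Reg Reg  -- two sources, destination
  | Jmp Int          -- jump to a label
  deriving (Eq, Show)

instance Instruction ToyInstr where
  regUsage (Mov s d)   = RegUsage [s] [d]
  regUsage (Add a b d) = RegUsage [a, b] [d]
  regUsage (Jmp _)     = RegUsage [] []

  patchRegs (Mov s d) f   = Mov (f s) (f d)
  patchRegs (Add a b d) f = Add (f a) (f b) (f d)
  patchRegs i@(Jmp _) _   = i

-- Allocator-style helper that works for any instruction set.
regsTouched :: Instruction instr => [instr] -> [Reg]
regsTouched = nub . concatMap touched
  where
    touched i = let RegUsage rs ws = regUsage i in rs ++ ws
```

A helper such as `regsTouched` is written once against the class and needs no per-architecture #ifdef.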
* NCG: Move RegLiveness -> RegAlloc.LivenessBen.Lippmeier@anu.edu.au2009-02-041-1/+1
|
* NCG: Rename MachRegs, MachInstrs -> Regs, Instrs to reflect arch specific namingBen.Lippmeier@anu.edu.au2009-02-041-2/+2
|
* NCG: Move the graph allocator into its own dirBen.Lippmeier@anu.edu.au2009-02-031-4/+4
|
* NCG: Split linear allocator into separate modules.Ben.Lippmeier@anu.edu.au2009-02-021-1/+3
|
* Optimise writing out the .s fileSimon Marlow2009-02-021-3/+8
| | | | | | | | | | | | | | | I noticed while working on the new IO library that GHC was writing out the .s file in lots of little chunks. It turns out that this is a result of using multiple printDocs to avoid space leaks in the NCG, where each printDoc is finishing up with an hFlush. What's worse, is that this makes poor use of the optimisation inside printDoc that uses its own buffering to avoid hitting the Handle all the time. So I hacked around this by making the buffering optimisation inside Pretty visible from the outside, for use in the NCG. The changes are quite small.
* Merging in the new codegen branchdias@eecs.harvard.edu2008-08-141-8/+7
| This merge does not turn on the new codegen (which only compiles
| a select few programs at this point), but it does introduce some
| changes to the old code generator.
|
| The high bits:
|
|   1. The Rep Swamp patch is finally here. The highlight is that the
|      representation of types at the machine level has changed.
|      Consequently, this patch contains updates across several back ends.
|   2. The new Stg -> Cmm path is here, although it appears to have a
|      fair number of bugs lurking.
|   3. Many improvements along the CmmCPSZ path, including:
|        o stack layout
|        o some code for infotables, half of which is right and half wrong
|        o proc-point splitting
* Add optional eager black-holing, with new flag -feager-blackholingSimon Marlow2008-11-181-0/+4
| | | | | | | | | | | | | | | Eager blackholing can improve parallel performance by reducing the chances that two threads perform the same computation. However, it has a cost: one extra memory write per thunk entry. To get the best results, any code which may be executed in parallel should be compiled with eager blackholing turned on. But since there's a cost for sequential code, we make it optional and turn it on for the parallel package only. It might be a good idea to compile applications (or modules) with parallel code in with -feager-blackholing. ToDo: document -feager-blackholing.
* Fix to i386_insert_ffrees (#2724, #1944)Simon Marlow2008-11-111-4/+1
| | | | | | | | | | The i386 native code generator has to arrange that the FPU stack is clear on exit from any function that uses the FPU. Unfortunately it was getting this wrong (and has been ever since this code was written, I think): it was looking for basic blocks that used the FPU and adding the code to clear the FPU stack on any non-local exit from the block. In fact it should be doing this on a whole-function basis, rather than individual basic blocks.
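The difference between the per-block and whole-function approaches can be sketched as follows. The instruction constructors and function names are invented for the example, and the real pass works on actual i386 instructions.

```haskell
-- Made-up instruction type; only the distinctions that matter here.
data Instr
  = FpuOp String         -- any instruction that touches the FPU stack
  | IntOp String         -- any other computation
  | LocalJump Int        -- jump to a block in the same function
  | NonLocalJump String  -- jump out of the function
  | FFreeAll             -- stands for the sequence that clears the FPU stack
  deriving (Eq, Show)

type Block = [Instr]

-- Does any block of the function use the FPU?
usesFpu :: [Block] -> Bool
usesFpu = any (any isFpu)
  where
    isFpu (FpuOp _) = True
    isFpu _         = False

-- Whole-function version: if the function uses the FPU anywhere, clear
-- the FPU stack before every non-local exit, in every block.
insertFfrees :: [Block] -> [Block]
insertFfrees blocks
  | usesFpu blocks = map (concatMap clearBeforeExit) blocks
  | otherwise      = blocks
  where
    clearBeforeExit i@(NonLocalJump _) = [FFreeAll, i]
    clearBeforeExit i                  = [i]
```

For a function whose first block uses the FPU and jumps locally to a second block that exits, a per-block analysis would leave the exit unguarded, while `insertFfrees` places `FFreeAll` before it.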
* Follow Digraph changes in AsmCodeGenMax Bolingbroke2008-07-311-1/+1
|
* replace Cmm 'hint' with 'kind'Norman Ramsey2008-05-031-2/+2
| | | | | | C-- no longer has 'hints'; to guide parameter passing, it has 'kinds'. Renamed type constructor, data constructor, and record fields accordingly
* Change the last few (F)SLIT's into (f)sLit'sIan Lynagh2008-04-221-2/+2
|
* Make some more modules use LazyUniqFM instead of UniqFMIan Lynagh2008-02-071-1/+1
| | | | | If these modules use UniqFM then we get a stack overflow when compiling modules that use fundeps. I haven't tracked down the actual cause.
* Make some more modules use LazyUniqFM instead of UniqFMIan Lynagh2008-02-071-1/+1
| | | | | If these modules use UniqFM then we get a stack overflow when compiling modules that use fundeps. I haven't tracked down the actual cause.
* change CmmActual, CmmFormal to use a data CmmHinted rather than tuple (#1405)Isaac Dupree2008-01-041-2/+2
| | | | | | | This allows the instance of UserOfLocalRegs to be within Haskell98, and IMHO makes the code a little cleaner generally. This is one small (though tedious) step towards making GHC's code more portable...
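The Haskell 98 point can be illustrated with a small sketch. An instance for a tuple with one concrete component, such as `(a, Hint)`, is outside Haskell 98, whereas an instance for a data type applied to a type variable is allowed. The `Hint` constructors, the `Expr` type and the `localRegsUsed` method are hypothetical simplifications and not GHC's real class.

```haskell
-- Hypothetical hint type.
data Hint = NoHint | PtrHint | SignedHint
  deriving (Eq, Show)

-- A value paired with its hint, as a data type rather than a tuple.
data CmmHinted a = CmmHinted a Hint
  deriving (Eq, Show)

-- Simplified stand-in for the UserOfLocalRegs class.
class UserOfLocalRegs a where
  localRegsUsed :: a -> [String]

-- Toy expression type: just a reference to a named local register.
newtype Expr = RegExpr String
  deriving (Eq, Show)

instance UserOfLocalRegs Expr where
  localRegsUsed (RegExpr r) = [r]

-- Haskell 98 instance head: a type constructor applied to a type variable.
instance UserOfLocalRegs a => UserOfLocalRegs (CmmHinted a) where
  localRegsUsed (CmmHinted x _) = localRegsUsed x

instance UserOfLocalRegs a => UserOfLocalRegs [a] where
  localRegsUsed = concatMap localRegsUsed
```

The hint is ignored when collecting registers, so the instance simply delegates to the wrapped value.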
* Count CmmTops processed so far in the native code generatorBen.Lippmeier@anu.edu.au2007-09-141-6/+13
| | | | To help with debugging / nicer -ddump-asm-regalloc-stages
* Add iterative coalescing to graph coloring allocatorBen.Lippmeier@anu.edu.au2007-09-071-18/+12
| Iterative coalescing interleaves conservative coalescing with the regular
| simplify/scan passes. This increases the chance that nodes will be
| coalesced as they will have a lower degree than at the beginning of
| simplify. The end result is that more register to register moves will be
| eliminated in the output code, though the iterative nature of the
| algorithm makes it slower compared to non-iterative coloring.
|
| Use
|   -fregs-iterative  for graph coloring allocation with iterative coalescing
|   -fregs-graph      for non-iterative coalescing.
|
| The plan is for iterative coalescing to be enabled with -O2 and have a
| quicker, non-iterative algorithm otherwise. The time/benefit tradeoff
| between iterative and not is still being tuned. Optimal graph coloring
| is NP-hard, after all.
* massive changes to add a 'zipper' representation of C--Norman Ramsey2007-09-061-3/+6
| Changes too numerous to comment on, but here is some old history that I saved:
|
| Wed Aug 15 11:07:13 BST 2007 Norman Ramsey <nr@eecs.harvard.edu>
|   * type synonyms made consistent with new Cmm types
|     M ./compiler/nativeGen/MachInstrs.hs -2 +2
|
| Mon Aug 20 19:22:14 BST 2007 Norman Ramsey <nr@eecs.harvard.edu>
|   * pushing return info beyond cmm into codegen
|     M ./compiler/codeGen/Bitmap.hs r3
|     M ./compiler/codeGen/CgBindery.lhs r3
|     M ./compiler/codeGen/CgCallConv.hs r3
|     M ./compiler/codeGen/CgCase.lhs r3
|     M ./compiler/codeGen/CgClosure.lhs r3
|     M ./compiler/codeGen/CgCon.lhs r3
|     M ./compiler/codeGen/CgExpr.lhs r3
|     M ./compiler/codeGen/CgForeignCall.hs -6 +7 r3
|     M ./compiler/codeGen/CgHeapery.lhs r3
|     M ./compiler/codeGen/CgHpc.hs +1 r3
|     M ./compiler/codeGen/CgInfoTbls.hs r3
|     M ./compiler/codeGen/CgLetNoEscape.lhs r3
|     M ./compiler/codeGen/CgMonad.lhs r3
|     M ./compiler/codeGen/CgParallel.hs r3
|     M ./compiler/codeGen/CgPrimOp.hs +3 r3
|     M ./compiler/codeGen/CgProf.hs r3
|     M ./compiler/codeGen/CgStackery.lhs r3
|     M ./compiler/codeGen/CgTailCall.lhs r3
|     M ./compiler/codeGen/CgTicky.hs r3
|     M ./compiler/codeGen/CgUtils.hs -1 +1 r3
|     M ./compiler/codeGen/ClosureInfo.lhs r3
|     M ./compiler/codeGen/CodeGen.lhs r3
|     M ./compiler/codeGen/SMRep.lhs r3
|     M ./compiler/nativeGen/AsmCodeGen.lhs -2 +2 r1
|     M ./compiler/nativeGen/MachCodeGen.hs -3 +3 r1
|     M ./compiler/nativeGen/MachInstrs.hs r1
|     M ./compiler/nativeGen/MachRegs.lhs r1
|     M ./compiler/nativeGen/NCGMonad.hs r1
|     M ./compiler/nativeGen/PositionIndependentCode.hs r1
|     M ./compiler/nativeGen/PprMach.hs r1
|     M ./compiler/nativeGen/RegAllocInfo.hs r1
|     M ./compiler/nativeGen/RegisterAlloc.hs r1
|
| Mon Aug 20 20:54:41 BST 2007 Norman Ramsey <nr@eecs.harvard.edu>
|   * put CmmReturnInfo into a CmmCall (and related types)
|     M ./compiler/cmm/Cmm.hs -2 +1 r3
|     M ./compiler/cmm/CmmBrokenBlock.hs -13 +12 r1
|     M ./compiler/cmm/CmmCPS.hs -3 +3
|     M ./compiler/cmm/CmmCPSGen.hs -8 +6 r1
|     M ./compiler/cmm/CmmLint.hs -1 +1
|     M ./compiler/cmm/CmmLive.hs -1 +1
|     M ./compiler/cmm/CmmOpt.hs -3 +3
|     M ./compiler/cmm/CmmParse.y -6 +6 r3
|     M ./compiler/cmm/PprC.hs -3 +3
|     M ./compiler/cmm/PprCmm.hs -7 +4 r2
|     M ./compiler/codeGen/CgForeignCall.hs -7 +6 r2
|     M ./compiler/codeGen/CgHpc.hs -1 r1
|     M ./compiler/codeGen/CgPrimOp.hs -3 r1
|     M ./compiler/codeGen/CgUtils.hs -1 +1 r1
|     M ./compiler/nativeGen/AsmCodeGen.lhs -2 +2
|     M ./compiler/nativeGen/MachCodeGen.hs -3 +3 r1
|
| Tue Aug 21 18:09:13 BST 2007 Norman Ramsey <nr@eecs.harvard.edu>
|   * add call info in nativeGen
|     M ./compiler/nativeGen/AsmCodeGen.lhs r1
|     M ./compiler/nativeGen/MachInstrs.hs r1
|     M ./compiler/nativeGen/MachRegs.lhs r1
|     M ./compiler/nativeGen/NCGMonad.hs r1
|     M ./compiler/nativeGen/PositionIndependentCode.hs r1
|     M ./compiler/nativeGen/PprMach.hs r1
|     M ./compiler/nativeGen/RegAllocInfo.hs r1
|
| Wed Aug 22 16:41:58 BST 2007 Norman Ramsey <nr@eecs.harvard.edu>
|   * ListGraph is now a newtype, not a synonym
|     The resultant bookkeeping is unenviable, but the change greatly
|     simplifies our ability to make Cmm things properly Outputable for
|     both list-graph and zipper-graph representations.
|     M ./compiler/cmm/Cmm.hs -5 +3
|     M ./compiler/cmm/CmmCPS.hs -2 +2
|     M ./compiler/cmm/CmmCPSGen.hs -1 +1
|     M ./compiler/cmm/CmmContFlowOpt.hs -3 +3
|     M ./compiler/cmm/CmmCvt.hs -2 +2
|     M ./compiler/cmm/CmmInfo.hs -2 +3
|     M ./compiler/cmm/CmmLint.hs -1 +1
|     M ./compiler/cmm/CmmOpt.hs -2 +2
|     M ./compiler/cmm/PprC.hs -1 +1
|     M ./compiler/cmm/PprCmm.hs -5 +8
|     M ./compiler/cmm/PprCmmZ.hs -7 +1
|     M ./compiler/codeGen/CgMonad.lhs -1 +1
|     M ./compiler/nativeGen/AsmCodeGen.lhs -15 +15
|     M ./compiler/nativeGen/MachCodeGen.hs -2 +2
|     M ./compiler/nativeGen/PositionIndependentCode.hs -6 +6
|     M ./compiler/nativeGen/PprMach.hs -3 +2
|     M ./compiler/nativeGen/RegAllocColor.hs +1
|     M ./compiler/nativeGen/RegAllocLinear.hs -4 +5
|     M ./compiler/nativeGen/RegCoalesce.hs -6 +6
|     M ./compiler/nativeGen/RegLiveness.hs -12 +12
|
| Thu Aug 23 13:44:49 BST 2007 Norman Ramsey <nr@eecs.harvard.edu>
|   * diagnostic assistance in case fromJust fails
|     M ./compiler/nativeGen/MachCodeGen.hs -2 +5
|
| Thu Aug 23 14:07:28 BST 2007 Norman Ramsey <nr@eecs.harvard.edu>
|   * give every block, even the first, a label
|     With branch-chain elimination, the first block of a procedure might
|     be the target of a branch. This actually happens to a dozen or more
|     procedures in the run-time system.
|     M ./compiler/nativeGen/PprMach.hs -8 +3
|
| Fri Aug 24 17:27:04 BST 2007 Norman Ramsey <nr@eecs.harvard.edu>
|   * clean up the code in PprMach
|     M ./compiler/nativeGen/PprMach.hs -16 +14
|
| Fri Aug 24 19:35:03 BST 2007 Norman Ramsey <nr@eecs.harvard.edu>
|   * a bunch of impedance matching to get the compiler to build, plus
|   * the plus is diagnostics for unreachable code, which required
|     moving a lot of prettyprinting code
|     M ./compiler/cmm/Cmm.hs -7 +5
|     M ./compiler/cmm/CmmCPSZ.hs -1 +1
|     M ./compiler/cmm/CmmCvt.hs -8 +8
|     M ./compiler/cmm/CmmParse.y -4 +3
|     M ./compiler/cmm/MkZipCfg.hs -19 +9
|     M ./compiler/cmm/PprCmmZ.hs -118 +4
|     M ./compiler/cmm/ZipCfg.hs -1 +13
|     M ./compiler/cmm/ZipCfgCmm.hs -10 +129
|     M ./compiler/main/HscMain.lhs -4 +4
|     M ./compiler/nativeGen/NCGMonad.hs -2 +2
|     M ./compiler/nativeGen/RegAllocInfo.hs -3 +3
|
| Fri Aug 31 14:38:02 BST 2007 Norman Ramsey <nr@eecs.harvard.edu>
|   * fix a warning about an import
|     M ./compiler/nativeGen/RegAllocColor.hs -1 +1
* Improve GraphColor.colorScanBen.Lippmeier@anu.edu.au2007-09-051-2/+2
| Testing whether a node in the conflict graph is trivially colorable
| (triv) is still a somewhat expensive operation.
|
| When we find a triv node during scanning, even though we remove it and
| its edges from the graph, this is unlikely to make the nodes we've just
| scanned become triv - so there's not much point re-scanning them right
| away.
|
| Scanning now takes place in passes. We scan the whole graph for triv
| nodes and remove all the ones found in a batch before rescanning old
| nodes.
|
| Register allocation for SHA1.lhs now takes (just) 40% of total compile
| time with -O2 -fregs-graph on x86
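The pass-based scanning described above can be sketched on a toy graph. This sketch assumes the textbook simplification that a node is trivially colorable when it has fewer than k neighbours, whereas GHC's trivColorable also accounts for register classes. The graph representation and function names are hypothetical.

```haskell
import Data.List (partition)

type Node  = Int
type Graph = [(Node, [Node])]  -- each node with its current neighbours

-- Simplified triviality test: fewer than k neighbours (an assumption).
isTriv :: Int -> (Node, [Node]) -> Bool
isTriv k (_, neighbours) = length neighbours < k

-- One pass: scan the whole graph, then remove every triv node found
-- in that pass as a batch, together with the edges pointing at them.
scanPass :: Int -> Graph -> ([Node], Graph)
scanPass k graph = (removed, remaining)
  where
    (triv, rest) = partition (isTriv k) graph
    removed      = map fst triv
    remaining    = [(n, filter (`notElem` removed) ns) | (n, ns) <- rest]

-- Repeat passes until a pass finds nothing to remove. Returns the nodes
-- in removal order and whatever could not be simplified.
simplify :: Int -> Graph -> ([Node], Graph)
simplify k graph = case scanPass k graph of
  ([], stuck)       -> ([], stuck)
  (removed, graph') ->
    let (later, stuck) = simplify k graph'
    in  (removed ++ later, stuck)
```

Nodes are only re-examined at the start of the next pass, which avoids re-running the triviality test on a neighbourhood immediately after each single removal.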
* Refactor MachRegs.trivColorable to do unboxed accumulationBen.Lippmeier@anu.edu.au2007-09-051-2/+2
| | | | | | | | | trivColorable was soaking up total 31% time, 41% alloc when compiling SHA1.lhs with -O2 -fregs-graph on x86. Refactoring to use unboxed accumulators and walk directly over the UniqFM holding the set of conflicts reduces this to 17% time, 6% alloc.
* change of representation for GenCmm, GenCmmTop, CmmProcNorman Ramsey2007-09-051-15/+15
| | | | | | | | | The type parameter to a C-- procedure now represents a control-flow graph, not a single instruction. The newtype ListGraph preserves the current representation while enabling other representations and a sensible way of prettyprinting. Except for a few changes in the prettyprinter the new compiler binary should be bit-for-bit identical to the old.
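A heavily simplified sketch of the representation change, assuming cut-down versions of the types involved. The real GenCmmTop and basic block types carry more fields, and GHC uses its own Outputable class rather than Show.

```haskell
-- Simplified basic block: a label and a list of instructions.
data BasicBlock i = BasicBlock Int [i]

-- The old list-of-blocks representation, wrapped in a newtype so that
-- it is one possible graph type among several.
newtype ListGraph i = ListGraph [BasicBlock i]

-- A top-level item, parameterised over static data 'd' and over a
-- whole control-flow graph 'g' rather than a single instruction type.
data GenCmmTop d g
  = CmmData d
  | CmmProc String g  -- procedure name and body (simplified)

-- Because ListGraph is a distinct type, it can carry its own printing
-- instance without clashing with the instance for plain lists.
instance Show i => Show (ListGraph i) where
  show (ListGraph blocks) = unlines (concatMap showBlock blocks)
    where
      showBlock (BasicBlock lbl instrs) =
        ("L" ++ show lbl ++ ":") : map (\i -> "  " ++ show i) instrs

-- Code written against the list representation unwraps the newtype.
procSize :: GenCmmTop d (ListGraph i) -> Int
procSize (CmmData _) = 0
procSize (CmmProc _ (ListGraph blocks)) =
  sum [length instrs | BasicBlock _ instrs <- blocks]
```

Another graph type, for instance a zipper-based one, could be substituted for `g` without changing `GenCmmTop` itself.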