path: root/compiler/codeGen
Commit message (Author, Age, Files, Lines)
...
* Remove redundant import, revealed by the fix to #7963 (Simon Peyton Jones, 2013-06-18, 1 file, -1/+0)
* Revert "Add support for byte endian swapping for Word 16/32/64." (Simon Peyton Jones, 2013-06-11, 1 file, -12/+0)
    This reverts commit 1c5b0511a89488f5280523569d45ee61c0d09ffa.
* Add support for byte endian swapping for Word 16/32/64. (Ian Lynagh, 2013-06-09, 1 file, -0/+12)
    * Exposes bSwap{,16,32,64}# primops
    * Add a new machops MO_BSwap
    * Use a Stg implementation (hs_bswap{16,32,64}) for other implementation in NCG.
    * Generate bswap in X86 NCG for 32 and 64 bits, and for 16 bits, bswap+shr instead of using xchg.
    * Generate llvm.bswap intrinsics in llvm codegen.

    Patch from Vincent Hanquez.
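For readers unfamiliar with the operation behind the bSwap primops, the following is a minimal Haskell sketch of 16-bit and 32-bit byte swapping written with plain shifts and masks. The names `swap16` and `swap32` are hypothetical; they are not the primops or the hs_bswap helpers from the patch and only illustrate the intended semantics.

```haskell
import Data.Bits (shiftL, shiftR, (.&.), (.|.))
import Data.Word (Word16, Word32)

-- Hypothetical helper: swap the two bytes of a 16-bit word.
-- The left shift discards the high byte because Word16 is 16 bits wide.
swap16 :: Word16 -> Word16
swap16 w = (w `shiftL` 8) .|. (w `shiftR` 8)

-- Hypothetical helper: reverse the four bytes of a 32-bit word.
swap32 :: Word32 -> Word32
swap32 w =
      (w `shiftL` 24)                       -- byte 0 moves to byte 3
  .|. ((w .&. 0x0000FF00) `shiftL` 8)       -- byte 1 moves to byte 2
  .|. ((w `shiftR` 8) .&. 0x0000FF00)       -- byte 2 moves to byte 1
  .|. (w `shiftR` 24)                       -- byte 3 moves to byte 0
```

In user code, recent versions of base already export byteSwap16, byteSwap32 and byteSwap64 from Data.Word, so a hand-written version like this is only useful for illustration.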
* Wibbles (merg-os) to ticky-ticky (Simon Peyton Jones, 2013-06-06, 2 files, -3/+3)
* Implement cardinality analysis (Simon Peyton Jones, 2013-06-06, 2 files, -18/+51)
    This major patch implements the cardinality analysis described in our paper "Higher order cardinality analysis". It is joint work with Ilya Sergey and Dimitrios Vytiniotis.

    The basic idea is to augment the absence-analysis part of the demand analyser so that it can tell when something is used
        never
        at most once
        some other way

    The "at most once" information is used
        a) to enable transformations, and in particular to identify one-shot lambdas
        b) to allow updates on thunks to be omitted.

    There are two new flags, mainly there so you can do performance comparisons:
        -fkill-absence   stops GHC doing absence analysis at all
        -fkill-one-shot  stops GHC spotting one-shot lambdas and single-entry thunks

    The big changes are:

    * The Demand type is substantially refactored. In particular the UseDmd is factored as follows

          data UseDmd
            = UCall Count UseDmd
            | UProd [MaybeUsed]
            | UHead
            | Used

          data MaybeUsed = Abs | Use Count UseDmd

          data Count = One | Many

      Notice that UCall recurses straight to UseDmd, whereas UProd goes via MaybeUsed.

      The "Count" embodies the "at most once" or "many" idea.

    * The demand analyser itself was refactored a lot

    * The previously ad-hoc stuff in the occurrence analyser for foldr and build goes away entirely. Before, if we had build (\cn -> ...x... ) then the "\cn" was hackily made one-shot (by spotting 'build' as special). That's essential to allow x to be inlined. Now the occurrence analyser propagates info gotten from 'build's strictness signature (so build isn't special); and that strictness sig is in turn derived entirely automatically. Much nicer!

    * The ticky stuff is improved to count single-entry thunks separately.

    One shortcoming is that there is no DEBUG way to spot if an allegedly-single-entry thunk is actually entered more than once. It would not be hard to generate a bit of code to check for this, and it would be reassuring. But it's fiddly and I have not done it.

    Despite all this fuss, the performance numbers are rather under-whelming. See the paper for more discussion.

              nucleic2     -0.8%   -10.9%     0.10     0.10    +0.0%
                sphere     -0.7%    -1.5%     0.08     0.08    +0.0%
        --------------------------------------------------------------
                   Min     -4.7%   -10.9%    -9.3%    -9.3%   -50.0%
                   Max     -0.4%    +0.5%    +2.2%    +2.3%    +7.4%
        Geometric Mean     -0.8%    -0.2%    -1.3%    -1.3%    -1.8%

    I don't quite know how much credence to place in the runtime changes, but movement seems generally in the right direction.
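To make the shape of the refactored usage demand concrete, here is a small self-contained Haskell sketch. The three data types are transcribed from the commit message; the `Cardinality` type and the functions `cardinality` and `isOneShotCall` are hypothetical helpers added for illustration and are not claimed to exist in GHC.

```haskell
-- Types as given in the commit message.
data Count = One | Many
  deriving (Eq, Show)

data UseDmd
  = UCall Count UseDmd   -- recurses straight to UseDmd
  | UProd [MaybeUsed]    -- goes via MaybeUsed for each component
  | UHead
  | Used
  deriving (Eq, Show)

data MaybeUsed = Abs | Use Count UseDmd
  deriving (Eq, Show)

-- Hypothetical: the three cases the message lists for how something is used.
data Cardinality = Never | AtMostOnce | SomeOtherWay
  deriving (Eq, Show)

-- Hypothetical helper: read the top-level cardinality off a MaybeUsed.
cardinality :: MaybeUsed -> Cardinality
cardinality Abs          = Never
cardinality (Use One _)  = AtMostOnce
cardinality (Use Many _) = SomeOtherWay

-- Hypothetical helper: a call demand with count One corresponds to the
-- "called at most once" situation that identifies a one-shot lambda.
isOneShotCall :: UseDmd -> Bool
isOneShotCall (UCall One _) = True
isOneShotCall _             = False
```

For example, `cardinality (Use One Used)` evaluates to `AtMostOnce`, which is the information that lets a thunk update be omitted according to the message.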
* Comments and white space only (Simon Peyton Jones, 2013-06-06, 1 file, -2/+2)
* Fix the GHC package DLL-splitting (Ian Lynagh, 2013-05-14, 1 file, -1/+2)
    There's now an internal -dll-split flag, which we use to tell GHC how the GHC package is split into 2 separate DLLs. This is used by Packages.isDllName to determine whether a call is within the same DLL, or whether it is a call to another DLL.
* extended ticky to also track "let"s that are not conventional closures (Nicolas Frisby, 2013-05-02, 6 files, -47/+71)
    This includes selector, ap, and constructor thunks. They are still guarded by the -ticky-dyn-thk flag.

    (This is 024df664b600a with a small bug fix.)
* In CMM, only allow foreign calls to labels, not arbitrary expressions (Ian Lynagh, 2013-04-24, 3 files, -10/+8)
    I'm not sure if we want to make this change permanently, but for now it fixes the unreg build.

    I've also removed some redundant special-case code that generated prototypes for foreign functions. The standard pprTempAndExternDecls now generates them.
* Small refactoring in StgCmmExtCode (Ian Lynagh, 2013-04-23, 1 file, -6/+7)
* Don't duplicate decls unnecessarily in the environment (Ian Lynagh, 2013-04-23, 1 file, -1/+1)
    In loopDecls, as far as I can see the globalDecls will always already be in the environment, so don't add them again.
* Make CmmParse abstract (Ian Lynagh, 2013-04-23, 1 file, -1/+1)
* Revert "extended ticky to also track "let"s that are not closures" (Nicolas Frisby, 2013-04-12, 6 files, -69/+47)
    This reverts commit 024df664b600a622cb8189ccf31789688505fc1c.

    Of course I gaff on my last day...
* extended ticky to also track "let"s that are not closures (Nicolas Frisby, 2013-04-12, 6 files, -47/+69)
    This includes selector, ap, and constructor thunks. They are still guarded by the -ticky-dyn-thk flag.
* added ticky counters for heap and stack checks (Nicolas Frisby, 2013-04-11, 2 files, -1/+11)
* ticky enhancements (Nicolas Frisby, 2013-03-29, 9 files, -348/+614)
    * the new StgCmmArgRep module breaks a dependency cycle; I also untabified it, but made no real changes
    * updated the documentation in the wiki and change the user guide to point there
    * moved the allocation enters for ticky and CCS to after the heap check
      * I left LDV where it was, which was before the heap check at least once, since I have no idea what it is
    * standardized all (active?) ticky alloc totals to bytes
    * in order to avoid double counting StgCmmLayout.adjustHpBackwards no longer bumps ALLOC_HEAP_ctr
    * I resurrected the SLOW_CALL counters
      * the new module StgCmmArgRep breaks cyclic dependency between Layout and Ticky (which the SLOW_CALL counters cause)
      * renamed them SLOW_CALL_fast_<pattern> and VERY_SLOW_CALL
    * added ALLOC_RTS_ctr and _tot ticky counters
      * eg allocation by Storage.c:allocate or a BUILD_PAP in stg_ap_*_info
    * resurrected ticky counters for ALLOC_THK, ALLOC_PAP, and ALLOC_PRIM
    * added -ticky and -DTICKY_TICKY in ways.mk for debug ways
    * added a ticky counter for total LNE entries
    * new flags for ticky: -ticky-allocd -ticky-dyn-thunk -ticky-LNE
      * all off by default
      * -ticky-allocd: tracks allocation *of* closure in addition to allocation *by* that closure
      * -ticky-dyn-thunk tracks dynamic thunks as if they were functions
      * -ticky-LNE tracks LNEs as if they were functions
    * updated the ticky report format, including making the argument categories (more?) accurate again
    * the printed name for things in the report include the unique of their ticky parent as well as if they are not top-level
* Typo-fix for panic. (Edward Z. Yang, 2013-03-11, 1 file, -1/+1)
    Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
* Remove unnecessary DynFlags arg to mkCgIdInfo (Simon Peyton Jones, 2013-03-09, 1 file, -4/+4)
* Remove stale, commented-out code about heap checks (Simon Peyton Jones, 2013-03-09, 1 file, -83/+0)
* Remove unused functions cmmConstrTag, cmmGetTag (Simon Peyton Jones, 2013-03-09, 1 file, -2/+2)
    Patch offered by Boris Sukholitko <boriss@gmail.com>

    Trac #7757
* Remove cg_tag from CgIdInfo (Boris Sukholitko, 2013-03-09, 2 files, -7/+3)
* Detabify StgCmmEnv (Boris Sukholitko, 2013-03-09, 1 file, -63/+55)
* Detabify StgCmmMonad (Boris Sukholitko, 2013-03-09, 1 file, -175/+168)
* Satisfy the invariant on CmmUnsafeForeignCall arguments (Simon Marlow, 2013-03-06, 1 file, -30/+23)
    There was potentially a bug here, but no actual failures were identified in the wild. See Note [Register Parameter Passing].
* Primitive bitwise operations on Int# (Fixes #7689) (Jan Stolarek, 2013-02-18, 1 file, -0/+4)
* some more typos (Gabor Greif, 2013-02-02, 1 file, -1/+1)
* Add prefetch primops. (Geoffrey Mainland, 2013-02-01, 1 file, -0/+47)
* Add support for passing SSE vectors in registers. (Geoffrey Mainland, 2013-02-01, 2 files, -5/+20)
    This patch adds support for 6 XMM registers on x86-64 which overlap with the F and D registers and may hold 128-bit wide SIMD vectors. Because there is not a good way to attach type information to STG registers, we aggressively bitcast in the LLVM back-end.
* Add the Int64X2# primitive type and associated primops. (Geoffrey Mainland, 2013-02-01, 1 file, -0/+37)
* Add the DoubleX2# primitive type and associated primops. (Geoffrey Mainland, 2013-02-01, 1 file, -0/+36)
* Add the Int32X4# primitive type and associated primops. (Paul Monday, 2013-02-01, 1 file, -0/+37)
* Add the Float32X4# primitive type and associated primops. (Geoffrey Mainland, 2013-02-01, 1 file, -137/+337)
    This patch lays the groundwork needed for primop support for SIMD vectors. In addition to the groundwork, we add support for the FloatX4# primitive type and associated primops.

    * Add the FloatX4# primitive type and associated primops.
    * Add CodeGen support for Float vectors.
    * Compile vector operations to LLVM vector operations in the LLVM code generator.
    * Make the x86 native backend fail gracefully when encountering vector primops.
    * Only generate primop wrappers for vector primops when using LLVM.
* Always pass vector values on the stack. (Geoffrey Mainland, 2013-02-01, 1 file, -28/+36)
    Vector values are now always passed on the stack. This isn't particularly efficient, but it will have to do for now.
* Tidy up: move info-table related stuff to CmmInfo (Simon Marlow, 2013-01-23, 4 files, -121/+4)
    Prep for #709
* White space only (Simon Peyton Jones, 2013-01-15, 1 file, -1/+1)
* Inline some FastBytes/ByteString wrappers (Ian Lynagh, 2012-12-14, 1 file, -1/+2)
    Working towards removing FastBytes
* Implement word2Float# and word2Double# (Johan Tibell, 2012-12-13, 1 file, -0/+6)
* Code-size optimisation for top-level indirections (#7308) (Simon Marlow, 2012-11-19, 4 files, -19/+48)
    Top-level indirections are often generated when there is a cast, e.g.

        foo :: T
        foo = bar `cast` (some coercion)

    For these we were generating a full-blown CAF, which is a fair chunk of code.

    This patch makes these indirections generate a single IND_STATIC closure (4 words) instead. This is exactly what the CAF would evaluate to eventually anyway; we're just shortcutting the whole process.
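As an illustration of where such a binding might come from in source Haskell, here is a minimal sketch using a newtype. The names `Age`, `bar` and `foo` are hypothetical. The assumption is that, after desugaring and simplification, `foo` becomes roughly `bar` wrapped in a cast; whether a given GHC version actually emits an IND_STATIC closure for it depends on the compiler version and optimisation settings.

```haskell
import Data.Coerce (coerce)

-- Hypothetical newtype; it shares its runtime representation with Int.
newtype Age = Age Int
  deriving (Eq, Show)

bar :: Int
bar = 42

-- At the Core level this is, roughly, `bar` under a coercion from Int
-- to Age: a top-level indirection of the kind the commit describes.
foo :: Age
foo = coerce bar
```

Since the coercion has no runtime content, `foo` and `bar` can share one value, which is why a plain indirection suffices in place of a full CAF.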
* Fix the Slow calling convention (#7192) (Simon Marlow, 2012-11-13, 4 files, -21/+18)
    The Slow calling convention passes the closure in R1, but we were ignoring this and hoping it would work, which it often did. However, this bug seems to have been the cause of #7192, because the graph-colouring allocator is more sensitive to having correct liveness information on jumps.
* Remove OldCmm, convert backends to consume new Cmm (Simon Marlow, 2012-11-12, 1 file, -59/+28)
    This removes the OldCmm data type and the CmmCvt pass that converts new Cmm to OldCmm. The backends (NCGs, LLVM and C) have all been converted to consume new Cmm.

    The main difference between the two data types is that conditional branches in new Cmm have both true/false successors, whereas in OldCmm the false case was a fallthrough. To generate slightly better code we occasionally need to invert a conditional to ensure that the branch-not-taken becomes a fallthrough; this was previously done in CmmCvt, and it is now done in CmmContFlowOpt.

    We could go further and use the Hoopl Block representation for native code, which would mean that we could use Hoopl's postorderDfs and analyses for native code, but for now I've left it as is, using the old ListGraph representation for native code.
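The difference between the two branch forms, and the inversion step, can be sketched with a toy model. Everything below is hypothetical and heavily simplified: `Cond`, `Branch`, `OldBranch`, `invert` and `toFallthrough` are illustration-only names and are not GHC's Cmm types or the code in CmmContFlowOpt.

```haskell
type Label = String

-- Toy condition: a name plus a flag saying whether it is negated.
data Cond = Cond String Bool
  deriving (Eq, Show)

-- New-style branch: both the true and the false successor are explicit.
data Branch = Branch Cond Label Label
  deriving (Eq, Show)

-- Old-style branch: jump to the label if the condition holds,
-- otherwise fall through to whichever block is laid out next.
data OldBranch = OldBranch Cond Label
  deriving (Eq, Show)

-- Negate the condition and swap the successors; the meaning is unchanged.
invert :: Branch -> Branch
invert (Branch (Cond name neg) t f) = Branch (Cond name (not neg)) f t

-- Given the label of the block laid out next, make the not-taken
-- successor the fallthrough, inverting the conditional if needed.
toFallthrough :: Label -> Branch -> Maybe OldBranch
toFallthrough next b@(Branch c t f)
  | f == next = Just (OldBranch c t)
  | t == next = toFallthrough next (invert b)
  | otherwise = Nothing  -- neither successor follows; an extra jump would be needed
```

In this model, a branch whose true successor is the next block gets inverted so that the branch instruction only fires in the less convenient case, which is the "slightly better code" the message refers to.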
* loadThreadState should set HpAlloc=0 (Simon Marlow, 2012-11-05, 1 file, -1/+7)
* Fix popcnt calls (Ian Lynagh, 2012-11-01, 1 file, -10/+5)
    We don't want to narrow the argument size before making the foreign call: Word8 still gets passed as a Word-sized argument.
* Whitespace only in codeGen/StgCmmPrim.hs (Ian Lynagh, 2012-11-01, 1 file, -90/+83)
* Draw STG F and D registers from the same pool of available SSE registers on x86-64. (Geoffrey Mainland, 2012-10-30, 1 file, -2/+8)
    On x86-64 F and D registers are both drawn from SSE registers, so there is no reason not to draw them from the same pool of available SSE registers. This means that whereas previously a function could only receive two Double arguments in registers even if it did not have any Float arguments, now it can receive up to 6 arguments that are any mix of Float and Double in registers.

    This patch breaks the LLVM back end. The next patch will fix this breakage.
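The effect of sharing one pool can be shown with a toy model of argument placement. This is a sketch only: `ArgTy`, `Loc`, `assignSeparate` and `assignShared` are hypothetical names, not GHC's calling-convention code, and the Float quota passed to `assignSeparate` in the usage note is an assumption (the message only states the old Double limit of two and the new shared limit of six).

```haskell
-- Toy argument types: Float or Double.
data ArgTy = F | D
  deriving (Eq, Show)

-- Where an argument ends up in this model.
data Loc = InReg | OnStack
  deriving (Eq, Show)

-- Separate pools: Float and Double arguments each have their own quota.
assignSeparate :: Int -> Int -> [ArgTy] -> [Loc]
assignSeparate nF nD = go 0 0
  where
    go :: Int -> Int -> [ArgTy] -> [Loc]
    go _ _ [] = []
    go f d (F : rest)
      | f < nF    = InReg   : go (f + 1) d rest
      | otherwise = OnStack : go f d rest
    go f d (D : rest)
      | d < nD    = InReg   : go f (d + 1) rest
      | otherwise = OnStack : go f d rest

-- Shared pool: any mix of Float and Double takes the next free register.
assignShared :: Int -> [ArgTy] -> [Loc]
assignShared poolSize = zipWith place [1 ..]
  where
    place :: Int -> ArgTy -> Loc
    place i _
      | i <= poolSize = InReg
      | otherwise     = OnStack
```

With a Double quota of two, `assignSeparate 4 2 [D, D, D]` places the third Double on the stack, whereas `assignShared 6 [D, D, D]` keeps all three in registers.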
* Attach global register liveness info to Cmm procedures. (Geoffrey Mainland, 2012-10-30, 6 files, -17/+17)
    All Cmm procedures now include the set of global registers that are live on procedure entry, i.e., the global registers used to pass arguments to the procedure. Only global registers that are used to pass arguments are included in this list.
* Remove the old codegen (Simon Marlow, 2012-10-19, 25 files, -10495/+13)
    Except for CgUtils.fixStgRegisters that is used in the NCG and LLVM backends, and should probably be moved somewhere else.
* Some alpha renaming (Ian Lynagh, 2012-10-16, 24 files, -53/+53)
    Mostly d -> g (matching DynFlag -> GeneralFlag). Also renamed if* to when*, matching the Haskell if/when names.
* Fix copyArray# bug in new code generator (Roman Leshchinskiy, 2012-10-08, 1 file, -17/+22)
* Fix copyArray# bug in old code generator (Roman Leshchinskiy, 2012-10-08, 1 file, -16/+19)
* expand tabs (Simon Marlow, 2012-10-08, 1 file, -58/+58)