summaryrefslogtreecommitdiff
path: root/compiler/cmm/CmmNode.hs
Commit message (Collapse)AuthorAgeFilesLines
* Module hierarchy: Cmm (cf #13009)Sylvain Henry2020-01-251-724/+0
|
* Fix typos, via a Levenshtein-style correctorBrian Wignall2020-01-041-1/+1
|
* Fix typos, using Wikipedia list of common typosBrian Wignall2019-11-281-1/+1
|
* Module hierarchy: StgToCmm (#13009)Sylvain Henry2019-09-101-4/+4
| | | | | | Add StgToCmm module hierarchy. Platform modules that are used in several other places (NCG, LLVM codegen, Cmm transformations) are put into GHC.Platform.
* Fix a Note name in CmmNodeÖmer Sinan Ağacan2019-06-191-1/+1
| | | | | | ("Continuation BlockIds" is referenced in CmmProcPoint) [skip ci]
* NCG: New code layout algorithm.Andreas Klebinger2018-11-171-1/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This patch implements a new code layout algorithm. It has been tested for x86 and is disabled on other platforms. Performance varies slightly be CPU/Machine but in general seems to be better by around 2%. Nofib shows only small differences of about +/- ~0.5% overall depending on flags/machine performance in other benchmarks improved significantly. Other benchmarks includes at least the benchmarks of: aeson, vector, megaparsec, attoparsec, containers, text and xeno. While the magnitude of gains differed three different CPUs where tested with all getting faster although to differing degrees. I tested: Sandy Bridge(Xeon), Haswell, Skylake * Library benchmark results summarized: * containers: ~1.5% faster * aeson: ~2% faster * megaparsec: ~2-5% faster * xml library benchmarks: 0.2%-1.1% faster * vector-benchmarks: 1-4% faster * text: 5.5% faster On average GHC compile times go down, as GHC compiled with the new layout is faster than the overhead introduced by using the new layout algorithm, Things this patch does: * Move code responsilbe for block layout in it's own module. * Move the NcgImpl Class into the NCGMonad module. * Extract a control flow graph from the input cmm. * Update this cfg to keep it in sync with changes during asm codegen. This has been tested on x64 but should work on x86. Other platforms still use the old codelayout. * Assign weights to the edges in the CFG based on type and limited static analysis which are then used for block layout. * Once we have the final code layout eliminate some redundant jumps. In particular turn a sequences of: jne .foo jmp .bar foo: into je bar foo: .. Test Plan: ci Reviewers: bgamari, jmct, jrtc27, simonmar, simonpj, RyanGlScott Reviewed By: RyanGlScott Subscribers: RyanGlScott, trommler, jmct, carter, thomie, rwbarton GHC Trac Issues: #15124 Differential Revision: https://phabricator.haskell.org/D4726
* compiler: introduce custom "GhcPrelude" PreludeHerbert Valerio Riedel2017-09-191-1/+2
| | | | | | | | | | | | | | | | | | This switches the compiler/ component to get compiled with -XNoImplicitPrelude and a `import GhcPrelude` is inserted in all modules. This is motivated by the upcoming "Prelude" re-export of `Semigroup((<>))` which would cause lots of name clashes in every modulewhich imports also `Outputable` Reviewers: austin, goldfire, bgamari, alanz, simonmar Reviewed By: bgamari Subscribers: goldfire, rwbarton, thomie, mpickering, bgamari Differential Revision: https://phabricator.haskell.org/D3989
* Hoopl: remove dependency on Hoopl packageMichal Terepeta2017-06-231-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This copies the subset of Hoopl's functionality needed by GHC to `cmm/Hoopl` and removes the dependency on the Hoopl package. The main motivation for this change is the confusing/noisy interface between GHC and Hoopl: - Hoopl has `Label` which is GHC's `BlockId` but different than GHC's `CLabel` - Hoopl has `Unique` which is different than GHC's `Unique` - Hoopl has `Unique{Map,Set}` which are different than GHC's `Uniq{FM,Set}` - GHC has its own specialized copy of `Dataflow`, so `cmm/Hoopl` is needed just to filter the exposed functions (filter out some of the Hoopl's and add the GHC ones) With this change, we'll be able to simplify this significantly. It'll also be much easier to do invasive changes (Hoopl is a public package on Hackage with users that depend on the current behavior) This should introduce no changes in functionality - it merely copies the relevant code. Signed-off-by: Michal Terepeta <michal.terepeta@gmail.com> Test Plan: ./validate Reviewers: austin, bgamari, simonmar Reviewed By: bgamari, simonmar Subscribers: simonpj, kavon, rwbarton, thomie Differential Revision: https://phabricator.haskell.org/D3616
* Cmm: Add support for undefined unwinding statementsBen Gamari2017-02-081-4/+4
| | | | | | | | | | | | | | And use to mark `stg_stack_underflow_frame`, which we are unable to determine a caller from. To simplify parsing at the moment we steal the `return` keyword to indicate an undefined unwind value. Perhaps this should be revisited. Reviewers: scpmw, simonmar, austin, erikd Subscribers: dfeuer, thomie Differential Revision: https://phabricator.haskell.org/D2738
* Generalize CmmUnwind and pass unwind information through NCGBen Gamari2017-02-081-4/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As discussed in D1532, Trac Trac #11337, and Trac Trac #11338, the stack unwinding information produced by GHC is currently quite approximate. Essentially we assume that register values do not change at all within a basic block. While this is somewhat true in normal Haskell code, blocks containing foreign calls often break this assumption. This results in unreliable call stacks, especially in the code containing foreign calls. This is worse than it sounds as unreliable unwinding information can at times result in segmentation faults. This patch set attempts to improve this situation by tracking unwinding information with finer granularity. By dispensing with the assumption of one unwinding table per block, we allow the compiler to accurately represent the areas surrounding foreign calls. Towards this end we generalize the representation of unwind information in the backend in three ways, * Multiple CmmUnwind nodes can occur per block * CmmUnwind nodes can now carry unwind information for multiple registers (while not strictly necessary; this makes emitting unwinding information a bit more convenient in the compiler) * The NCG backend is given an opportunity to modify the unwinding records since it may need to make adjustments due to, for instance, native calling convention requirements for foreign calls (see #11353). This sets the stage for resolving #11337 and #11338. Test Plan: Validate Reviewers: scpmw, simonmar, austin, erikd Subscribers: qnikst, thomie Differential Revision: https://phabricator.haskell.org/D2741
* Minor cleanup of foldRegs{Used,Defd}Michal Terepeta2016-11-291-18/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This makes the two functions strict in the accumulator - it seems that there are only two users of those functions: `CmmLive` and `CmmSink` and in both cases the strict fold fits better. The commit also removes a few unused functions (`filterRegsUsed`), instances (for `Maybe` and `RegSet`) and gets rid of unnecessary inculde of `HsVersions.h`. The performance effect of avoiding unnecessary thunks is mostly negligible, although we do allocate a tiny bit less (nofib's section on compile allocations): ``` -1 s.d. ----- -0.2% +1 s.d. ----- -0.1% Average ----- -0.2% ``` Signed-off-by: Michal Terepeta <michal.terepeta@gmail.com> Test Plan: validate Reviewers: simonmar, austin, bgamari Reviewed By: bgamari Subscribers: thomie Differential Revision: https://phabricator.haskell.org/D2723
* Delete Ord UniqueBartosz Nitka2016-06-301-6/+16
| | | | | | | | | | | | | | | | | | | | | | Ord Unique can be a source of invisible, accidental nondeterminism as explained in Note [No Ord for Unique]. This removes it, leaving a note with rationale. It's unfortunate that I had to write Ord instances for codegen data structures by hand, but I believe that it's a right trade-off here. Test Plan: ./validate Reviewers: simonmar, austin, bgamari Reviewed By: simonmar Subscribers: thomie Differential Revision: https://phabricator.haskell.org/D2370 GHC Trac Issues: #4012
* CmmNode: Make CmmTickScope's Unique strictBen Gamari2016-06-181-1/+1
| | | | | There is no reason why we need laziness here and making it strict enables unpacking.
* Warn about simplifiable class constraintsSimon Peyton Jones2016-04-221-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Provoked by Trac #11948, this patch adds a new warning to GHC -Wsimplifiable-class-constraints It warns if you write a class constraint in a type signature that can be simplified by an existing instance declaration. Almost always this means you should simplify it right now; type inference is very fragile without it, as #11948 shows. I've put the warning as on-by-default, but I suppose that if there are howls of protest we can move it out (as happened for -Wredundant-constraints. It actually found an example of an over-complicated context in CmmNode. Quite a few tests use these weird contexts to trigger something else, so I had to suppress the warning in those. The 'haskeline' library has a few occurrences of the warning (which I think should be fixed), so I switched it off for that library in warnings.mk. The warning itself is done in TcValidity.check_class_pred. HOWEVER, when type inference fails we get a type error; and the error suppresses the (informative) warning. So as things stand, the warning only happens when it doesn't cause a problem. Not sure what to do about this, but this patch takes us forward, I think.
* Annotate CmmBranch with an optional likely targetSimon Marlow2015-09-231-9/+11
| | | | | | | | | | | | | | | | | Summary: This allows the code generator to give hints to later code generation steps about which branch is most likely to be taken. Right now it is only taken into account in one place: a special case in CmmContFlowOpt that swapped branches over to maximise the chance of fallthrough, which is now disabled when there is a likelihood setting. Test Plan: validate Reviewers: austin, simonpj, bgamari, ezyang, tibbe Subscribers: thomie Differential Revision: https://phabricator.haskell.org/D1273
* Refactor the story around switches (#10137)Joachim Breitner2015-03-301-8/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This re-implements the code generation for case expressions at the Stg → Cmm level, both for data type cases as well as for integral literal cases. (Cases on float are still treated as before). The goal is to allow for fancier strategies in implementing them, for a cleaner separation of the strategy from the gritty details of Cmm, and to run this later than the Common Block Optimization, allowing for one way to attack #10124. The new module CmmSwitch contains a number of notes explaining this changes. For example, it creates larger consecutive jump tables than the previous code, if possible. nofib shows little significant overall improvement of runtime. The rather large wobbling comes from changes in the code block order (see #8082, not much we can do about it). But the decrease in code size alone makes this worthwhile. ``` Program Size Allocs Runtime Elapsed TotalMem Min -1.8% 0.0% -6.1% -6.1% -2.9% Max -0.7% +0.0% +5.6% +5.7% +7.8% Geometric Mean -1.4% -0.0% -0.3% -0.3% +0.0% ``` Compilation time increases slightly: ``` -1 s.d. ----- -2.0% +1 s.d. ----- +2.5% Average ----- +0.3% ``` The test case T783 regresses a lot, but it is the only one exhibiting any regression. The cause is the changed order of branches in an if-then-else tree, which makes the hoople data flow analysis traverse the blocks in a suboptimal order. Reverting that gets rid of this regression, but has a consistent, if only very small (+0.2%), negative effect on runtime. So I conclude that this test is an extreme outlier and no reason to change the code. Differential Revision: https://phabricator.haskell.org/D720
* Eliminate so-called "silent superclass parameters"Simon Peyton Jones2014-12-231-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The purpose of silent superclass parameters was to solve the awkward problem of superclass dictinaries being bound to bottom. See THE PROBLEM in Note [Recursive superclasses] in TcInstDcls Although the silent-superclass idea worked, * It had non-local consequences, and had effects even in Haddock, where we had to discard silent parameters before displaying instance declarations * It had unexpected peformance costs, shown up by Trac #3064 and its test case. In monad-transformer code, when constructing a Monad dictionary you had to pass an Applicative dictionary; and to construct that you neede a Functor dictionary. Yet these extra dictionaries were often never used. (All this got much worse when we added Applicative as a superclass of Monad.) Test T3064 compiled *far* faster after silent superclasses were eliminated. * It introduced new bugs. For example SilentParametersOverlapping, T5051, and T7862, all failed to compile because of instance overlap directly because of the silent-superclass trick. So this patch takes a new approach, which I worked out with Dimitrios in the closing hours before Christmas. It is described in detail in THE PROBLEM in Note [Recursive superclasses] in TcInstDcls. Seems to work great! Quite a bit of knock-on effect * The main implementation work is in tcSuperClasses in TcInstDcls Everything else is fall-out * IdInfo.DFunId no longer needs its n-silent argument * Ditto IDFunId in IfaceSyn * Hence interface file format changes * Now that DFunIds do not have silent superclass parameters, printing out instance declarations is simpler. There is tiny knock-on effect in Haddock, so that submodule is updated * I realised that when computing the "size of a dictionary type" in TcValidity.sizePred, we should be rather conservative about type functions, which can arbitrarily increase the size of a type. Hence the new datatype TypeSize, which has a TSBig constructor for "arbitrarily big". * instDFunType moves from TcSMonad to Inst * Interestingly, CmmNode and CmmExpr both now need a non-silent (Ord r) in a couple of instance declarations. These were previously silent but must now be explicit. * Quite a bit of wibbling in error messages
* Some Dwarf generation fixesPeter Wortmann2014-12-181-4/+7
| | | | | | | | - Make abbrev offset absolute on Non-Mac systems - Add another termination byte at the end of the abbrev section (readelf complains) - Scope combination was wrong for the simpler cases - Shouldn't have a "global/" in front of all scopes
* Add unwind information to CmmPeter Wortmann2014-12-161-0/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Unwind information allows the debugger to discover more information about a program state, by allowing it to "reconstruct" other states of the program. In practice, this means that we explain to the debugger how to unravel stack frames, which comes down mostly to explaining how to find their Sp and Ip register values. * We declare yet another new constructor for CmmNode - and this time there's actually little choice, as unwind information can and will change mid-block. We don't actually make use of these capabilities, and back-end support would be tricky (generate new labels?), but it feels like the right way to do it. * Even though we only use it for Sp so far, we allow CmmUnwind to specify unwind information for any register. This is pretty cheap and could come in useful in future. * We allow full CmmExpr expressions for specifying unwind values. The advantage here is that we don't have to make up new syntax, and can e.g. use the WDS macro directly. On the other hand, the back-end will now have to simplify the expression until it can sensibly be converted into DWARF byte code - a process which might fail, yielding NCG panics. On the other hand, when you're writing Cmm by hand you really ought to know what you're doing. (From Phabricator D169)
* Tick scopesPeter Wortmann2014-12-161-6/+129
| | | | | | | | | | | | | | | | | | | | | | This patch solves the scoping problem of CmmTick nodes: If we just put CmmTicks into blocks we have no idea what exactly they are meant to cover. Here we introduce tick scopes, which allow us to create sub-scopes and merged scopes easily. Notes: * Given that the code often passes Cmm around "head-less", we have to make sure that its intended scope does not get lost. To keep the amount of passing-around to a minimum we define a CmmAGraphScoped type synonym here that just bundles the scope with a portion of Cmm to be assembled later. * We introduce new scopes at somewhat random places, aligning with getCode calls. This works surprisingly well, but we might have to add new scopes into the mix later on if we find things too be too coarse-grained. (From Phabricator D169)
* Source notes (Cmm support)Peter Wortmann2014-12-161-1/+13
| | | | | | | | | | | | | | | | | | | | This patch adds CmmTick nodes to Cmm code. This is relatively straight-forward, but also not very useful, as many blocks will simply end up with no annotations whatosever. Notes: * We use this design over, say, putting ticks into the entry node of all blocks, as it seems to work better alongside existing optimisations. Now granted, the reason for this is that currently GHC's main Cmm optimisations seem to mainly reorganize and merge code, so this might change in the future. * We have the Cmm parser generate a few source notes as well. This is relatively easy to do - worst part is that it complicates the CmmParse implementation a bit. (From Phabricator D169)
* Add LANGUAGE pragmas to compiler/ source filesHerbert Valerio Riedel2014-05-151-1/+6
| | | | | | | | | | | | | | | | | | In some cases, the layout of the LANGUAGE/OPTIONS_GHC lines has been reorganized, while following the convention, to - place `{-# LANGUAGE #-}` pragmas at the top of the source file, before any `{-# OPTIONS_GHC #-}`-lines. - Moreover, if the list of language extensions fit into a single `{-# LANGUAGE ... -#}`-line (shorter than 80 characters), keep it on one line. Otherwise split into `{-# LANGUAGE ... -#}`-lines for each individual language extension. In both cases, try to keep the enumeration alphabetically ordered. (The latter layout is preferable as it's more diff-friendly) While at it, this also replaces obsolete `{-# OPTIONS ... #-}` pragma occurences by `{-# OPTIONS_GHC ... #-}` pragmas.
* Clarify comments and liberalise stack-check optimisation slightlySimon Peyton Jones2013-10-181-1/+5
| | | | | | The only substantive change here is to change "==" into ">=" in the Note [Always false stack check] code. This is semantically correct, but won't have any practical impact.
* Fix definition of DefinerOfRegs for CmmForeignCallJan Stolarek2013-09-041-5/+74
| | | | And update comments
* Comments and type synonym in CmmSinkJan Stolarek2013-09-031-0/+2
|
* Fix a bug in stack layout with safe foreign calls (#8083)Simon Marlow2013-07-241-5/+6
| | | | | | | We weren't properly tracking the number of stack arguments in the continuation of a foreign call. It happened to work when the continuation was not a join point, but when it was a join point we were using the wrong amount of stack fixup.
* Whitespace only in CmmNodeIan Lynagh2013-04-141-21/+14
|
* Derive instance Eq for CmmNodeGabor Greif2013-04-061-14/+3
|
* commentsSimon Marlow2013-03-051-2/+3
|
* Remove OldCmm, convert backends to consume new CmmSimon Marlow2012-11-121-1/+13
| | | | | | | | | | | | | | | | | | This removes the OldCmm data type and the CmmCvt pass that converts new Cmm to OldCmm. The backends (NCGs, LLVM and C) have all been converted to consume new Cmm. The main difference between the two data types is that conditional branches in new Cmm have both true/false successors, whereas in OldCmm the false case was a fallthrough. To generate slightly better code we occasionally need to invert a conditional to ensure that the branch-not-taken becomes a fallthrough; this was previously done in CmmCvt, and it is now done in CmmContFlowOpt. We could go further and use the Hoopl Block representation for native code, which would mean that we could use Hoopl's postorderDfs and analyses for native code, but for now I've left it as is, using the old ListGraph representation for native code.
* Generalize register sets and liveness calculations.Geoffrey Mainland2012-10-301-11/+49
| | | | | | We would like to calculate register liveness for global registers as well as local registers, so this patch generalizes the existing infrastructure to set the stage.
* Produce new-style Cmm from the Cmm parserSimon Marlow2012-10-081-14/+32
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The main change here is that the Cmm parser now allows high-level cmm code with argument-passing and function calls. For example: foo ( gcptr a, bits32 b ) { if (b > 0) { // we can make tail calls passing arguments: jump stg_ap_0_fast(a); } return (x,y); } More details on the new cmm syntax are in Note [Syntax of .cmm files] in CmmParse.y. The old syntax is still more-or-less supported for those occasional code fragments that really need to explicitly manipulate the stack. However there are a couple of differences: it is now obligatory to give a list of live GlobalRegs on every jump, e.g. jump %ENTRY_CODE(Sp(0)) [R1]; Again, more details in Note [Syntax of .cmm files]. I have rewritten most of the .cmm files in the RTS into the new syntax, except for AutoApply.cmm which is generated by the genapply program: this file could be generated in the new syntax instead and would probably be better off for it, but I ran out of enthusiasm. Some other changes in this batch: - The PrimOp calling convention is gone, primops now use the ordinary NativeNodeCall convention. This means that primops and "foreign import prim" code must be written in high-level cmm, but they can now take more than 10 arguments. - CmmSink now does constant-folding (should fix #7219) - .cmm files now go through the cmmPipeline, and as a result we generate better code in many cases. All the object files generated for the RTS .cmm files are now smaller. Performance should be better too, but I haven't measured it yet. - RET_DYN frames are removed from the RTS, lots of code goes away - we now have some more canned GC points to cover unboxed-tuples with 2-4 pointers, which will reduce code size a little.
* Fix a bug in foldExpDeepSimon Marlow2012-08-311-8/+1
| | | | | This caused the CAF analysis to occasionally miss a CAF sometimes, resulting in a very hard to diagnose crash.
* Foreign calls may clobber caller-saves registersSimon Marlow2012-08-061-8/+28
| | | | See Note [foreign calls clobber GlobalRegs]
* GHC 7.4 is now required for building HEADIan Lynagh2012-07-201-6/+0
|
* Track liveness of GlobalRegs in the new code generatorSimon Marlow2012-07-091-11/+11
| | | | | | This gives the register allocator access to R1.., F1.., D1.. etc. for the new code generator, and is a cheap way to eliminate all the extra "x = R1" assignments that we get from copyIn.
* Delete some unused codeSimon Marlow2012-07-051-4/+0
|
* Remove "fuel", adapt to Hoopl changes, fix warningsSimon Marlow2012-07-051-1/+1
|
* a bit more UNPACKingSimon Marlow2012-03-151-4/+7
|
* remove dead codeSimon Marlow2012-03-151-31/+0
|
* remove unused Conventions (Foreign, Private)Simon Marlow2012-02-131-8/+0
|
* New stack layout algorithmSimon Marlow2012-02-081-2/+2
| | | | | | | | | | | | | Also: - improvements to code generation: push slow-call continuations on the stack instead of generating explicit continuations - remove unused CmmInfo wrapper type (replace with CmmInfoTable) - squash Area and AreaId together, remove now-unused RegSlot - comment out old unused stack-allocation code that no longer compiles after removal of RegSlot
* add mapSuccessorsSimon Marlow2012-02-031-1/+9
|
* don't inline foldExpDeepSimon Marlow2012-01-301-1/+0
|
* optimise foldExpDeepSimon Marlow2012-01-251-1/+10
|
* strictness annotationsSimon Marlow2012-01-231-2/+2
|
* unpack the Label in CmmEntrySimon Marlow2012-01-181-1/+1
|
* More codegen refactoring with simonpjSimon Marlow2011-12-191-0/+5
|
* Use -fwarn-tabs when validatingIan Lynagh2011-11-041-0/+7
| | | | | We only use it for "compiler" sources, i.e. not for libraries. Many modules have a -fno-warn-tabs kludge for now.
* Snapshot of codegen refactoring to share with simonpjSimon Marlow2011-08-251-8/+15
|