summaryrefslogtreecommitdiff
path: root/compiler/cmm/Hoopl/Dataflow.hs
Commit message (Collapse)AuthorAgeFilesLines
* Module hierarchy: Cmm (cf #13009)Sylvain Henry2020-01-251-441/+0
|
* Add loop level analysis to the NCG backend.klebinger.andreas@gmx.at2019-10-161-3/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | For backends maintaining the CFG during codegen we can now find loops and their nesting level. This is based on the Cmm CFG and dominator analysis. As a result we can estimate edge frequencies a lot better for methods, resulting in far better code layout. Speedup on nofib: ~1.5% Increase in compile times: ~1.9% To make this feasible this commit adds: * Dominator analysis based on the Lengauer-Tarjan Algorithm. * An algorithm estimating global edge frequences from branch probabilities - In CFG.hs A few static branch prediction heuristics: * Expect to take the backedge in loops. * Expect to take the branch NOT exiting a loop. * Expect integer vs constant comparisons to be false. We also treat heap/stack checks special for branch prediction to avoid them being treated as loops.
* Make the C-- O and C types constructors with DataKindsJohn Ericson2019-09-051-1/+3
| | | | | The tightens up the kinds a bit. I use type synnonyms to avoid adding promotion ticks everywhere.
* Replace most occurences of foldl with foldl'.klebinger.andreas@gmx.at2018-08-211-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds foldl' to GhcPrelude and changes must occurences of foldl to foldl'. This leads to better performance especially for quick builds where GHC does not perform strictness analysis. It does change strictness behaviour when we use foldl' to turn a argument list into function applications. But this is only a drawback if code looks ONLY at the last argument but not at the first. And as the benchmarks show leads to fewer allocations in practice at O2. Compiler performance for Nofib: O2 Allocations: -1 s.d. ----- -0.0% +1 s.d. ----- -0.0% Average ----- -0.0% O2 Compile Time: -1 s.d. ----- -2.8% +1 s.d. ----- +1.3% Average ----- -0.8% O0 Allocations: -1 s.d. ----- -0.2% +1 s.d. ----- -0.1% Average ----- -0.2% Test Plan: ci Reviewers: goldfire, bgamari, simonmar, tdammers, monoidal Reviewed By: bgamari, monoidal Subscribers: tdammers, rwbarton, thomie, carter Differential Revision: https://phabricator.haskell.org/D4929
* An overhaul of the SRT representationSimon Marlow2018-05-161-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: - Previously we would hvae a single big table of pointers per module, with a set of bitmaps to reference entries within it. The new representation is identical to a static constructor, which is much simpler for the GC to traverse, and we get to remove the complicated bitmap-traversal code from the GC. - Rewrite all the code to generate SRTs in CmmBuildInfoTables, and document it much better (see Note [SRTs]). This has been something I've wanted to do since we moved to the new code generator, I finally had the opportunity to finish it while on a transatlantic flight recently :) There are a series of 4 diffs: 1. D4632 (this one), which does the bulk of the changes 2. D4633 which adds support for smaller `CmmLabelDiffOff` constants 3. D4634 which takes advantage of D4632 and D4633 to save a word in info tables that have an SRT on x86_64. This is where most of the binary size improvement comes from. 4. D4637 which makes a further optimisation to merge some SRTs with static FUN closures. This adds some complexity and the benefits are fairly modest, so it's not clear yet whether we should do this. Results (after (3), on x86_64) - GHC itself (staticaly linked) is 5.2% smaller - -1.7% binary sizes in nofib, -2.9% module sizes. Full nofib results: P176 - I measured the overhead of traversing all the static objects in a major GC in GHC itself by doing `replicateM_ 1000 performGC` as the first thing in `Main.main`. The new version was 5-10% faster, but the results did vary quite a bit. - I'm not sure if there's a compile-time difference, the results are too unreliable. Test Plan: validate Reviewers: bgamari, michalt, niteria, simonpj, erikd, osa1 Subscribers: thomie, carter Differential Revision: https://phabricator.haskell.org/D4632
* Hoopl: improve postorder calculationMichal Terepeta2018-03-191-18/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | - Fix the naming and comments to indicate that we are calculating *reverse* postorder (and not the standard postorder). - Rewrite the calculation to avoid CPS code. I found it fairly difficult to understand and the new one seems faster (according to nofib, decreases compiler allocations by 0.2%) - Remove `LabelsPtr`, which seems unnecessary and could be *really* confusing. For instance, previously: `postorder_dfs_from <block with label X>` and `postorder_dfs_from <label X>` would actually mean quite different things (and give different results). - Change the `Dataflow` module to always use entry of the graph for reverse postorder calculation. This should be the only change in behavior of this commit. Previously, if the caller provided initial facts for some of the labels, we would use those labels for our postorder calculation. However, I don't think that's correct in general - if the initial facts did not contain the entry of the graph, we would never analyze the blocks reachable from the entry but unreachable from the labels provided with the initial facts. It seems that the only analysis that used this was proc-point analysis, which I think would always include the entry block (so I don't think there's any bug due to this). Signed-off-by: Michal Terepeta <michal.terepeta@gmail.com> Test Plan: ./validate Reviewers: bgamari, simonmar Reviewed By: simonmar Subscribers: rwbarton, thomie, carter Differential Revision: https://phabricator.haskell.org/D4464
* Hoopl.Collections: change right folds to strict left foldsMichal Terepeta2018-02-021-4/+4
| | | | | | | | | | | | | | | | | | | | | | It seems that most uses of these folds should be strict left folds (I could only find a single place that benefits from a right fold). So this removes the existing `setFold`/`mapFold`/`mapFoldWihKey` replaces them with: - `setFoldl`/`mapFoldl`/`mapFoldlWithKey` (strict left folds) - `setFoldr`/`mapFoldr` (for the less common case where a right fold actually makes sense, e.g., `CmmProcPoint`) Signed-off-by: Michal Terepeta <michal.terepeta@gmail.com> Test Plan: ./validate Reviewers: bgamari, simonmar Reviewed By: bgamari Subscribers: rwbarton, thomie, carter, kavon Differential Revision: https://phabricator.haskell.org/D4356
* Use IntSet in DataflowBartosz Nitka2018-01-211-23/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Before this change, a list was used as a substitute for a heap. This led to quadratic behavior on a simple program (see new test case). This change replaces it with IntSet in effect reverting 5a1a2633553. @simonmar said it's fine to revert as long as nofib results are good. Test Plan: new test case: 20% improvement 3x improvement when N=10000 nofib: I run it twice for before and after because the compile time results are noisy. - Compile Allocations: ``` before before re-run after after re-run -1 s.d. ----- -0.0% -0.1% -0.1% +1 s.d. ----- +0.0% +0.1% +0.1% Average ----- +0.0% -0.0% -0.0% ``` - Compile Time: ``` before before re-run after after re-run -1 s.d. ----- -0.1% -2.3% -2.6% +1 s.d. ----- +5.2% +3.7% +4.4% Average ----- +2.5% +0.7% +0.8% ``` I checked each case and couldn't find consistent slow-down/speed-up on compile time. Full results here: P173 Reviewers: simonpj, simonmar, bgamari Reviewed By: bgamari Subscribers: rwbarton, thomie, carter, simonmar GHC Trac Issues: #14667 Differential Revision: https://phabricator.haskell.org/D4329
* compiler: introduce custom "GhcPrelude" PreludeHerbert Valerio Riedel2017-09-191-0/+2
| | | | | | | | | | | | | | | | | | This switches the compiler/ component to get compiled with -XNoImplicitPrelude and a `import GhcPrelude` is inserted in all modules. This is motivated by the upcoming "Prelude" re-export of `Semigroup((<>))` which would cause lots of name clashes in every modulewhich imports also `Outputable` Reviewers: austin, goldfire, bgamari, alanz, simonmar Reviewed By: bgamari Subscribers: goldfire, rwbarton, thomie, mpickering, bgamari Differential Revision: https://phabricator.haskell.org/D3989
* Hoopl: remove dependency on Hoopl packageMichal Terepeta2017-06-231-4/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This copies the subset of Hoopl's functionality needed by GHC to `cmm/Hoopl` and removes the dependency on the Hoopl package. The main motivation for this change is the confusing/noisy interface between GHC and Hoopl: - Hoopl has `Label` which is GHC's `BlockId` but different than GHC's `CLabel` - Hoopl has `Unique` which is different than GHC's `Unique` - Hoopl has `Unique{Map,Set}` which are different than GHC's `Uniq{FM,Set}` - GHC has its own specialized copy of `Dataflow`, so `cmm/Hoopl` is needed just to filter the exposed functions (filter out some of the Hoopl's and add the GHC ones) With this change, we'll be able to simplify this significantly. It'll also be much easier to do invasive changes (Hoopl is a public package on Hackage with users that depend on the current behavior) This should introduce no changes in functionality - it merely copies the relevant code. Signed-off-by: Michal Terepeta <michal.terepeta@gmail.com> Test Plan: ./validate Reviewers: austin, bgamari, simonmar Reviewed By: bgamari, simonmar Subscribers: simonpj, kavon, rwbarton, thomie Differential Revision: https://phabricator.haskell.org/D3616
* cmm/CmmLayoutStack: avoid generating unnecessary reloadsMichal Terepeta2017-06-191-1/+114
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This tries to be more precise when generating reloads of local registers in proc points. Previously we'd reload all local registers that were live. But we used liveness information that assumed local registers survive native calls. For the purpose of reloading registers this is an overapproximation and might lead to generating huge amounts of unnecessary reloads (in case there's another proc point before the register is used). This change takes the approach of moving the generation of reloads to a second pass over the Cmm, which allows to recompute the liveness and can use the knowledge that local registers do *not* survive calls. This leads to generating only useful reloads. For an extreme example where this helps a lot please see T3294. This should also fix #7198 Finally, this re-introduces the code to do Cmm rewriting using in `Dataflow` module (with the difference that we know operate on a whole block at a time). Signed-off-by: Michal Terepeta <michal.terepeta@gmail.com> Reviewers: austin, bgamari, simonmar Reviewed By: simonmar Subscribers: kavon, rwbarton, thomie GHC Trac Issues: #7198 Differential Revision: https://phabricator.haskell.org/D3586
* Dataflow: use IntSet for mkDepBlocksMichal Terepeta2017-05-081-24/+36
| | | | | | | | | | | | | | | | | | | | | | Using `IntSet` instead of `[Int]` is nicer since it gets rid of appending to a list (in the backward case) and folding over it is ordered. I also added a comment about how `mkDepBlocks` works since its behavior can be a bit surprising at first sight (it took me some time to see that it's doing the right thing ;) Signed-off-by: Michal Terepeta <michal.terepeta@gmail.com> Test Plan: ./validate Reviewers: austin, bgamari, simonmar Reviewed By: bgamari Subscribers: rwbarton, thomie Differential Revision: https://phabricator.haskell.org/D3530
* BlockId: remove BlockMap and BlockSet synonymsMichal Terepeta2016-12-081-1/+0
| | | | | | | | | | | | | | | | | | | | This continues removal of `BlockId` module in favor of Hoopl's `Label`. Most of the changes here are mechanical, apart from the orphan `Outputable` instances for `LabelMap` and `LabelSet`. For now I've moved them to `cmm/Hoopl`, since it's already trying to manage all imports from Hoopl (to avoid any collisions). Signed-off-by: Michal Terepeta <michal.terepeta@gmail.com> Test Plan: validate Reviewers: bgamari, austin, simonmar Reviewed By: simonmar Subscribers: thomie Differential Revision: https://phabricator.haskell.org/D2800
* Hoopl/Dataflow: use block-oriented interfaceMichal Terepeta2016-11-291-208/+77
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This introduces the new interface for dataflow analysis, where transfer functions operate on a whole basic block. The main changes are: - Hoopl.Dataflow: implement the new interface and remove the old code; expose a utility function to do a strict fold over the nodes of a basic block (for analyses that do want to look at all the nodes) - Refactor all the analyses to use the new interface. One of the nice effects is that we can remove the `analyzeFwdBlocks` hack that ignored the middle nodes (that existed for analyses that didn't need to go over all the nodes). Now this is no longer a special case and fits well with the new interface. Signed-off-by: Michal Terepeta <michal.terepeta@gmail.com> Test Plan: validate, earlier version of the patch had assertions comparing the results with the old implementation Reviewers: erikd, austin, simonmar, hvr, goldfire, bgamari Reviewed By: bgamari Subscribers: goldfire, erikd, thomie Differential Revision: https://phabricator.haskell.org/D2754
* Hoopl/Dataflow: make the module more self-containedMichal Terepeta2016-11-021-6/+62
| | | | | | | | | | | | | | | | | | | | | | | | | This makes the GHC's Dataflow module more self-contained by also forking the `DataflowLattice` (instead of only the analysis algorithm). Effects/benefits: - We no longer need to use the deprecated Hoopl functions (and can remove `-fno-warn-warnings-deprecations` from two modules). - We can remove the unnecessary `Label` parameter of `JoinFun` (already ignored in all our implementations). - We no longer mix Hoopl's `Dataflow` module and GHC's one. - We can replace some calls to lazy folds in Hoopl with the strict ones (see `joinOutFacts` and `mkFactBase`). Signed-off-by: Michal Terepeta <michal.terepeta@gmail.com> Test Plan: validate Reviewers: austin, simonmar, bgamari Reviewed By: bgamari Subscribers: thomie Differential Revision: https://phabricator.haskell.org/D2660
* CmmUtils: remove the last dataflow functionsMichal Terepeta2016-10-261-29/+52
| | | | | | | | | | | | | | | | | | | | This commit: - Moves the remaining few methods concerned with dataflow analysis from `CmmUtils` to `Hoopl.Dataflow`. - Refactors the code to not use `FwdPass` and simply pass `FwdTransfer` and `DataflowLattice` directly. Signed-off-by: Michal Terepeta <michal.terepeta@gmail.com> Test Plan: validate Reviewers: austin, simonmar, bgamari Reviewed By: simonmar, bgamari Subscribers: thomie Differential Revision: https://phabricator.haskell.org/D2634
* cmm/Hoopl/Dataflow: minor cleanupMichal Terepeta2016-10-221-26/+40
| | | | | | | | | | | | | | | | | This doesn't have any functional changes, it simply removes one unnecessary top binding and improves the comments. Signed-off-by: Michal Terepeta <michal.terepeta@gmail.com> Test Plan: ./validate Reviewers: austin, bgamari, simonmar Reviewed By: simonmar Subscribers: thomie Differential Revision: https://phabricator.haskell.org/D2619
* cmm/Hoopl/Dataflow: remove unused codeMichal Terepeta2016-10-181-525/+5
| | | | | | | | | | | | | | | | | | | | We had *a lot* of code copied from Hoopl that is for rewriting. But GHC doesn't use it (it only uses some forked Hoopl code for analysis). So we can safely kill all this code and make it much easier to refactor and improve the parts that we do use. Signed-off-by: Michal Terepeta <michal.terepeta@gmail.com> Test Plan: ./validate Reviewers: austin, simonmar, bgamari Reviewed By: bgamari Subscribers: thomie Differential Revision: https://phabricator.haskell.org/D2612
* Add kind equalities to GHC.Richard Eisenberg2015-12-111-10/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This implements the ideas originally put forward in "System FC with Explicit Kind Equality" (ICFP'13). There are several noteworthy changes with this patch: * We now have casts in types. These change the kind of a type. See new constructor `CastTy`. * All types and all constructors can be promoted. This includes GADT constructors. GADT pattern matches take place in type family equations. In Core, types can now be applied to coercions via the `CoercionTy` constructor. * Coercions can now be heterogeneous, relating types of different kinds. A coercion proving `t1 :: k1 ~ t2 :: k2` proves both that `t1` and `t2` are the same and also that `k1` and `k2` are the same. * The `Coercion` type has been significantly enhanced. The documentation in `docs/core-spec/core-spec.pdf` reflects the new reality. * The type of `*` is now `*`. No more `BOX`. * Users can write explicit kind variables in their code, anywhere they can write type variables. For backward compatibility, automatic inference of kind-variable binding is still permitted. * The new extension `TypeInType` turns on the new user-facing features. * Type families and synonyms are now promoted to kinds. This causes trouble with parsing `*`, leading to the somewhat awkward new `HsAppsTy` constructor for `HsType`. This is dispatched with in the renamer, where the kind `*` can be told apart from a type-level multiplication operator. Without `-XTypeInType` the old behavior persists. With `-XTypeInType`, you need to import `Data.Kind` to get `*`, also known as `Type`. * The kind-checking algorithms in TcHsType have been significantly rewritten to allow for enhanced kinds. * The new features are still quite experimental and may be in flux. * TODO: Several open tickets: #11195, #11196, #11197, #11198, #11203. * TODO: Update user manual. Tickets addressed: #9017, #9173, #7961, #10524, #8566, #11142. Updates Haddock submodule.
* Kill redundant patternsBen Gamari2015-12-031-5/+0
| | | | | George's new exhaustiveness checker now realizes these are impossible. Yay!
* Remove redundant constraints in the compiler itself, found by ↵Simon Peyton Jones2015-01-061-1/+1
| | | | -fwarn-redundant-constraints
* Remove a stray Trustworthy flag in ghc.David Terei2014-11-121-1/+0
|
* Add LANGUAGE pragmas to compiler/ source filesHerbert Valerio Riedel2014-05-151-4/+9
| | | | | | | | | | | | | | | | | | In some cases, the layout of the LANGUAGE/OPTIONS_GHC lines has been reorganized, while following the convention, to - place `{-# LANGUAGE #-}` pragmas at the top of the source file, before any `{-# OPTIONS_GHC #-}`-lines. - Moreover, if the list of language extensions fit into a single `{-# LANGUAGE ... -#}`-line (shorter than 80 characters), keep it on one line. Otherwise split into `{-# LANGUAGE ... -#}`-lines for each individual language extension. In both cases, try to keep the enumeration alphabetically ordered. (The latter layout is preferable as it's more diff-friendly) While at it, this also replaces obsolete `{-# OPTIONS ... #-}` pragma occurences by `{-# OPTIONS_GHC ... #-}` pragmas.
* Remove LANGUAGE pragrams implied by Haskell2010Herbert Valerio Riedel2014-05-141-1/+1
| | | | | | | | | Haskell2010 implies (at least) EmptyDataDecls, ForeignFunctionInterface, PatternGuards, DoAndIfThenElse, and RelaxedPolyRec. This is a follow-up to dd92e2179e3171a0630834b773c08d416101980d Signed-off-by: Herbert Valerio Riedel <hvr@gnu.org>
* GHC 7.4 is now required for building HEADIan Lynagh2012-07-201-7/+0
|
* Fix build with GHC 7.0Ian Lynagh2012-07-131-0/+3
|
* Rename BTail -> BCons, BHead -> BSnocSimon Marlow2012-07-061-8/+8
|
* Remove "fuel", adapt to Hoopl changes, fix warningsSimon Marlow2012-07-051-87/+84
|
* mainly tidyupSimon Marlow2012-07-031-80/+78
|
* some optimisationsSimon Marlow2012-03-151-80/+48
|
* remove SCCsSimon Marlow2012-01-261-1/+1
|
* Use an ordered list for the work list, which is a bit quicker than IntSetSimon Marlow2012-01-251-23/+35
|
* Further optimisations to the fixpoint algorithmSimon Marlow2012-01-251-39/+19
|
* make it compile with earlier GHCsSimon Marlow2012-01-231-6/+9
|
* snapshot of latest improvementsSimon Marlow2012-01-231-67/+109
|
* snapshot: fastest version so farSimon Marlow2012-01-201-75/+121
|
* add missing filesSimon Marlow2012-01-171-0/+841