| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For backends maintaining the CFG during codegen
we can now find loops and their nesting level.
This is based on the Cmm CFG and dominator analysis.
As a result we can estimate edge frequencies a lot better
for methods, resulting in far better code layout.
Speedup on nofib: ~1.5%
Increase in compile times: ~1.9%
To make this feasible this commit adds:
* Dominator analysis based on the Lengauer-Tarjan Algorithm.
* An algorithm estimating global edge frequences from branch
probabilities - In CFG.hs
A few static branch prediction heuristics:
* Expect to take the backedge in loops.
* Expect to take the branch NOT exiting a loop.
* Expect integer vs constant comparisons to be false.
We also treat heap/stack checks special for branch prediction
to avoid them being treated as loops.
|
|
|
|
|
| |
The tightens up the kinds a bit. I use type synnonyms to avoid adding
promotion ticks everywhere.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds foldl' to GhcPrelude and changes must occurences
of foldl to foldl'. This leads to better performance especially
for quick builds where GHC does not perform strictness analysis.
It does change strictness behaviour when we use foldl' to turn
a argument list into function applications. But this is only a
drawback if code looks ONLY at the last argument but not at the first.
And as the benchmarks show leads to fewer allocations in practice
at O2.
Compiler performance for Nofib:
O2 Allocations:
-1 s.d. ----- -0.0%
+1 s.d. ----- -0.0%
Average ----- -0.0%
O2 Compile Time:
-1 s.d. ----- -2.8%
+1 s.d. ----- +1.3%
Average ----- -0.8%
O0 Allocations:
-1 s.d. ----- -0.2%
+1 s.d. ----- -0.1%
Average ----- -0.2%
Test Plan: ci
Reviewers: goldfire, bgamari, simonmar, tdammers, monoidal
Reviewed By: bgamari, monoidal
Subscribers: tdammers, rwbarton, thomie, carter
Differential Revision: https://phabricator.haskell.org/D4929
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
- Previously we would hvae a single big table of pointers per module,
with a set of bitmaps to reference entries within it. The new
representation is identical to a static constructor, which is much
simpler for the GC to traverse, and we get to remove the complicated
bitmap-traversal code from the GC.
- Rewrite all the code to generate SRTs in CmmBuildInfoTables, and
document it much better (see Note [SRTs]). This has been something
I've wanted to do since we moved to the new code generator, I
finally had the opportunity to finish it while on a transatlantic
flight recently :)
There are a series of 4 diffs:
1. D4632 (this one), which does the bulk of the changes
2. D4633 which adds support for smaller `CmmLabelDiffOff` constants
3. D4634 which takes advantage of D4632 and D4633 to save a word in
info tables that have an SRT on x86_64. This is where most of the
binary size improvement comes from.
4. D4637 which makes a further optimisation to merge some SRTs with
static FUN closures. This adds some complexity and the benefits
are fairly modest, so it's not clear yet whether we should do this.
Results (after (3), on x86_64)
- GHC itself (staticaly linked) is 5.2% smaller
- -1.7% binary sizes in nofib, -2.9% module sizes. Full nofib results: P176
- I measured the overhead of traversing all the static objects in a
major GC in GHC itself by doing `replicateM_ 1000 performGC` as the
first thing in `Main.main`. The new version was 5-10% faster, but
the results did vary quite a bit.
- I'm not sure if there's a compile-time difference, the results are
too unreliable.
Test Plan: validate
Reviewers: bgamari, michalt, niteria, simonpj, erikd, osa1
Subscribers: thomie, carter
Differential Revision: https://phabricator.haskell.org/D4632
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- Fix the naming and comments to indicate that we are calculating
*reverse* postorder (and not the standard postorder).
- Rewrite the calculation to avoid CPS code. I found it fairly
difficult to understand and the new one seems faster (according to
nofib, decreases compiler allocations by 0.2%)
- Remove `LabelsPtr`, which seems unnecessary and could be *really*
confusing. For instance, previously:
`postorder_dfs_from <block with label X>`
and
`postorder_dfs_from <label X>`
would actually mean quite different things (and give different
results).
- Change the `Dataflow` module to always use entry of the graph for
reverse postorder calculation. This should be the only change in
behavior of this commit.
Previously, if the caller provided initial facts for some of the
labels, we would use those labels for our postorder calculation.
However, I don't think that's correct in general - if the initial
facts did not contain the entry of the graph, we would never analyze
the blocks reachable from the entry but unreachable from the labels
provided with the initial facts. It seems that the only analysis that
used this was proc-point analysis, which I think would always include
the entry block (so I don't think there's any bug due to this).
Signed-off-by: Michal Terepeta <michal.terepeta@gmail.com>
Test Plan: ./validate
Reviewers: bgamari, simonmar
Reviewed By: simonmar
Subscribers: rwbarton, thomie, carter
Differential Revision: https://phabricator.haskell.org/D4464
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It seems that most uses of these folds should be strict left folds
(I could only find a single place that benefits from a right fold).
So this removes the existing `setFold`/`mapFold`/`mapFoldWihKey`
replaces them with:
- `setFoldl`/`mapFoldl`/`mapFoldlWithKey` (strict left folds)
- `setFoldr`/`mapFoldr` (for the less common case where a right fold
actually makes sense, e.g., `CmmProcPoint`)
Signed-off-by: Michal Terepeta <michal.terepeta@gmail.com>
Test Plan: ./validate
Reviewers: bgamari, simonmar
Reviewed By: bgamari
Subscribers: rwbarton, thomie, carter, kavon
Differential Revision: https://phabricator.haskell.org/D4356
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Before this change, a list was used as a substitute for a heap.
This led to quadratic behavior on a simple program (see new
test case).
This change replaces it with IntSet in effect reverting
5a1a2633553. @simonmar said it's fine to revert as long as nofib
results are good.
Test Plan:
new test case:
20% improvement
3x improvement when N=10000
nofib:
I run it twice for before and after because the compile time
results are noisy.
- Compile Allocations:
```
before before re-run after after re-run
-1 s.d. ----- -0.0% -0.1% -0.1%
+1 s.d. ----- +0.0% +0.1% +0.1%
Average ----- +0.0% -0.0% -0.0%
```
- Compile Time:
```
before before re-run after after re-run
-1 s.d. ----- -0.1% -2.3% -2.6%
+1 s.d. ----- +5.2% +3.7% +4.4%
Average ----- +2.5% +0.7% +0.8%
```
I checked each case and couldn't find consistent slow-down/speed-up on
compile time. Full results here: P173
Reviewers: simonpj, simonmar, bgamari
Reviewed By: bgamari
Subscribers: rwbarton, thomie, carter, simonmar
GHC Trac Issues: #14667
Differential Revision: https://phabricator.haskell.org/D4329
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This switches the compiler/ component to get compiled with
-XNoImplicitPrelude and a `import GhcPrelude` is inserted in all
modules.
This is motivated by the upcoming "Prelude" re-export of
`Semigroup((<>))` which would cause lots of name clashes in every
modulewhich imports also `Outputable`
Reviewers: austin, goldfire, bgamari, alanz, simonmar
Reviewed By: bgamari
Subscribers: goldfire, rwbarton, thomie, mpickering, bgamari
Differential Revision: https://phabricator.haskell.org/D3989
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This copies the subset of Hoopl's functionality needed by GHC to
`cmm/Hoopl` and removes the dependency on the Hoopl package.
The main motivation for this change is the confusing/noisy interface
between GHC and Hoopl:
- Hoopl has `Label` which is GHC's `BlockId` but different than
GHC's `CLabel`
- Hoopl has `Unique` which is different than GHC's `Unique`
- Hoopl has `Unique{Map,Set}` which are different than GHC's
`Uniq{FM,Set}`
- GHC has its own specialized copy of `Dataflow`, so `cmm/Hoopl` is
needed just to filter the exposed functions (filter out some of the
Hoopl's and add the GHC ones)
With this change, we'll be able to simplify this significantly.
It'll also be much easier to do invasive changes (Hoopl is a public
package on Hackage with users that depend on the current behavior)
This should introduce no changes in functionality - it merely
copies the relevant code.
Signed-off-by: Michal Terepeta <michal.terepeta@gmail.com>
Test Plan: ./validate
Reviewers: austin, bgamari, simonmar
Reviewed By: bgamari, simonmar
Subscribers: simonpj, kavon, rwbarton, thomie
Differential Revision: https://phabricator.haskell.org/D3616
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This tries to be more precise when generating reloads of local
registers in proc points. Previously we'd reload all local registers
that were live. But we used liveness information that assumed local
registers survive native calls. For the purpose of reloading registers
this is an overapproximation and might lead to generating huge amounts
of unnecessary reloads (in case there's another proc point before the
register is used).
This change takes the approach of moving the generation of reloads to
a second pass over the Cmm, which allows to recompute the liveness and
can use the knowledge that local registers do *not* survive calls.
This leads to generating only useful reloads. For an extreme example
where this helps a lot please see T3294. This should also fix #7198
Finally, this re-introduces the code to do Cmm rewriting using in
`Dataflow` module (with the difference that we know operate on a whole
block at a time).
Signed-off-by: Michal Terepeta <michal.terepeta@gmail.com>
Reviewers: austin, bgamari, simonmar
Reviewed By: simonmar
Subscribers: kavon, rwbarton, thomie
GHC Trac Issues: #7198
Differential Revision: https://phabricator.haskell.org/D3586
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Using `IntSet` instead of `[Int]` is nicer since it gets rid of
appending to a list (in the backward case) and folding over it is
ordered.
I also added a comment about how `mkDepBlocks` works since its
behavior can be a bit surprising at first sight (it took me some time
to see that it's doing the right thing ;)
Signed-off-by: Michal Terepeta <michal.terepeta@gmail.com>
Test Plan: ./validate
Reviewers: austin, bgamari, simonmar
Reviewed By: bgamari
Subscribers: rwbarton, thomie
Differential Revision: https://phabricator.haskell.org/D3530
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This continues removal of `BlockId` module in favor of Hoopl's `Label`.
Most of the changes here are mechanical, apart from the orphan
`Outputable` instances for `LabelMap` and `LabelSet`. For now I've
moved them to `cmm/Hoopl`, since it's already trying to manage all
imports from Hoopl (to avoid any collisions).
Signed-off-by: Michal Terepeta <michal.terepeta@gmail.com>
Test Plan: validate
Reviewers: bgamari, austin, simonmar
Reviewed By: simonmar
Subscribers: thomie
Differential Revision: https://phabricator.haskell.org/D2800
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This introduces the new interface for dataflow analysis, where transfer
functions operate on a whole basic block.
The main changes are:
- Hoopl.Dataflow: implement the new interface and remove the old code;
expose a utility function to do a strict fold over the nodes of a
basic block (for analyses that do want to look at all the nodes)
- Refactor all the analyses to use the new interface.
One of the nice effects is that we can remove the `analyzeFwdBlocks`
hack that ignored the middle nodes (that existed for analyses that
didn't need to go over all the nodes). Now this is no longer a special
case and fits well with the new interface.
Signed-off-by: Michal Terepeta <michal.terepeta@gmail.com>
Test Plan:
validate, earlier version of the patch had assertions
comparing the results with the old implementation
Reviewers: erikd, austin, simonmar, hvr, goldfire, bgamari
Reviewed By: bgamari
Subscribers: goldfire, erikd, thomie
Differential Revision: https://phabricator.haskell.org/D2754
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This makes the GHC's Dataflow module more self-contained by also
forking the `DataflowLattice` (instead of only the analysis
algorithm). Effects/benefits:
- We no longer need to use the deprecated Hoopl functions (and can
remove `-fno-warn-warnings-deprecations` from two modules).
- We can remove the unnecessary `Label` parameter of `JoinFun` (already
ignored in all our implementations).
- We no longer mix Hoopl's `Dataflow` module and GHC's one.
- We can replace some calls to lazy folds in Hoopl with the strict ones
(see `joinOutFacts` and `mkFactBase`).
Signed-off-by: Michal Terepeta <michal.terepeta@gmail.com>
Test Plan: validate
Reviewers: austin, simonmar, bgamari
Reviewed By: bgamari
Subscribers: thomie
Differential Revision: https://phabricator.haskell.org/D2660
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit:
- Moves the remaining few methods concerned with dataflow analysis
from `CmmUtils` to `Hoopl.Dataflow`.
- Refactors the code to not use `FwdPass` and simply pass `FwdTransfer`
and `DataflowLattice` directly.
Signed-off-by: Michal Terepeta <michal.terepeta@gmail.com>
Test Plan: validate
Reviewers: austin, simonmar, bgamari
Reviewed By: simonmar, bgamari
Subscribers: thomie
Differential Revision: https://phabricator.haskell.org/D2634
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This doesn't have any functional changes, it simply removes one
unnecessary top binding and improves the comments.
Signed-off-by: Michal Terepeta <michal.terepeta@gmail.com>
Test Plan: ./validate
Reviewers: austin, bgamari, simonmar
Reviewed By: simonmar
Subscribers: thomie
Differential Revision: https://phabricator.haskell.org/D2619
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We had *a lot* of code copied from Hoopl that is for rewriting. But GHC
doesn't use it (it only uses some forked Hoopl code for analysis).
So we can safely kill all this code and make it much easier to refactor
and improve the parts that we do use.
Signed-off-by: Michal Terepeta <michal.terepeta@gmail.com>
Test Plan: ./validate
Reviewers: austin, simonmar, bgamari
Reviewed By: bgamari
Subscribers: thomie
Differential Revision: https://phabricator.haskell.org/D2612
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This implements the ideas originally put forward in
"System FC with Explicit Kind Equality" (ICFP'13).
There are several noteworthy changes with this patch:
* We now have casts in types. These change the kind
of a type. See new constructor `CastTy`.
* All types and all constructors can be promoted.
This includes GADT constructors. GADT pattern matches
take place in type family equations. In Core,
types can now be applied to coercions via the
`CoercionTy` constructor.
* Coercions can now be heterogeneous, relating types
of different kinds. A coercion proving `t1 :: k1 ~ t2 :: k2`
proves both that `t1` and `t2` are the same and also that
`k1` and `k2` are the same.
* The `Coercion` type has been significantly enhanced.
The documentation in `docs/core-spec/core-spec.pdf` reflects
the new reality.
* The type of `*` is now `*`. No more `BOX`.
* Users can write explicit kind variables in their code,
anywhere they can write type variables. For backward compatibility,
automatic inference of kind-variable binding is still permitted.
* The new extension `TypeInType` turns on the new user-facing
features.
* Type families and synonyms are now promoted to kinds. This causes
trouble with parsing `*`, leading to the somewhat awkward new
`HsAppsTy` constructor for `HsType`. This is dispatched with in
the renamer, where the kind `*` can be told apart from a
type-level multiplication operator. Without `-XTypeInType` the
old behavior persists. With `-XTypeInType`, you need to import
`Data.Kind` to get `*`, also known as `Type`.
* The kind-checking algorithms in TcHsType have been significantly
rewritten to allow for enhanced kinds.
* The new features are still quite experimental and may be in flux.
* TODO: Several open tickets: #11195, #11196, #11197, #11198, #11203.
* TODO: Update user manual.
Tickets addressed: #9017, #9173, #7961, #10524, #8566, #11142.
Updates Haddock submodule.
|
|
|
|
|
| |
George's new exhaustiveness checker now realizes these are impossible.
Yay!
|
|
|
|
| |
-fwarn-redundant-constraints
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In some cases, the layout of the LANGUAGE/OPTIONS_GHC lines has been
reorganized, while following the convention, to
- place `{-# LANGUAGE #-}` pragmas at the top of the source file, before
any `{-# OPTIONS_GHC #-}`-lines.
- Moreover, if the list of language extensions fit into a single
`{-# LANGUAGE ... -#}`-line (shorter than 80 characters), keep it on one
line. Otherwise split into `{-# LANGUAGE ... -#}`-lines for each
individual language extension. In both cases, try to keep the
enumeration alphabetically ordered.
(The latter layout is preferable as it's more diff-friendly)
While at it, this also replaces obsolete `{-# OPTIONS ... #-}` pragma
occurences by `{-# OPTIONS_GHC ... #-}` pragmas.
|
|
|
|
|
|
|
|
|
| |
Haskell2010 implies (at least) EmptyDataDecls, ForeignFunctionInterface,
PatternGuards, DoAndIfThenElse, and RelaxedPolyRec.
This is a follow-up to dd92e2179e3171a0630834b773c08d416101980d
Signed-off-by: Herbert Valerio Riedel <hvr@gnu.org>
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
|
|