| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
|
|
|
|
| |
These kinds of imports are necessary in some cases such as
importing instances of typeclasses or intentionally creating
dependencies in the build system, but '-Wunused-imports' can't
detect when they are no longer needed. This commit removes the
unused ones currently in the code base (not including test files
or submodules), with the hope that doing so may increase
parallelism in the build system by removing unnecessary
dependencies.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch fixes a fairly long-standing bug (dating back to 2015) in
RdrName.bestImport, namely
commit 9376249b6b78610db055a10d05f6592d6bbbea2f
Author: Simon Peyton Jones <simonpj@microsoft.com>
Date: Wed Oct 28 17:16:55 2015 +0000
Fix unused-import stuff in a better way
In that patch got the sense of the comparison back to front, and
thereby failed to implement the unused-import rules described in
Note [Choosing the best import declaration] in RdrName
This led to Trac #13064 and #15393
Fixing this bug revealed a bunch of unused imports in libraries;
the ones in the GHC repo are part of this commit.
The two important changes are
* Fix the bug in bestImport
* Modified the rules by adding (a) in
Note [Choosing the best import declaration] in RdrName
Reason: the previosu rules made Trac #5211 go bad again. And
the new rule (a) makes sense to me.
In unravalling this I also ended up doing a few other things
* Refactor RnNames.ImportDeclUsage to use a [GlobalRdrElt] for the
things that are used, rather than [AvailInfo]. This is simpler
and more direct.
* Rename greParentName to greParent_maybe, to follow GHC
naming conventions
* Delete dead code RdrName.greUsedRdrName
Bumps a few submodules.
Reviewers: hvr, goldfire, bgamari, simonmar, jrtc27
Subscribers: rwbarton, carter
Differential Revision: https://phabricator.haskell.org/D5312
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Use toBlockList instead of revPostorder.
Block elimination works on a given Cmm graph by:
* Getting a list of blocks.
* Looking for duplicates in these blocks.
* Removing all but one instance of duplicates.
There are two (reasonable) ways to get the list of blocks.
* The fast way: `toBlockList`
This just flattens the underlying map into a list.
* The convenient way: `revPostorder`
Start at the entry label, scan for reachable blocks and return
only these. This has the advantage of removing all dead code.
If there is dead code the later is better. Work done on unreachable
blocks is clearly wasted work. However by the point we run the
common block elimination pass the input graph already had all dead code
removed. This is done during control flow optimization in
CmmContFlowOpt which is our first Cmm pass.
This means common block elimination is free to use toBlockList
because revPostorder would return the same blocks. (Although in
a different order).
* Change the triemap used for grouping by a label list
from `(TM.ListMap UniqDFM)` to `ListMap (GenMap LabelMap)`.
* Using GenMap offers leaf compression. Which is a trie
optimization described by the Note [Compressed TrieMap] in
CoreSyn/TrieMap.hs
* Using LabelMap removes the overhead associated with UniqDFM.
This is deterministic since if we have the same input keys the same
LabelMap will be constructed.
Test Plan: ci, profiling output
Reviewers: bgamari, simonmar
Reviewed By: bgamari
Subscribers: dfeuer, thomie, carter
GHC Trac Issues: #15103
Differential Revision: https://phabricator.haskell.org/D4597
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This change makes it possible to generate a static 32-bit relative label
offset on x86_64. Currently we can only generate word-sized label
offsets.
This will be used in D4634 to shrink info tables. See D4632 for more
details.
Test Plan: See D4632
Reviewers: bgamari, niteria, michalt, erikd, jrtc27, osa1
Subscribers: thomie, carter
Differential Revision: https://phabricator.haskell.org/D4633
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit d5c4d46a62ce6a0cfa6440344f707136eff18119.
Please see #14989 for details.
Test Plan: ./validate
Reviewers: bgamari, simonmar
Subscribers: thomie, carter
GHC Trac Issues: #14989
Differential Revision: https://phabricator.haskell.org/D4577
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The sinking pass often gets rid of unnecessary registers
registers/assignements exposing more opportunities for CBE, so this
commit adds a second round of CBE after the sinking pass and should
fix #12915 (and some examples in #14226).
Nofib results:
* Binary size: 0.9% reduction on average
* Compile allocations: 0.7% increase on average
* Runtime: noisy, two separate runs of nofib showed a tiny
reduction on average, (~0.2-0.3%), but I think
this is mostly noise
* Compile time: very noisy, but generally within +/- 0.5% (one
run faster, one slower)
One interesting part of this change is that running CBE invalidates
results of proc-point analysis. But instead of re-doing the whole
analysis, we can use the map that CBE creates for replacing/comparing
block labels (maps a redundant label to a useful one) to update the
results of proc-point analysis. This lowers the overhead compared to the
previous experiment in #12915.
Signed-off-by: Michal Terepeta <michal.terepeta@gmail.com>
Test Plan: ./validate
Reviewers: bgamari, simonmar
Reviewed By: simonmar
Subscribers: rwbarton, thomie, carter
GHC Trac Issues: #12915, #14226
Differential Revision: https://phabricator.haskell.org/D4417
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- Fix the naming and comments to indicate that we are calculating
*reverse* postorder (and not the standard postorder).
- Rewrite the calculation to avoid CPS code. I found it fairly
difficult to understand and the new one seems faster (according to
nofib, decreases compiler allocations by 0.2%)
- Remove `LabelsPtr`, which seems unnecessary and could be *really*
confusing. For instance, previously:
`postorder_dfs_from <block with label X>`
and
`postorder_dfs_from <label X>`
would actually mean quite different things (and give different
results).
- Change the `Dataflow` module to always use entry of the graph for
reverse postorder calculation. This should be the only change in
behavior of this commit.
Previously, if the caller provided initial facts for some of the
labels, we would use those labels for our postorder calculation.
However, I don't think that's correct in general - if the initial
facts did not contain the entry of the graph, we would never analyze
the blocks reachable from the entry but unreachable from the labels
provided with the initial facts. It seems that the only analysis that
used this was proc-point analysis, which I think would always include
the entry block (so I don't think there's any bug due to this).
Signed-off-by: Michal Terepeta <michal.terepeta@gmail.com>
Test Plan: ./validate
Reviewers: bgamari, simonmar
Reviewed By: simonmar
Subscribers: rwbarton, thomie, carter
Differential Revision: https://phabricator.haskell.org/D4464
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This basically replaces all uses of `foldl` with `foldl'`. I've looked
at all the call sites and there doesn't seem to be any reason to prefer
the lazy version.
Signed-off-by: Michal Terepeta <michal.terepeta@gmail.com>
Test Plan: ./validate
Reviewers: bgamari, simonmar
Reviewed By: bgamari
Subscribers: rwbarton, thomie, carter
Differential Revision: https://phabricator.haskell.org/D4463
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
During some recent work on CBE we discovered that `zipWith` is used to
check for equality, but that doesn't quite work if lists are of
different lengths! This was fixed by bgamari, but unfortunately the fix
had to be rolled back due to other changes in CBE in
50adbd7c5fe5894d3e6e2a58b353ed07e5f8949d. Since I wanted to have another
look at CBE anyway, we agreed that the first thing to do would be to
re-introduce the fix.
Sadly I don't have any actual test case that would exercise this.
Signed-off-by: Michal Terepeta <michal.terepeta@gmail.com>
Test Plan: ./validate
Reviewers: bgamari, simonmar
Reviewed By: bgamari
Subscribers: rwbarton, thomie, carter
GHC Trac Issues: #14226
Differential Revision: https://phabricator.haskell.org/D4387
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Trac #14226 noted that the C-- CBE pass frequently fails to
common up semantically identical blocks due to the differences in local
register naming. These patches fixed this by making the pass consider
equality up to alpha-renaming.
However, the new logic failed to consider the possibility that local
register naming *may* matter across multiple blocks. This lead to the
regression #14754. I'll need to do a bit of thinking on a proper
solution to this but in the meantime I'm reverting all four patches.
This reverts commit a27056f9823f8bbe2302f1924b3ab38fd6752e37.
This reverts commit 6f990c54f922beae80362fe62426beededc21290.
This reverts commit 9aa73892e10e90a1799b9277da593e816a827364.
This reverts commit 7920a7d9c53083b234e060a3e72f00b601a46808.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It seems that most uses of these folds should be strict left folds
(I could only find a single place that benefits from a right fold).
So this removes the existing `setFold`/`mapFold`/`mapFoldWihKey`
replaces them with:
- `setFoldl`/`mapFoldl`/`mapFoldlWithKey` (strict left folds)
- `setFoldr`/`mapFoldr` (for the less common case where a right fold
actually makes sense, e.g., `CmmProcPoint`)
Signed-off-by: Michal Terepeta <michal.terepeta@gmail.com>
Test Plan: ./validate
Reviewers: bgamari, simonmar
Reviewed By: bgamari
Subscribers: rwbarton, thomie, carter, kavon
Differential Revision: https://phabricator.haskell.org/D4356
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Ensure that we don't consider lists of equal length to be equal when
they are not. I noticed these while working on the fix for #14361.
Reviewers: austin, simonmar, michalt
Reviewed By: michalt
Subscribers: rwbarton, thomie
GHC Trac Issues: #14361
Differential Revision: https://phabricator.haskell.org/D4153
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously CBE computed equality by taking the lists of middle nodes of
the blocks being compared and zipping them together. It would then map
over this list with the equality relation, and accumulate the result.
However, this is completely wrong: Consider what will happen when we
compare a block with no middle nodes with one with one or more. The
result of `zip` will be empty and consequently the pass may conclude
that the two are indeed equivalent (if their last nodes also match).
This is very bad and the cause of #14361.
The solution I chose was just to write out an explicit recursion, like I
distinctly recall considering doing when I first wrote this code.
Unfortunately I was feeling clever at the time.
Unfortunately this case was just rare enough not to be triggered by the
testsuite. I still need to find a testcase that doesn't have external
dependencies.
Test Plan: Need to find a more minimal testcase
Reviewers: austin, simonmar, michalt
Reviewed By: michalt
Subscribers: michalt, rwbarton, thomie, hvr
GHC Trac Issues: #14361
Differential Revision: https://phabricator.haskell.org/D4152
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Simonpj suggested this as a follow-on to #14226 to avoid code
duplication. This also gives us the ability to CBE cases involving
foreign calls for free.
Test Plan: Validate
Reviewers: austin, simonmar, simonpj
Reviewed By: simonpj
Subscribers: michalt, simonpj, rwbarton, thomie
GHC Trac Issues: #14226
Differential Revision: https://phabricator.haskell.org/D3999
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As noted in #14226, the common block elimination pass currently
implements an extremely strict equivalence relation, demanding that two
blocks are equivalent including the names of their local registers. This
is quite restrictive and severely hampers the effectiveness of the pass.
Here we allow the CBE pass to collapse blocks which are equivalent up to
alpha renaming of locally-bound local registers. This is completely safe
and catches many more duplicate blocks.
Test Plan: Validate
Reviewers: austin, simonmar, michalt
Reviewed By: michalt
Subscribers: rwbarton, thomie
GHC Trac Issues: #14226
Differential Revision: https://phabricator.haskell.org/D3973
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This switches the compiler/ component to get compiled with
-XNoImplicitPrelude and a `import GhcPrelude` is inserted in all
modules.
This is motivated by the upcoming "Prelude" re-export of
`Semigroup((<>))` which would cause lots of name clashes in every
modulewhich imports also `Outputable`
Reviewers: austin, goldfire, bgamari, alanz, simonmar
Reviewed By: bgamari
Subscribers: goldfire, rwbarton, thomie, mpickering, bgamari
Differential Revision: https://phabricator.haskell.org/D3989
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This copies the subset of Hoopl's functionality needed by GHC to
`cmm/Hoopl` and removes the dependency on the Hoopl package.
The main motivation for this change is the confusing/noisy interface
between GHC and Hoopl:
- Hoopl has `Label` which is GHC's `BlockId` but different than
GHC's `CLabel`
- Hoopl has `Unique` which is different than GHC's `Unique`
- Hoopl has `Unique{Map,Set}` which are different than GHC's
`Uniq{FM,Set}`
- GHC has its own specialized copy of `Dataflow`, so `cmm/Hoopl` is
needed just to filter the exposed functions (filter out some of the
Hoopl's and add the GHC ones)
With this change, we'll be able to simplify this significantly.
It'll also be much easier to do invasive changes (Hoopl is a public
package on Hackage with users that depend on the current behavior)
This should introduce no changes in functionality - it merely
copies the relevant code.
Signed-off-by: Michal Terepeta <michal.terepeta@gmail.com>
Test Plan: ./validate
Reviewers: austin, bgamari, simonmar
Reviewed By: bgamari, simonmar
Subscribers: simonpj, kavon, rwbarton, thomie
Differential Revision: https://phabricator.haskell.org/D3616
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We don't want unwind information to affect the code we produce.
Consequently we need to ensure that CBE ignores unwind nodes for the
purposes of equality.
Test Plan: Validate
Reviewers: scpmw, simonmar, austin
Reviewed By: simonmar
Subscribers: thomie
Differential Revision: https://phabricator.haskell.org/D2739
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This continues removal of `BlockId` module in favor of Hoopl's `Label`.
Most of the changes here are mechanical, apart from the orphan
`Outputable` instances for `LabelMap` and `LabelSet`. For now I've
moved them to `cmm/Hoopl`, since it's already trying to manage all
imports from Hoopl (to avoid any collisions).
Signed-off-by: Michal Terepeta <michal.terepeta@gmail.com>
Test Plan: validate
Reviewers: bgamari, austin, simonmar
Reviewed By: simonmar
Subscribers: thomie
Differential Revision: https://phabricator.haskell.org/D2800
|
|
|
|
|
|
|
|
|
| |
Bit-for-bit reproducible binaries are not a goal for now,
so this is just marking places that could be a problem.
Doing this will allow eltsUFM to be removed and will
leave only nonDetEltsUFM.
GHC Trac: #4012
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Ord Unique can be a source of invisible, accidental
nondeterminism as explained in Note [No Ord for Unique].
This removes it, leaving a note with rationale.
It's unfortunate that I had to write Ord instances for
codegen data structures by hand, but I believe that it's a
right trade-off here.
Test Plan: ./validate
Reviewers: simonmar, austin, bgamari
Reviewed By: simonmar
Subscribers: thomie
Differential Revision: https://phabricator.haskell.org/D2370
GHC Trac Issues: #4012
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This allows the code generator to give hints to later code generation
steps about which branch is most likely to be taken. Right now it
is only taken into account in one place: a special case in
CmmContFlowOpt that swapped branches over to maximise the chance of
fallthrough, which is now disabled when there is a likelihood setting.
Test Plan: validate
Reviewers: austin, simonpj, bgamari, ezyang, tibbe
Subscribers: thomie
Differential Revision: https://phabricator.haskell.org/D1273
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, the hash function used to cut down the number of block
comparisons did not take local registers into account, causing far too
many similar, but different bocks to be considered candidates for the
(expensive!) comparision.
Adding register to the hash takes CmmCommonBlockElim's share of the
runtime of the example in #10397 from 17% to 2.5%, and eliminates all
unwanted hash collisions.
This patch also replaces the fancy trie by a plain Data.Map. It turned
out to be not performance critical, so this simplifies the code.
Differential Revision: https://phabricator.haskell.org/D896
|
|
|
|
|
|
|
|
|
| |
This is an attempt to improve the situation described in #10397, where
the linear scan of possible candidates for commoning up is far too
expensive. There is (ever) more room for improvement, but this is a
start.
Differential Revision: https://phabricator.haskell.org/D892
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This re-implements the code generation for case expressions at the Stg →
Cmm level, both for data type cases as well as for integral literal
cases. (Cases on float are still treated as before).
The goal is to allow for fancier strategies in implementing them, for a
cleaner separation of the strategy from the gritty details of Cmm, and
to run this later than the Common Block Optimization, allowing for one
way to attack #10124. The new module CmmSwitch contains a number of
notes explaining this changes. For example, it creates larger
consecutive jump tables than the previous code, if possible.
nofib shows little significant overall improvement of runtime. The
rather large wobbling comes from changes in the code block order
(see #8082, not much we can do about it). But the decrease in code size
alone makes this worthwhile.
```
Program Size Allocs Runtime Elapsed TotalMem
Min -1.8% 0.0% -6.1% -6.1% -2.9%
Max -0.7% +0.0% +5.6% +5.7% +7.8%
Geometric Mean -1.4% -0.0% -0.3% -0.3% +0.0%
```
Compilation time increases slightly:
```
-1 s.d. ----- -2.0%
+1 s.d. ----- +2.5%
Average ----- +0.3%
```
The test case T783 regresses a lot, but it is the only one exhibiting
any regression. The cause is the changed order of branches in an
if-then-else tree, which makes the hoople data flow analysis traverse
the blocks in a suboptimal order. Reverting that gets rid of this
regression, but has a consistent, if only very small (+0.2%), negative
effect on runtime. So I conclude that this test is an extreme outlier
and no reason to change the code.
Differential Revision: https://phabricator.haskell.org/D720
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Unwind information allows the debugger to discover more information
about a program state, by allowing it to "reconstruct" other states of
the program. In practice, this means that we explain to the debugger
how to unravel stack frames, which comes down mostly to explaining how
to find their Sp and Ip register values.
* We declare yet another new constructor for CmmNode - and this time
there's actually little choice, as unwind information can and will
change mid-block. We don't actually make use of these capabilities,
and back-end support would be tricky (generate new labels?), but it
feels like the right way to do it.
* Even though we only use it for Sp so far, we allow CmmUnwind to specify
unwind information for any register. This is pretty cheap and could
come in useful in future.
* We allow full CmmExpr expressions for specifying unwind values. The
advantage here is that we don't have to make up new syntax, and can e.g.
use the WDS macro directly. On the other hand, the back-end will now
have to simplify the expression until it can sensibly be converted
into DWARF byte code - a process which might fail, yielding NCG panics.
On the other hand, when you're writing Cmm by hand you really ought to
know what you're doing.
(From Phabricator D169)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch solves the scoping problem of CmmTick nodes: If we just put
CmmTicks into blocks we have no idea what exactly they are meant to
cover. Here we introduce tick scopes, which allow us to create
sub-scopes and merged scopes easily.
Notes:
* Given that the code often passes Cmm around "head-less", we have to
make sure that its intended scope does not get lost. To keep the amount
of passing-around to a minimum we define a CmmAGraphScoped type synonym
here that just bundles the scope with a portion of Cmm to be assembled
later.
* We introduce new scopes at somewhat random places, aligning with
getCode calls. This works surprisingly well, but we might have to
add new scopes into the mix later on if we find things too be too
coarse-grained.
(From Phabricator D169)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds CmmTick nodes to Cmm code. This is relatively
straight-forward, but also not very useful, as many blocks will simply
end up with no annotations whatosever.
Notes:
* We use this design over, say, putting ticks into the entry node of all
blocks, as it seems to work better alongside existing optimisations.
Now granted, the reason for this is that currently GHC's main Cmm
optimisations seem to mainly reorganize and merge code, so this might
change in the future.
* We have the Cmm parser generate a few source notes as well. This is
relatively easy to do - worst part is that it complicates the CmmParse
implementation a bit.
(From Phabricator D169)
|
|
|
|
|
|
|
| |
We weren't properly tracking the number of stack arguments in the
continuation of a foreign call. It happened to work when the
continuation was not a join point, but when it was a join point we
were using the wrong amount of stack fixup.
|
| |
|
| |
|
|
|
|
|
|
| |
This gives the register allocator access to R1.., F1.., D1.. etc. for
the new code generator, and is a cheap way to eliminate all the extra
"x = R1" assignments that we get from copyIn.
|
| |
|
|
|
|
|
|
|
|
| |
We need to compare middle nodes and expressions modulo the BlockId
mapping too, because there are references to BlockIds in CmmStackSlot
and CmmBlock. This lets us catch more common blocks - in particular
we can share the heap-check fail code between multiple case
alternatives, which is most cool.
|
|
|
|
|
| |
When we had more than two identical blocks, we weren't eliminating all
the duplicates properly.
|
| |
|
| |
|
| |
|
| |
|
|
This changes the new code generator to make use of the Hoopl package
for dataflow analysis. Hoopl is a new boot package, and is maintained
in a separate upstream git repository (as usual, GHC has its own
lagging darcs mirror in http://darcs.haskell.org/packages/hoopl).
During this merge I squashed recent history into one patch. I tried
to rebase, but the history had some internal conflicts of its own
which made rebase extremely confusing, so I gave up. The history I
squashed was:
- Update new codegen to work with latest Hoopl
- Add some notes on new code gen to cmm-notes
- Enable Hoopl lag package.
- Add SPJ note to cmm-notes
- Improve GC calls on new code generator.
Work in this branch was done by:
- Milan Straka <fox@ucw.cz>
- John Dias <dias@cs.tufts.edu>
- David Terei <davidterei@gmail.com>
Edward Z. Yang <ezyang@mit.edu> merged in further changes from GHC HEAD
and fixed a few bugs.
|