| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
| |
This is the right thing to do anyway, and fixes Trac #10528
|
|
|
|
|
|
|
|
| |
This splits filterAlts into two:
- filterAlts
- refineDefaultAlt
No change in functionality
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This was a nasty, long-standing bug exposed in Trac #10538.
Symptoms were that we had an empty case
case (x :: Either a) of {}
Core Lint correctly picked this bogus code up.
Here is what happened
* In SimplUtils.prepareAlts, we call
filterAlts
then
combineIdenticalAlts
* We had case x of { Left _ -> e1; Right _ -> e1 }
* filterAlts did nothing, but correctly retuned imposs_deflt_cons
saying that 'x' cannot be {Left, Right} in the DEFAULT branch,
if any (there isn't one.)
* combineIdentialAlts correctly combines the identical alts, to give
case x of { DEFAULT -> e1 }
* BUT combineIdenticalAlts did no adjust imposs_deft_cons
* Result: when compiling e1 we did so in the belief that 'x'
could not be {Left,Right}. Disaster.
Easily fixed.
(It is hard to trigger; I can't construct a simple test case.)
|
|
|
|
| |
Just refactoring
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This long-standing, terrible, adn somewhat subtle bug was exposed
by Trac #10370, thanks to Reid Barton's brilliant test case (comment:3).
The effect is large on the Trac #10370 test.
Here is what the profile report says:
Before:
total time = 24.35 secs (24353 ticks @ 1000 us, 1 processor)
total alloc = 11,864,360,816 bytes (excludes profiling overheads)
After:
total time = 21.16 secs (21160 ticks @ 1000 us, 1 processor)
total alloc = 7,947,141,136 bytes (excludes profiling overheads)
The /combined/ effect of the tidyOccName fix, plus this one, is dramtic
for Trac #10370. Here is what +RTS -s says:
Before:
15,490,210,952 bytes allocated in the heap
1,783,919,456 bytes maximum residency (20 sample(s))
MUT time 30.117s ( 31.383s elapsed)
GC time 90.103s ( 90.107s elapsed)
Total time 120.843s (122.065s elapsed)
After:
7,928,671,936 bytes allocated in the heap
52,914,832 bytes maximum residency (25 sample(s))
MUT time 13.912s ( 15.110s elapsed)
GC time 6.809s ( 6.808s elapsed)
Total time 20.789s ( 21.954s elapsed)
- Heap allocation halved
- Residency cut by a factor of more than 30.
- ELapsed time cut by a factor of 6
Not bad!
The details
~~~~~~~~~~~
The culprit was SimplEnv.mkCoreSubst, which used mapVarEnv to do some
impedence-matching from the substitituion used by the simplifier to
the one used by CoreSubst. But the impedence-mactching was recursive!
mk_subst tv_env cv_env id_env
= CoreSubst.mkSubst in_scope tv_env cv_env (mapVarEnv fiddle id_env)
fiddle (DoneEx e) = e
fiddle (DoneId v) = Var v
fiddle (ContEx tv cv id e) = CoreSubst.substExpr (mk_subst tv cv id) e
Inside fiddle, in the ContEx case, we may do another whole level of
fiddle. And so on. Moreover, UniqFM (which is built on Data.IntMap) is
strict, so the fiddling is done eagerly. I didn't wok through all the
details but the result is a gargatuan blow-up of entirely unnecessary work.
Laziness would make this go away, I think, but I don't want to mess
with IntMap. And in any case, the impedence matching is a royal pain.
In the end I simply ceased trying to use CoreSubst.substExpr in the
simplifier, and instead just use simplExpr. That does mean bit of
duplication; e.g. new code for simplRules. But it's not a big deal
and it's far more direct and easy to reason about.
A bit of knock-on refactoring:
* Data type ArgSummary moves to CoreUnfold.
* interestingArg moves from CoreUnfold to SimplUtils, and gets a
SimplEnv argument which can be used when we encounter a variable.
* simplLamBndrs, addBndrRules move from SimplEnv to Simplify
(because they now calls simplUnfolding, simplRules resp)
* SimplUtils.substExpr, substUnfolding, mkCoreSubst die completely
* In Simplify some several functions that were previously pure
substitution-based functions are now monadic:
- addBndrRules, simplRule
- addCoerce, add_coerce in simplCast
* In case 2c of Simplify.rebuildCase, there was a pretty disgusting
expression-substitution taking place for 'rhs'; and we really don't
want to make that monadic becuase 'rhs' can be big.
Solution: reduce the arity of the rules for seq.
See Note [User-defined RULES for seq] in MkId.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: It looks like during .lhs -> .hs switch the comments were not updated. So doing exactly that.
Reviewers: austin, jstolarek, hvr, goldfire
Reviewed By: austin, jstolarek
Subscribers: thomie, goldfire
Differential Revision: https://phabricator.haskell.org/D621
GHC Trac Issues: #9986
|
|
|
|
|
|
|
|
| |
As with stripTicksTop, this is because we often need the stripped
expression but not the ticks (at least not right away). This makes a big
difference for CSE, see #9961.
Signed-off-by: Austin Seipp <austin@well-typed.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch introduces "SourceNote" tickishs that link Core to the
source code that generated it. The idea is to retain these source code
links throughout code transformations so we can eventually relate
object code all the way back to the original source (which we can,
say, encode as DWARF information to allow debugging). We generate
these SourceNotes like other tickshs in the desugaring phase. The
activating command line flag is "-g", consistent with the flag other
compilers use to decide DWARF generation.
Keeping ticks from getting into the way of Core transformations is
tricky, but doable. The changes in this patch produce identical Core
in all cases I tested -- which at this point is GHC, all libraries and
nofib. Also note that this pass creates *lots* of tick nodes, which we
reduce somewhat by removing duplicated and overlapping source
ticks. This will still cause significant Tick "clumps" - a possible
future optimization could be to make Tick carry a list of Tickishs
instead of one at a time.
(From Phabricator D169)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The issue was that contInputType simply gave the wrong answer
for type applications.
There was no way to fix contInputType; it just didn't have enough
information. So I did this:
* Split the ApplyTo constructor of SimplUtils.SimplCont into
ApplyToVal
ApplyToTy
I used record syntax for them; we should do this for some
of the other constructors too.
* The latter carries a sc_hole_ty, which is the type of the
continuation's "hole"
* Maintaining this type meant that I had do to something
similar for SimplUtils.ArgSpec.
* I renamed contInputType to contHoleType for consistency.
* I did a bit of refactoring around the call to tryRules
in Simplify, which was jolly confusing before.
The resulting code is quite nice now. And it has the additional
merit that it works.
The tests are simply tc124 and T7891 with -O enabled.
|
|
Signed-off-by: Austin Seipp <austin@well-typed.com>
|