| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
| |
It's not obvious why the simplifier generates code that correctly satisfies
the let/app invariant. This patch does some minor refactoring, but the main
point is to document pre-conditions to key functions, namely that the rhs
passed in satisfies the let/app invariant.
There shouldn't be any change in behaviour.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In some cases, the layout of the LANGUAGE/OPTIONS_GHC lines has been
reorganized, while following the convention, to
- place `{-# LANGUAGE #-}` pragmas at the top of the source file, before
any `{-# OPTIONS_GHC #-}`-lines.
- Moreover, if the list of language extensions fits into a single
`{-# LANGUAGE ... #-}`-line (shorter than 80 characters), keep it on one
line. Otherwise split into `{-# LANGUAGE ... #-}`-lines for each
individual language extension. In both cases, try to keep the
enumeration alphabetically ordered.
(The latter layout is preferable as it's more diff-friendly)
While at it, this also replaces obsolete `{-# OPTIONS ... #-}` pragma
occurrences by `{-# OPTIONS_GHC ... #-}` pragmas.
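A minimal sketch of the layout convention described above. The extensions, the warning flag, and the function names are chosen purely for illustration and are not taken from any particular GHC module.

```haskell
{-# LANGUAGE BangPatterns #-}
{-# LANGUAGE LambdaCase #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# OPTIONS_GHC -Wall #-}
-- LANGUAGE pragmas come first, one per line and in alphabetical order;
-- the OPTIONS_GHC line follows them.

-- | Strict sum; BangPatterns keeps the accumulator evaluated.
sumStrict :: [Int] -> Int
sumStrict = go 0
  where
    go !acc []       = acc
    go !acc (x : xs) = go (acc + x) xs

-- | Uses LambdaCase to match on the argument directly.
describe :: Maybe Int -> String
describe = \case
  Nothing -> "nothing"
  Just n  -> "just " ++ show n

-- | Uses ScopedTypeVariables so the local signature can mention 'a'.
pairUp :: forall a. [a] -> [(a, a)]
pairUp xs = zip xs ys
  where
    ys :: [a]
    ys = drop 1 xs
```

With one pragma per line, adding or removing an extension touches exactly one line of the diff.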
|
|
|
|
| |
I'd still prefer if a native English speaker would check them.
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
a right-hand side:
Note [Float when cheap or expandable]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
We want to float a let from a let if the residual RHS is
a) cheap, such as (\x. blah)
b) expandable, such as (f b) if f is CONLIKE
But there are
- cheap things that are not expandable (eg \x. expensive)
- expandable things that are not cheap (eg (f b) where b is CONLIKE)
so we must take the 'or' of the two.
|
| |
|
|
|
|
|
| |
By using Haskell's debugIsOn rather than CPP's "#ifdef DEBUG", we
don't need to kludge things to keep the warning checker happy etc.
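A small sketch of the idea, assuming a stand-in `debugIsOn` constant (in GHC the value comes from the build configuration; here it is hard-wired). The helper names `isSorted` and `mergeSorted` are hypothetical. Because the check is ordinary Haskell, `isSorted` is always referenced and type-checked, so it cannot trigger an unused-binding warning in non-debug builds the way code hidden behind `#ifdef DEBUG` can.

```haskell
-- Stand-in for GHC's debugIsOn; assumed to be a plain Bool constant.
debugIsOn :: Bool
debugIsOn = True

-- | Invariant used only by the debug check below.
isSorted :: Ord a => [a] -> Bool
isSorted xs = and (zipWith (<=) xs (drop 1 xs))

-- | Merge two sorted lists, checking the precondition in debug builds.
mergeSorted :: Ord a => [a] -> [a] -> [a]
mergeSorted xs ys
  | debugIsOn && not (isSorted xs && isSorted ys)
      = error "mergeSorted: unsorted input"
  | otherwise = go xs ys
  where
    go [] bs = bs
    go as [] = as
    go (a : as) (b : bs)
      | a <= b    = a : go as (b : bs)
      | otherwise = b : go (a : as) bs
```

When `debugIsOn` is `False`, the optimiser can drop the first guard entirely, so the check costs nothing in a release build.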
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Two changes here
* The main change here is to enhance the FloatIn pass so that it can
float case-bindings inwards. In particular the case bindings for
array indexing.
* Also change the code in Simplify, to allow a case on array
indexing (ie can_fail is true) to be discarded altogether if its
results are unused.
Lots of new comments in PrimOp about can_fail and has_side_effects
Some refactoring to share the FloatBind data structure between
FloatIn and FloatOut
|
|
|
|
| |
Fixes some core-lint errors when compiling with profiling
|
|
|
|
|
| |
We only use it for "compiler" sources, i.e. not for libraries.
Many modules have a -fno-warn-tabs kludge for now.
|
| |
User visible changes
====================
Profiling
---------
Flags renamed (the old ones are still accepted for now):
  OLD          NEW
  ---------    ---------------
  -auto-all    -fprof-auto
  -auto        -fprof-exported
  -caf-all     -fprof-cafs
New flags:
  -fprof-auto              Annotates all bindings (not just top-level
                           ones) with SCCs
  -fprof-top               Annotates just top-level bindings with SCCs
  -fprof-exported          Annotates just exported bindings with SCCs
  -fprof-no-count-entries  Do not maintain entry counts when profiling
                           (can make profiled code go faster; useful with
                           heap profiling where entry counts are not used)
Cost-centre stacks have a new semantics, which should in most cases
result in more useful and intuitive profiles. If you find this not to
be the case, please let me know. This is the area where I have been
experimenting most, and the current solution is probably not the
final version, however it does address all the outstanding bugs and
seems to be better than GHC 7.2.
Stack traces
------------
+RTS -xc now gives more information. If the exception originates from
a CAF (as is common, because GHC tends to lift exceptions out to the
top-level), then the RTS walks up the stack and reports the stack in
the enclosing update frame(s).
Result: +RTS -xc is much more useful now - but you still have to
compile for profiling to get it. I've played around a little with
adding 'head []' to GHC itself, and +RTS -xc does pinpoint the problem
quite accurately.
I plan to add more facilities for stack tracing (e.g. in GHCi) in the
future.
Coverage (HPC)
--------------
* derived instances are now coloured yellow if they weren't used
* likewise record field names
* entry counts are more accurate (hpc --fun-entry-count)
* tab width is now correct (markup was previously off in source with
tabs)
Internal changes
================
In Core, the Note constructor has been replaced by
Tick (Tickish b) (Expr b)
which is used to represent all the kinds of source annotation we
support: profiling SCCs, HPC ticks, and GHCi breakpoints.
Depending on the properties of the Tickish, different transformations
apply to Tick. See CoreUtils.mkTick for details.
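To make the shape of the new constructor concrete, here is a toy model. It is a sketch only: the `Tickish` constructors and their fields are simplified assumptions, the expression type has just four forms, and `stripTicks` is a hypothetical helper rather than anything in CoreUtils.

```haskell
-- Toy annotation type; the fields are simplified assumptions.
data Tickish b
  = ProfNote String     -- profiling SCC, identified by a cost-centre name
  | HpcTick Int         -- HPC tick, identified by a tick-box number
  | Breakpoint Int [b]  -- GHCi breakpoint with the variables it captures
  deriving (Eq, Show)

-- Toy expression type with a single annotation constructor.
data Expr b
  = Var b
  | App (Expr b) (Expr b)
  | Lam b (Expr b)
  | Tick (Tickish b) (Expr b)
  deriving (Eq, Show)

-- | Remove every annotation, leaving the underlying expression.
stripTicks :: Expr b -> Expr b
stripTicks expr = case expr of
  Var v    -> Var v
  App f a  -> App (stripTicks f) (stripTicks a)
  Lam x e  -> Lam x (stripTicks e)
  Tick _ e -> stripTicks e
```

Having one constructor for all annotation kinds means a traversal such as `stripTicks` needs a single extra case, whatever the annotation is.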
Tickets
=======
This commit closes the following tickets, test cases to follow:
- Close #2552: not a bug, but the behaviour is now more intuitive
(test is T2552)
- Close #680 (test is T680)
- Close #1531 (test is result001)
- Close #949 (test is T949)
- Close #2466: test case has bitrotted (doesn't compile against current
version of vector-space package)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A "lifting substitution" takes a *type* to a *coercion*, using a
substitution that takes a *type variable* to a *coercion*. We were
using a CvSubst for this purpose, which was an awkward exception: in
every other use of CvSubst, type variables map only to types.
Turned out that Coercion.liftCoSubst is quite a small function, so I
rewrote it with a special substitution type Coercion.LiftCoSubst, just
for that purpose. In doing so I found that the function itself was
bizarrely over-complicated ... a direct result of mis-using CvSubst.
So this patch makes it all simpler, faster, and easier to understand.
No bugs fixed though!
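The idea of a lifting substitution can be sketched on a toy type language. Everything below is an illustrative assumption: the `Type` and `Coercion` definitions are drastically simplified, the substitution is an association list, and this `liftCoSubst` only shares its name with the real function in Coercion.

```haskell
-- Toy types: variables, constructor applications, and functions.
data Type
  = TyVar String
  | TyConApp String [Type]
  | FunTy Type Type
  deriving (Eq, Show)

-- Toy coercions mirroring the structure of types.
data Coercion
  = Refl Type                    -- reflexivity: ty ~ ty
  | CoVar String                 -- a named coercion variable
  | TyConAppCo String [Coercion]
  | FunCo Coercion Coercion
  deriving (Eq, Show)

-- A lifting substitution maps *type variables* to *coercions*.
type LiftCoSubst = [(String, Coercion)]

-- | Lift a type to a coercion. Variables in the substitution are replaced
-- by their coercions; all other variables lift to reflexivity.
liftCoSubst :: LiftCoSubst -> Type -> Coercion
liftCoSubst env ty = case ty of
  TyVar v        -> maybe (Refl ty) id (lookup v env)
  TyConApp tc ts -> TyConAppCo tc (map (liftCoSubst env) ts)
  FunTy a r      -> FunCo (liftCoSubst env a) (liftCoSubst env r)
```

Because the range of the substitution is always a coercion, no case ever has to ask whether a variable maps to a type or a coercion, which is the awkwardness the dedicated substitution type avoids.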
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Fix bugs in the packing and unpacking of data
constructors with equality predicates in their types
* Remove PredCo altogether; instead, coercions between predicated
types (like (Eq a, [a]~b) => blah) are treated as if they
were precisely their underlying representation type
Eq a -> ((~) [a] b) -> blah
in this case
* Similarly, Type.coreView no longer treats equality
predicates specially.
* Implement the cast-of-coercion optimisation in
Simplify.simplCoercionF
Numerous other small bug-fixes and refactorings.
Annoyingly, OptCoercion had Windows line endings, and this
patch switches to Unix, so it looks as if every line has changed.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
See the paper "Practical aspects of evidence based compilation in System FC"
* Coercion becomes a data type, distinct from Type
* Coercions become value-level things, rather than type-level things,
(although the value is zero bits wide, like the State token)
A consequence is that a coercion abstraction increases the arity by 1
(just like a dictionary abstraction)
* There is a new constructor in CoreExpr, namely Coercion, to inject
coercions into terms
|
|
|
|
| |
This patch removes the Lint test, and comments why
|
|
|
|
| |
See Note [WildCard binders] in SimplEnv. Spotted by Roman.
|
|
|
|
|
|
|
|
|
|
|
| |
Principally, the SimplifierMode now carries several (currently
four) flags in *all* phases, not just the "Gentle" phase.
This makes things simpler and more uniform.
As usual I did more refactoring than I had intended.
This stuff should go into 7.0.2 in due course, once
we've checked it solves the DPH performance problems.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
1. Do eta-expansion at let-bindings, not lambdas.
I have wanted to do this for a long time.
See Note [Eta-expanding at let bindings] in SimplUtils
2. Simplify the rather subtle way in which InlineRules (the
template captured by an INLINE pragma) was simplified.
Now, these templates are always simplified in "gentle"
mode only, and only INLINE things inline inside them.
See Note [Gentle mode], Note [Inlining in gentle mode]
and Note [RULEs enabled in SimplGently] in SimplUtils
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
"Short-cut" substitution means "do nothing if the substitution
is empty". We *never* want to do that in the simplifier because
even though the substitution is empty, the in-scope set has
useful information:
* We get up-to-date unfoldings; and that in turn may
reduce the number of iterations of the simplifier
* We avoid space leaks, because failing to substitute may
hang on to old Ids from a previous iteration
(This is what was causing the late inlining of foo in
Trac #4428.)
|
|
|
|
|
|
|
|
|
| |
This major patch implements the new OutsideIn constraint solving
algorithm in the typechecker, following our JFP paper "Modular type
inference with local assumptions".
Done with major help from Dimitrios Vytiniotis and Brent Yorgey.
|
|
|
|
|
|
|
|
| |
* I was debugging so I added some call-site info
(that touches a lot of code)
* I used substExpr a bit less in Simplify, hoping to
make the simplifier a little faster and cleaner
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch moves a lot of code around, but has zero functionality change.
The idea is that the types
CoreToDo
SimplifierSwitch
SimplifierMode
FloatOutSwitches
and
the main core-to-core pipeline construction
belong in simplCore/, and *not* in DynFlags.
|
|
|
|
|
|
|
|
|
|
|
| |
* Fix a bug that meant that
(right (inst (forall tv.co) ty))
wasn't getting optimised. This showed up in the
compiled code for ByteCodeItbls
* Add a substitution to optCoercion, so that it simultaneously
substitutes and optimises. Both call sites wanted this, and
optCoercion itself can use it, so it seems a win all round.
|
|
|
|
|
|
| |
See Note [RULEs apply to simplified arguments] in Simplify.lhs
A knock-on effect is that rules apply *after* we try inlining
(which uses un-simplified arguments), but that seems fine.
|
|
|
|
|
| |
The main change is that SimplUtils.updModeForInlineRules
doesn't overwrite the current setting; it just augments it.
|
|
|
|
|
|
|
| |
This change helps to break the mutual recursion generated by
an instance declaration.
See Note [Gentle mode] in SimplUtils
|
| |
This patch has been a long time in gestation and has, as a
result, accumulated some extra bits and bobs that are only
loosely related. I separated the bits that are easy to split
off, but the rest comes as one big patch, I'm afraid.
Note that:
* It comes together with a patch to the 'base' library
* Interface file formats change slightly, so you need to
recompile all libraries
The patch is mainly a giant tidy-up, driven in part by the
particular stresses of the Data Parallel Haskell project. I don't
expect a big performance win for random programs. Still, here are the
nofib results, relative to the state of affairs without the patch
Program              Size   Allocs  Runtime  Elapsed
--------------------------------------------------------------------------------
Min                -12.7%   -14.5%   -17.5%   -17.8%
Max                 +4.7%   +10.9%    +9.1%    +8.4%
Geometric Mean      +0.9%    -0.1%    -5.6%    -7.3%
The +10.9% allocation outlier is rewrite, which happens to have a
very delicate optimisation opportunity involving an interaction
of CSE and inlining (see nofib/Simon-nofib-notes). The fact that
the 'before' case found the optimisation is somewhat accidental.
Runtimes seem to go down, but I never know whether to really trust
this number. Binary sizes wobble a bit, but nothing drastic.
The Main Ideas are as follows.
InlineRules
~~~~~~~~~~~
When you say
{-# INLINE f #-}
f x = <rhs>
you intend that calls (f e) are replaced by <rhs>[e/x]. So we
should capture (\x.<rhs>) in the Unfolding of 'f', and never meddle
with it. Meanwhile, we can optimise <rhs> to our heart's content,
leaving the original unfolding intact in Unfolding of 'f'.
So the representation of an Unfolding has changed quite a bit
(see CoreSyn). An INLINE pragma gives rise to an InlineRule
unfolding.
Moreover, it's only used when 'f' is applied to the
specified number of arguments; that is, the number of arguments on
the LHS of the '=' sign in the original source definition.
For example, (.) is now defined in the libraries like this
{-# INLINE (.) #-}
(.) f g = \x -> f (g x)
so that it'll inline when applied to two arguments. If 'x' appeared
on the left, thus
(.) f g x = f (g x)
it'd only inline when applied to three arguments. This slightly-experimental
change was requested by Roman, but it seems to make sense.
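The two ways of writing the definition can be put side by side. This is only a sketch: `compose2`, `compose3` and `incrementAll` are hypothetical names, and whether a given call is actually inlined depends on the optimisation level and the GHC version.

```haskell
-- Two arguments to the left of '=': the INLINE unfolding may be used as
-- soon as the function is applied to two arguments.
{-# INLINE compose2 #-}
compose2 :: (b -> c) -> (a -> b) -> a -> c
compose2 f g = \x -> f (g x)

-- Three arguments to the left of '=': the unfolding is only used when all
-- three arguments are present at the call site.
{-# INLINE compose3 #-}
compose3 :: (b -> c) -> (a -> b) -> a -> c
compose3 f g x = f (g x)

-- Here compose2 is applied to exactly two arguments, which is enough for
-- its unfolding to be eligible; the same call with compose3 would not be.
incrementAll :: [Int] -> [Int]
incrementAll = map (compose2 (+ 1) (* 2))
```

Both functions compute the same result; the choice only affects where the inliner is allowed to act.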
Other associated changes
* Moving the deck chairs in DsBinds, which processes the INLINE pragmas
* In the old system an INLINE pragma made the RHS look like
(Note InlineMe <rhs>)
The Note switched off optimisation in <rhs>. But it was quite
fragile in corner cases. The new system is more robust, I believe.
In any case, the InlineMe note has disappeared
* The workerInfo of an Id has also been combined into its Unfolding,
so it's no longer a separate field of the IdInfo.
* Many changes in CoreUnfold, esp in callSiteInline, which is the critical
function that decides which function to inline. Lots of comments added!
* exprIsConApp_maybe has moved to CoreUnfold, since it's so strongly
associated with "does this expression unfold to a constructor application".
It can now do some limited beta reduction too, which Roman found
was important.
Instance declarations
~~~~~~~~~~~~~~~~~~~~~
It's always been tricky to get the dfuns generated from instance
declarations to work out well. This is particularly important in
the Data Parallel Haskell project, and I'm now on my fourth attempt,
more or less.
There is a detailed description in TcInstDcls, particularly in
Note [How instance declarations are translated]. Roughly speaking
we now generate a top-level helper function for every method definition
in an instance declaration, so that the dfun takes a particularly
stylised form:
dfun a d1 d2 = MkD (op1 a d1 d2) (op2 a d1 d2) ...etc...
In fact, it's *so* stylised that we never need to unfold a dfun.
Instead ClassOps have a special rewrite rule that allows us to
short-cut dictionary selection. Suppose dfun :: Ord a -> Ord [a]
d :: Ord a
Then
compare (dfun a d) --> compare_list a d
in one rewrite, without first inlining the 'compare' selector
and the body of the dfun.
To support this
a) ClassOps have a BuiltInRule (see MkId.dictSelRule)
b) DFuns have a special form of unfolding (CoreSyn.DFunUnfolding)
which is exploited in CoreUnfold.exprIsConApp_maybe
Implementing all this required a root-and-branch rework of TcInstDcls
and bits of TcClassDcl.
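The stylised dfun shape can be imitated in source Haskell with explicit dictionary passing. This is a hand-written analogy under stated assumptions, not what the compiler generates: `OrdD`, `compareD`, `compareList`, `dfunOrdList` and `ordInt` are all hypothetical names, and the dictionary has a single method.

```haskell
-- A one-method dictionary standing in for the Ord class.
newtype OrdD a = MkOrdD { compareD :: a -> a -> Ordering }

-- Top-level helper for the method of the list instance.
compareList :: OrdD a -> [a] -> [a] -> Ordering
compareList _ []       []       = EQ
compareList _ []       _        = LT
compareList _ _        []       = GT
compareList d (x : xs) (y : ys) = case compareD d x y of
  EQ    -> compareList d xs ys
  other -> other

-- The dfun in stylised form: the dictionary constructor applied to the
-- per-method helpers, each taking the same dictionary arguments.
dfunOrdList :: OrdD a -> OrdD [a]
dfunOrdList d = MkOrdD (compareList d)

-- Base dictionary for Int, built from the Prelude's compare.
ordInt :: OrdD Int
ordInt = MkOrdD compare
```

In this model, selecting the method from the dfun result, `compareD (dfunOrdList ordInt)`, is by construction the same function as `compareList ordInt`, which is the shortcut the rewrite rule takes in a single step.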
Default methods
~~~~~~~~~~~~~~~
If you give an INLINE pragma to a default method, it should be just
as if you'd written out that code in each instance declaration, including
the INLINE pragma. I think that it now *is* so. As a result, library
code can be simpler; less duplication.
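A short illustration of an INLINE pragma on a default method. The class `Shape` and the type `Square` are invented for this example; the sketch shows the source-level form only and makes no claim about the code GHC produces for it.

```haskell
class Shape a where
  area :: a -> Double

  -- Default method carrying an INLINE pragma. An instance that does not
  -- override doubleArea should behave as if this definition, pragma
  -- included, had been written in the instance itself.
  doubleArea :: a -> Double
  doubleArea x = 2 * area x
  {-# INLINE doubleArea #-}

newtype Square = Square Double

-- The instance defines only 'area' and inherits the default doubleArea.
instance Shape Square where
  area (Square s) = s * s
```

The instance stays a single line, which is the reduction in duplication the message refers to.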
The CONLIKE pragma
~~~~~~~~~~~~~~~~~~
In the DPH project, Roman found cases where he had
p n k = let x = replicate n k
in ...(f x)...(g x)....
{-# RULE f (replicate x) = f_rep x #-}
Normally the RULE would not fire, because doing so involves
(in effect) duplicating the redex (replicate n k). A new
experimental modifier to the INLINE pragma, {-# INLINE CONLIKE
replicate #-}, allows you to tell GHC to be prepared to duplicate
a call of this function if it allows a RULE to fire.
See Note [CONLIKE pragma] in BasicTypes
Join points
~~~~~~~~~~~
See Note [Case binders and join points] in Simplify
Other refactoring
~~~~~~~~~~~~~~~~~
* I moved endPass from CoreLint to CoreMonad, with associated jigglings
* Better pretty-printing of Core
* The top-level RULES (ones that are not rules for locally-defined things)
are now substituted on every simplifier iteration. I'm not sure how
we got away without doing this before. This entails a bit more plumbing
in SimplCore.
* The necessary stuff to serialise and deserialise the new
info across interface files.
* Something about bottoming floats in SetLevels
Note [Bottoming floats]
* substUnfolding has moved from SimplEnv to CoreSubst, where it belongs
--------------------------------------------------------------------------------
Program              Size   Allocs  Runtime  Elapsed
--------------------------------------------------------------------------------
anna                +2.4%    -0.5%     0.16     0.17
ansi                +2.6%    -0.1%     0.00     0.00
atom                -3.8%    -0.0%    -1.0%    -2.5%
awards              +3.0%    +0.7%     0.00     0.00
banner              +3.3%    -0.0%     0.00     0.00
bernouilli          +2.7%    +0.0%    -4.6%    -6.9%
boyer               +2.6%    +0.0%     0.06     0.07
boyer2              +4.4%    +0.2%     0.01     0.01
bspt                +3.2%    +9.6%     0.02     0.02
cacheprof           +1.4%    -1.0%   -12.2%   -13.6%
calendar            +2.7%    -1.7%     0.00     0.00
cichelli            +3.7%    -0.0%     0.13     0.14
circsim             +3.3%    +0.0%    -2.3%    -9.9%
clausify            +2.7%    +0.0%     0.05     0.06
comp_lab_zift       +2.6%    -0.3%    -7.2%    -7.9%
compress            +3.3%    +0.0%    -8.5%    -9.6%
compress2           +3.6%    +0.0%   -15.1%   -17.8%
constraints         +2.7%    -0.6%   -10.0%   -10.7%
cryptarithm1        +4.5%    +0.0%    -4.7%    -5.7%
cryptarithm2        +4.3%   -14.5%     0.02     0.02
cse                 +4.4%    -0.0%     0.00     0.00
eliza               +2.8%    -0.1%     0.00     0.00
event               +2.6%    -0.0%    -4.9%    -4.4%
exp3_8              +2.8%    +0.0%    -4.5%    -9.5%
expert              +2.7%    +0.3%     0.00     0.00
fem                 -2.0%    +0.6%     0.04     0.04
fft                 -6.0%    +1.8%     0.05     0.06
fft2                -4.8%    +2.7%     0.13     0.14
fibheaps            +2.6%    -0.6%     0.05     0.05
fish                +4.1%    +0.0%     0.03     0.04
fluid               -2.1%    -0.2%     0.01     0.01
fulsom              -4.8%    +9.2%    +9.1%    +8.4%
gamteb              -7.1%    -1.3%     0.10     0.11
gcd                 +2.7%    +0.0%     0.05     0.05
gen_regexps         +3.9%    -0.0%     0.00     0.00
genfft              +2.7%    -0.1%     0.05     0.06
gg                  -2.7%    -0.1%     0.02     0.02
grep                +3.2%    -0.0%     0.00     0.00
hidden              -0.5%    +0.0%   -11.9%   -13.3%
hpg                 -3.0%    -1.8%    +0.0%    -2.4%
ida                 +2.6%    -1.2%     0.17    -9.0%
infer               +1.7%    -0.8%     0.08     0.09
integer             +2.5%    -0.0%    -2.6%    -2.2%
integrate           -5.0%    +0.0%    -1.3%    -2.9%
knights             +4.3%    -1.5%     0.01     0.01
lcss                +2.5%    -0.1%    -7.5%    -9.4%
life                +4.2%    +0.0%    -3.1%    -3.3%
lift                +2.4%    -3.2%     0.00     0.00
listcompr           +4.0%    -1.6%     0.16     0.17
listcopy            +4.0%    -1.4%     0.17     0.18
maillist            +4.1%    +0.1%     0.09     0.14
mandel              +2.9%    +0.0%     0.11     0.12
mandel2             +4.7%    +0.0%     0.01     0.01
minimax             +3.8%    -0.0%     0.00     0.00
mkhprog             +3.2%    -4.2%     0.00     0.00
multiplier          +2.5%    -0.4%    +0.7%    -1.3%
nucleic2            -9.3%    +0.0%     0.10     0.10
para                +2.9%    +0.1%    -0.7%    -1.2%
paraffins          -10.4%    +0.0%     0.20    -1.9%
parser              +3.1%    -0.0%     0.05     0.05
parstof             +1.9%    -0.0%     0.00     0.01
pic                 -2.8%    -0.8%     0.01     0.02
power               +2.1%    +0.1%    -8.5%    -9.0%
pretty             -12.7%    +0.1%     0.00     0.00
primes              +2.8%    +0.0%     0.11     0.11
primetest           +2.5%    -0.0%    -2.1%    -3.1%
prolog              +3.2%    -7.2%     0.00     0.00
puzzle              +4.1%    +0.0%    -3.5%    -8.0%
queens              +2.8%    +0.0%     0.03     0.03
reptile             +2.2%    -2.2%     0.02     0.02
rewrite             +3.1%   +10.9%     0.03     0.03
rfib                -5.2%    +0.2%     0.03     0.03
rsa                 +2.6%    +0.0%     0.05     0.06
scc                 +4.6%    +0.4%     0.00     0.00
sched               +2.7%    +0.1%     0.03     0.03
scs                 -2.6%    -0.9%    -9.6%   -11.6%
simple              -4.0%    +0.4%   -14.6%   -14.9%
solid               -5.6%    -0.6%    -9.3%   -14.3%
sorting             +3.8%    +0.0%     0.00     0.00
sphere              -3.6%    +8.5%     0.15     0.16
symalg              -1.3%    +0.2%     0.03     0.03
tak                 +2.7%    +0.0%     0.02     0.02
transform           +2.0%    -2.9%    -8.0%    -8.8%
treejoin            +3.1%    +0.0%   -17.5%   -17.8%
typecheck           +2.9%    -0.3%    -4.6%    -6.6%
veritas             +3.9%    -0.3%     0.00     0.00
wang                -6.2%    +0.0%     0.18    -9.8%
wave4main          -10.3%    +2.6%    -2.1%    -2.3%
wheel-sieve1        +2.7%    -0.0%    +0.3%    -0.6%
wheel-sieve2        +2.7%    +0.0%    -3.7%    -7.5%
x2n1                -4.1%    +0.1%     0.03     0.04
--------------------------------------------------------------------------------
Min                -12.7%   -14.5%   -17.5%   -17.8%
Max                 +4.7%   +10.9%    +9.1%    +8.4%
Geometric Mean      +0.9%    -0.1%    -5.6%    -7.3%
|
| |
This patch adds an optional CONLIKE modifier to INLINE/NOINLINE pragmas,
{-# NOINLINE CONLIKE [1] f #-}
The effect is to allow applications of 'f' to be expanded in a potential
rule match. Example
{-# RULE "r/f" forall v. r (f v) = f (v+1) #-}
Consider the term
let x = f v in ..x...x...(r x)...
Normally the (r x) would not match the rule, because GHC would be scared
about duplicating the redex (f v). However the CONLIKE modifier says to
treat 'f' like a constructor in this situation, and "look through" the
unfolding for x. So (r x) fires, yielding (f (v+1)).
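A compilable variant of the example above, offered as a sketch. The function bodies are assumptions, chosen so that both sides of the rule compute the same value (with `f v = v + 10` and `r x = x + 1`, `r (f v)` and `f (v + 1)` are both `v + 11`); the program therefore gives the same answer whether or not the rule fires. `g` is a hypothetical caller in the shape of the `let x = f v in ...` term.

```haskell
-- CONLIKE asks GHC to treat applications of 'f' like constructor
-- applications when matching rules, even though that duplicates (f v).
{-# NOINLINE CONLIKE [1] f #-}
f :: Int -> Int
f v = v + 10

-- Kept un-inlined so the rule below has a chance to match calls to it.
{-# NOINLINE r #-}
r :: Int -> Int
r x = x + 1

{-# RULES "r/f" forall v. r (f v) = f (v + 1) #-}

-- 'x' is shared between three uses; matching (r x) against the rule
-- requires looking through the binding of x to see (f v).
g :: Int -> (Int, Int, Int)
g v = let x = f v in (x, x, r x)
```

Compiling with optimisation and `-ddump-rule-firings` is one way to check whether "r/f" fired in `g`; without the CONLIKE modifier the shared `x` would normally block the match.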
The main changes are:
- Syntax
- The inlinePragInfo field of an IdInfo has a RuleMatchInfo
component, which records whether or not the Id is CONLIKE.
Of course, this needs to be serialised in interface files too.
- The occurrence analyser (OccAnal) and simplifier (Simplify) treat
CONLIKE things like constructors, by ANF-ing them
- New function CoreUtils.exprIsExpandable is like exprIsCheap, but
additionally spots applications of CONLIKE functions
- A CoreUnfolding has a field that caches exprIsExpandable
- The rule matcher consults this field. See
Note [Expanding variables] in Rules.lhs.
On the way I fixed a lurking variable bug in the way variables are
expanded. See Note [Do not expand locally-bound variables] in
Rules.lhs. I also did a bit of reformatting and refactoring in
Rules.lhs, so the module has more lines changed than are really
different.
|
| |
rolling back:
Fri Dec 5 16:54:00 GMT 2008 simonpj@microsoft.com
* Completely new treatment of INLINE pragmas (big patch)
This is a major patch, which changes the way INLINE pragmas work.
Although lots of files are touched, the net is only +21 lines of
code -- and I bet that most of those are comments!
HEADS UP: interface file format has changed, so you'll need to
recompile everything.
There is not much effect on overall performance for nofib,
probably because those programs don't make heavy use of INLINE pragmas.
Program              Size   Allocs  Runtime  Elapsed
Min                -11.3%    -6.9%    -9.2%    -8.2%
Max                 -0.1%    +4.6%    +7.5%    +8.9%
Geometric Mean      -2.2%    -0.2%    -1.0%    -0.8%
(The +4.6% on allocs is cichelli; see other patch relating to
-fpass-case-bndr-to-join-points.)
The old INLINE system
~~~~~~~~~~~~~~~~~~~~~
The old system worked like this. A function with an INLINE pragma
got a right-hand side which looked like
f = __inline_me__ (\xy. e)
The __inline_me__ part was an InlineNote, and was treated specially
in various ways. Notably, the simplifier didn't inline inside an
__inline_me__ note.
As a result, the code for f itself was pretty crappy. That matters
if you say (map f xs), because then you execute the code for f,
rather than inlining a copy at the call site.
The new story: InlineRules
~~~~~~~~~~~~~~~~~~~~~~~~~~
The new system removes the InlineMe Note altogether. Instead there
is a new constructor InlineRule in CoreSyn.Unfolding. This is a
bit like a RULE, in that it remembers the template to be inlined inside
the InlineRule. No simplification or inlining is done on an InlineRule,
just like RULEs.
An Id can have an InlineRule *or* a CoreUnfolding (since these are two
constructors from Unfolding). The simplifier treats them differently:
- An InlineRule has the substitution applied (like RULES) but
is otherwise left undisturbed.
- A CoreUnfolding is updated with the new RHS of the definition,
on each iteration of the simplifier.
An InlineRule fires regardless of size, but *only* when the function
is applied to enough arguments. The "arity" of the rule is specified
(by the programmer) as the number of args on the LHS of the "=". So
it makes a difference whether you say
{-# INLINE f #-}
f x = \y -> e or f x y = e
This is one of the big new features that InlineRule gives us, and it
is one that Roman really wanted.
In contrast, a CoreUnfolding can fire when it is applied to fewer
args than the function has lambdas, provided the result is small
enough.
Consequential stuff
~~~~~~~~~~~~~~~~~~~
* A 'wrapper' no longer has a WrapperInfo in the IdInfo. Instead,
the InlineRule has a field identifying wrappers.
* Of course, IfaceSyn and interface serialisation changes appropriately.
* Making implication constraints inline nicely was a bit fiddly. In
the end I added a var_inline field to HsBinds.VarBind, which is why
this patch affects the type checker slightly
* I made some changes to the way in which eta expansion happens in
CorePrep, mainly to ensure that *arguments* that become let-bound
are also eta-expanded. I'm still not too happy with the clarity
and robustness of the result.
* We now complain if the programmer gives an INLINE pragma for
a recursive function (previously we just ignored it). Reason for
change: we don't want an InlineRule on a LoopBreaker, because then
we'd have to check for loop-breaker-hood at occurrence sites (which
isn't currently done). Some tests need changing as a result.
This patch has been in my tree for quite a while, so there are
probably some other minor changes.
M ./compiler/basicTypes/Id.lhs -11
M ./compiler/basicTypes/IdInfo.lhs -82
M ./compiler/basicTypes/MkId.lhs -2 +2
M ./compiler/coreSyn/CoreFVs.lhs -2 +25
M ./compiler/coreSyn/CoreLint.lhs -5 +1
M ./compiler/coreSyn/CorePrep.lhs -59 +53
M ./compiler/coreSyn/CoreSubst.lhs -22 +31
M ./compiler/coreSyn/CoreSyn.lhs -66 +92
M ./compiler/coreSyn/CoreUnfold.lhs -112 +112
M ./compiler/coreSyn/CoreUtils.lhs -185 +184
M ./compiler/coreSyn/MkExternalCore.lhs -1
M ./compiler/coreSyn/PprCore.lhs -4 +40
M ./compiler/deSugar/DsBinds.lhs -70 +118
M ./compiler/deSugar/DsForeign.lhs -2 +4
M ./compiler/deSugar/DsMeta.hs -4 +3
M ./compiler/hsSyn/HsBinds.lhs -3 +3
M ./compiler/hsSyn/HsUtils.lhs -2 +7
M ./compiler/iface/BinIface.hs -11 +25
M ./compiler/iface/IfaceSyn.lhs -13 +21
M ./compiler/iface/MkIface.lhs -24 +19
M ./compiler/iface/TcIface.lhs -29 +23
M ./compiler/main/TidyPgm.lhs -55 +49
M ./compiler/parser/ParserCore.y -5 +6
M ./compiler/simplCore/CSE.lhs -2 +1
M ./compiler/simplCore/FloatIn.lhs -6 +1
M ./compiler/simplCore/FloatOut.lhs -23
M ./compiler/simplCore/OccurAnal.lhs -36 +5
M ./compiler/simplCore/SetLevels.lhs -59 +54
M ./compiler/simplCore/SimplCore.lhs -48 +52
M ./compiler/simplCore/SimplEnv.lhs -26 +22
M ./compiler/simplCore/SimplUtils.lhs -28 +4
M ./compiler/simplCore/Simplify.lhs -91 +109
M ./compiler/specialise/Specialise.lhs -15 +18
M ./compiler/stranal/WorkWrap.lhs -14 +11
M ./compiler/stranal/WwLib.lhs -2 +2
M ./compiler/typecheck/Inst.lhs -1 +3
M ./compiler/typecheck/TcBinds.lhs -17 +27
M ./compiler/typecheck/TcClassDcl.lhs -1 +2
M ./compiler/typecheck/TcExpr.lhs -4 +6
M ./compiler/typecheck/TcForeign.lhs -1 +1
M ./compiler/typecheck/TcGenDeriv.lhs -14 +13
M ./compiler/typecheck/TcHsSyn.lhs -3 +2
M ./compiler/typecheck/TcInstDcls.lhs -5 +4
M ./compiler/typecheck/TcRnDriver.lhs -2 +11
M ./compiler/typecheck/TcSimplify.lhs -10 +17
M ./compiler/vectorise/VectType.hs +7
Mon Dec 8 12:43:10 GMT 2008 simonpj@microsoft.com
* White space only
M ./compiler/simplCore/Simplify.lhs -2
Mon Dec 8 12:48:40 GMT 2008 simonpj@microsoft.com
* Move simpleOptExpr from CoreUnfold to CoreSubst
M ./compiler/coreSyn/CoreSubst.lhs -1 +87
M ./compiler/coreSyn/CoreUnfold.lhs -72 +1
Mon Dec 8 17:30:18 GMT 2008 simonpj@microsoft.com
* Use CoreSubst.simpleOptExpr in place of the ad-hoc simpleSubst (reduces code too)
M ./compiler/deSugar/DsBinds.lhs -50 +16
Tue Dec 9 17:03:02 GMT 2008 simonpj@microsoft.com
* Fix Trac #2861: bogus eta expansion
Urghlhl! I "tidied up" the treatment of the "state hack" in CoreUtils, but
missed an unexpected interaction with the way that a bottoming function
simply swallows excess arguments. There's a long
Note [State hack and bottoming functions]
to explain (which accounts for most of the new lines of code).
M ./compiler/coreSyn/CoreUtils.lhs -16 +53
Mon Dec 15 10:02:21 GMT 2008 Simon Marlow <marlowsd@gmail.com>
* Revert CorePrep part of "Completely new treatment of INLINE pragmas..."
The original patch said:
* I made some changes to the way in which eta expansion happens in
CorePrep, mainly to ensure that *arguments* that become let-bound
are also eta-expanded. I'm still not too happy with the clarity
and robustness of the result.
Unfortunately this change apparently broke some invariants that were
relied on elsewhere, and in particular led to panics when compiling
with profiling on.
Will re-investigate in the new year.
M ./compiler/coreSyn/CorePrep.lhs -53 +58
M ./configure.ac -1 +1
Mon Dec 15 12:28:51 GMT 2008 Simon Marlow <marlowsd@gmail.com>
* revert accidental change to configure.ac
M ./configure.ac -1 +1
|
| |
This is a major patch, which changes the way INLINE pragmas work.
Although lots of files are touched, the net is only +21 lines of
code -- and I bet that most of those are comments!
HEADS UP: interface file format has changed, so you'll need to
recompile everything.
There is not much effect on overall performance for nofib,
probably because those programs don't make heavy use of INLINE pragmas.
Program              Size   Allocs  Runtime  Elapsed
Min                -11.3%    -6.9%    -9.2%    -8.2%
Max                 -0.1%    +4.6%    +7.5%    +8.9%
Geometric Mean      -2.2%    -0.2%    -1.0%    -0.8%
(The +4.6% on allocs is cichelli; see other patch relating to
-fpass-case-bndr-to-join-points.)
The old INLINE system
~~~~~~~~~~~~~~~~~~~~~
The old system worked like this. A function with an INLINE pragma
got a right-hand side which looked like
f = __inline_me__ (\xy. e)
The __inline_me__ part was an InlineNote, and was treated specially
in various ways. Notably, the simplifier didn't inline inside an
__inline_me__ note.
As a result, the code for f itself was pretty crappy. That matters
if you say (map f xs), because then you execute the code for f,
rather than inlining a copy at the call site.
The new story: InlineRules
~~~~~~~~~~~~~~~~~~~~~~~~~~
The new system removes the InlineMe Note altogether. Instead there
is a new constructor InlineRule in CoreSyn.Unfolding. This is a
bit like a RULE, in that it remembers the template to be inlined inside
the InlineRule. No simplification or inlining is done on an InlineRule,
just like RULEs.
An Id can have an InlineRule *or* a CoreUnfolding (since these are two
constructors from Unfolding). The simplifier treats them differently:
- An InlineRule has the substitution applied (like RULES) but
is otherwise left undisturbed.
- A CoreUnfolding is updated with the new RHS of the definition,
on each iteration of the simplifier.
An InlineRule fires regardless of size, but *only* when the function
is applied to enough arguments. The "arity" of the rule is specified
(by the programmer) as the number of args on the LHS of the "=". So
it makes a difference whether you say
{-# INLINE f #-}
f x = \y -> e or f x y = e
This is one of the big new features that InlineRule gives us, and it
is one that Roman really wanted.
In contrast, a CoreUnfolding can fire when it is applied to fewer
args than the function has lambdas, provided the result is small
enough.
Consequential stuff
~~~~~~~~~~~~~~~~~~~
* A 'wrapper' no longer has a WrapperInfo in the IdInfo. Instead,
the InlineRule has a field identifying wrappers.
* Of course, IfaceSyn and interface serialisation changes appropriately.
* Making implication constraints inline nicely was a bit fiddly. In
the end I added a var_inline field to HsBinds.VarBind, which is why
this patch affects the type checker slightly
* I made some changes to the way in which eta expansion happens in
CorePrep, mainly to ensure that *arguments* that become let-bound
are also eta-expanded. I'm still not too happy with the clarity
and robustness fo the result.
* We now complain if the programmer gives an INLINE pragma for
a recursive function (prevsiously we just ignored it). Reason for
change: we don't want an InlineRule on a LoopBreaker, because then
we'd have to check for loop-breaker-hood at occurrence sites (which
isn't currently done). Some tests need changing as a result.
This patch has been in my tree for quite a while, so there are
probably some other minor changes.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch does a lot of tidying up of the way that dead variables are
handled in Core. Just the sort of thing to do on an aeroplane.
* The tricky "binder-swap" optimisation is moved from the Simplifier
to the Occurrence Analyser. See Note [Binder swap] in OccurAnal.
This is really a nice change. It should reduce the number of
simplifier iterations (perhaps only slightly). And it means that
we can be much less pessimistic about zapping occurrence info
on binders in a case expression.
* For example:
case x of y { (a,b) -> e }
Previously, each time around, even if y,a,b were all dead, the
Simplifier would pessimistically zap their OccInfo, so that we
could no longer see they were dead. As a result virtually no
case expression ended up with dead binders. This wasn't Bad
in itself, but it always felt wrong.
* I added a check to CoreLint to check that a dead binder really
isn't used. That showed up a couple of bugs in CSE. (Only in
this sense -- they didn't really matter.)
* I've changed the PprCore printer to print "_" for a dead variable.
(Use -dppr-debug to see it again.) This reduces clutter quite a
bit, and of course it's much more useful with the above change.
* Another benefit of the binder-swap change is that I could get rid of
the Simplifier hack (working, but hacky) in which the InScopeSet was
used to map a variable to a *different* variable. That allowed me
to remove VarEnv.modifyInScopeSet, and to simplify lookupInScopeSet
so that it doesn't look for a fixpoint. This fixes no bugs, but
is a useful cleanup.
* Roman pointed out that Id.mkWildId is jolly dangerous, because
of its fixed unique. So I've
- localised it to MkCore, where it is private (not exported)
- renamed it to 'mkWildBinder' to stress that you should only
use it at binding sites, unless you really know what you are
doing
- provided a function MkCore.mkWildCase that embodies the most
common use of mkWildId, and use that elsewhere
So things are much better
* A knock-on change is that I found a common pattern of localising
a potentially global Id, and made a function for it: Id.localiseId
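The dead-binder check mentioned above can be sketched on a toy expression language. This is not GHC's Core or the actual CoreLint code: Expr, Alt, freeVars, deadBinders and lintDead are all hypothetical names, and binders are plain strings assumed to be unique.

```haskell
-- A toy expression language (NOT GHC's Core).
data Expr
  = Var String
  | App Expr Expr
  | Lam String Expr
  | Case Expr String [Alt]      -- scrutinee, case binder, alternatives

-- Constructor name, pattern binders, right-hand side.
data Alt = Alt String [String] Expr

-- Free variables of an expression (duplicates are not removed).
freeVars :: Expr -> [String]
freeVars (Var v)         = [v]
freeVars (App f a)       = freeVars f ++ freeVars a
freeVars (Lam b e)       = filter (/= b) (freeVars e)
freeVars (Case s b alts) =
  freeVars s ++
  concat [ filter (`notElem` (b : bs)) (freeVars rhs) | Alt _ bs rhs <- alts ]

-- Binders of one alternative (case binder first, then pattern binders)
-- that do not occur in the alternative's right-hand side.
deadBinders :: String -> Alt -> [String]
deadBinders caseBndr (Alt _ patBndrs rhs) =
  [ b | b <- caseBndr : patBndrs, b `notElem` freeVars rhs ]

-- Lint-style check: of the binders *claimed* dead, return those that
-- are in fact used. An empty result means the claim is consistent.
lintDead :: [String] -> Expr -> [String]
lintDead claimedDead rhs = filter (`elem` freeVars rhs) claimedDead
```

For `case x of y { (a,b) -> f a }`, deadBinders reports y and b as dead, and lintDead flags a if it was wrongly marked dead.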
|
|
|
|
|
|
|
|
|
|
|
|
| |
I was perplexed about why an arity-related WARN was tripping. It took
me a _day_ (sigh) to find that it was because SimplEnv.substExpr was taking
a short cut when the substitution was empty, thereby not substituting for
Ids in scope, which must be done (CoreSubst Note [Extending the Subst]).
The fix is a matter of deleting the "optimisation". Same with
CoreSubst.substSpec, although I don't know if that actually caused a
problem.
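A toy model of why the short cut is wrong, assuming (as the commit message describes) that substitution also replaces each occurrence with the in-scope version of the Id. The types and functions below are hypothetical simplifications, not the real SimplEnv or CoreSubst definitions; idArity stands in for whatever IdInfo has been updated.

```haskell
-- Hypothetical toy types: an Id carries a name and some info (an arity).
data Id = Id { idName :: String, idArity :: Int }
  deriving (Eq, Show)

data Expr = Var Id | App Expr Expr
  deriving (Eq, Show)

data Subst = Subst
  { inScope :: [(String, Id)]    -- latest version of each in-scope Id
  , idEnv   :: [(String, Expr)]  -- the explicit substitution
  }

-- Correct version: even if idEnv is empty, every occurrence is replaced
-- by the in-scope version of the Id, which may carry newer info.
substExpr :: Subst -> Expr -> Expr
substExpr s (App f a) = App (substExpr s f) (substExpr s a)
substExpr s (Var v) =
  case lookup (idName v) (idEnv s) of
    Just e  -> e
    Nothing -> case lookup (idName v) (inScope s) of
      Just v' -> Var v'
      Nothing -> Var v

-- Buggy "optimisation": skipping the traversal when idEnv is empty
-- leaves stale occurrences in the result.
substExprShortcut :: Subst -> Expr -> Expr
substExprShortcut s e
  | null (idEnv s) = e
  | otherwise      = substExpr s e
```

With an empty idEnv and an in-scope f of arity 2, substExpr turns an occurrence of f recorded with arity 0 into the arity-2 version, while substExprShortcut returns the stale occurrence unchanged.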
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch significantly improves the way in which recursive groups
are specialised. This turns out to be very important when specialising
the bindings that (now) emerge from instance declarations.
Consider
let rec { f x = ...g x'...
; g y = ...f y'.... }
in f 'a'
Here we specialise 'f' at Char; but that is very likely to lead to
a specialisation of 'g' at Char. We must do the latter, else the
whole point of specialisation is lost. This was not happening before.
The whole thing is described in
Note [Specialising a recursive group]
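A small source-level instance of the situation, with hypothetical names (countEven, skipOdd, countA). The two overloaded functions are mutually recursive, so a call at Char that specialises one of them only pays off if the other is specialised at Char too.

```haskell
-- Counts occurrences of x at even positions (0, 2, 4, ...) of the list.
countEven :: Eq a => a -> [a] -> Int
countEven _ []       = 0
countEven x (y : ys) = (if x == y then 1 else 0) + skipOdd x ys

-- Skips the element at an odd position, then hands back to countEven.
-- It uses the same Eq dictionary only by passing it along.
skipOdd :: Eq a => a -> [a] -> Int
skipOdd _ []       = 0
skipOdd x (_ : ys) = countEven x ys

-- A call at Char: specialising countEven here creates a need for a
-- Char specialisation of skipOdd as well.
countA :: String -> Int
countA = countEven 'a'
```

If only countEven were specialised, its specialised body would still call the overloaded skipOdd, which in turn calls the overloaded countEven, and the dictionary passing would remain on the hot path.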
Simon
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This bug was somehow tickled by the new code for desugaring
polymorphic bindings, but the bug has been there a long time. The
bindings floated out in simplLazyBind, generated by abstractFloats,
were getting processed by postInlineUnconditionally. But that was
wrong because part of their scope has already been processed.
That led to a bit of refactoring in the simplifier. See comments
with Simplify.addPolyBind.
In principle this might happen in 6.8.3, but in practice it doesn't seem
to, so probably not worth merging.
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The main change in this patch is this:
* The Stop constructor of SimplCont no longer contains the OutType
of the whole continuation. This is a nice simplification in
lots of places where we build a Stop continuation. For example,
rebuildCall no longer needs to maintain the type of the function.
* Similarly StrictArg no longer needs an OutType
* The consequential complication is that contResultType (not called
much) needs to be given the type of the thing in the middle. No
big deal.
* Lots of other small knock-on effects
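A toy sketch of the trade-off described above: once Stop no longer stores a type, the result type of a continuation has to be computed from the type of the thing in the middle. The Type and Cont definitions below are hypothetical and far simpler than the real SimplCont; in particular, this ApplyTo carries only an argument type.

```haskell
-- Hypothetical toy types, not GHC's.
data Type = TyCon String | FunTy Type Type
  deriving (Eq, Show)

data Cont
  = Stop               -- carries no type at all
  | ApplyTo Type Cont  -- apply the hole to an argument of this type
  deriving Show

-- Given the type of the expression in the hole, walk the continuation
-- to find the type of the whole thing. Nothing signals a mismatch.
contResultType :: Type -> Cont -> Maybe Type
contResultType holeTy Stop = Just holeTy
contResultType (FunTy argTy resTy) (ApplyTo ty k)
  | argTy == ty = contResultType resTy k
contResultType _ _ = Nothing
```

Building a Stop is now trivial, and the cost moves to the (rarely called) function that needs the hole type as an extra argument.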
Other changes in here
* simplLazyBind does do the type-abstraction thing if there's
a lambda inside. See comments in simplLazyBind
* simplLazyBind reduces simplifier iterations by keeping
unfolding information for stuff for which type abstraction is
done (see add_poly_bind)
All of this came up when implementing System IF, but seems worth applying
to the HEAD
|
|
|
|
| |
Modules that need it import it themselves instead.
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The add_evals code in Simplify.simplAlt had bit-rotted. Example:
data T a = T !a
data U a = U !a
foo :: T a -> U a
foo (T x) = U x
Here we should not evaluate x before building the U result, because
the x argument of T is already evaluated.
Thanks to Roman for finding this.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
(No need to merge to 6.8, but no harm if a subsequent patch needs it.)
The proximate cause for this patch is to improve the inlining for INLINE
things that are not functions; this came up in the NDP project. See
Note [Lone variables] in CoreUnfold.
This caused some refactoring that actually made things simpler. In
particular, more of the inlining logic has moved from SimplUtils to
CoreUnfold, where it belongs.
|
|
|
|
|
|
|
|
|
| |
This patch is on the HEAD. It fixes a nasty and long-standing bug
whereby we weren't substituting the ru_fn field of a CoreRule in
CoreSubst.substSpec, which ultimately led to a puzzling "nameModule"
error trying to put the rules in the interface file.
|
|
|
|
|
| |
This fix avoids a bogus WARN in SimplEnv.substId
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I got a Core Lint failure when compiling System.Win32.Info in the
Win32 package. It was very delicate: adding or removing a function
definition elsewhere in the module (unrelated to the error) made the
error go away.
Happily, I found it. In SimplUtils.prepareDefault I was comparing an
InId with an OutId. We were getting a spurious hit, and hence doing
a bogus CaseMerge.
This bug has been lurking ever since I re-factored the way that case
expressions were simplified, about 6 months ago!
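To illustrate the kind of mistake involved, here is a toy sketch in which an identifier from the input program is compared directly with one from the output program. Everything here is hypothetical: the newtypes, the association-list environment and the function names are for illustration only and are not how SimplUtils.prepareDefault is written.

```haskell
-- Identifiers before and after simplification, kept apart by newtypes.
newtype InId  = InId String  deriving (Eq, Show)
newtype OutId = OutId String deriving (Eq, Show)

-- The simplifier's renaming from input identifiers to output ones.
type Env = [(InId, OutId)]

-- Buggy comparison: looks only at the names, ignoring the renaming.
naiveSame :: InId -> OutId -> Bool
naiveSame (InId a) (OutId b) = a == b

-- Apply the renaming; identifiers not in the environment keep their name.
substId :: Env -> InId -> OutId
substId env v@(InId name) =
  case lookup v env of
    Just out -> out
    Nothing  -> OutId name

-- Correct comparison: map the InId to its OutId first, then compare.
sameOutId :: Env -> InId -> OutId -> Bool
sameOutId env inId outId = substId env inId == outId
```

With an environment that renames x to x1 and y to x, naiveSame reports a spurious hit for the input x against the output x, whereas sameOutId correctly rejects it and accepts the input y instead.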
|
|
|
|
|
|
|
|
|
|
| |
Don't re-add the worker info to a binder until completeBind. It's not
needed in its own RHS, and it may be replaced, via the substitution
following postInlineUnconditionally.
(Fixes build of the stage2 compiler which fell over when Coercion.lhs
was being compiled.)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
(Merge to 6.8 branch after testing.)
There were a number of delicate interactions between RULEs and inlining
in GHC 6.6. I've wanted to fix this for a long time, and some perf
problems in the 6.8 release candidate finally forced me over the edge!
The issues are documented extensively in OccurAnal, Note [Loop breaking
and RULES], and I won't duplicate them here. (Many of the extra lines in
OccurAnal are comments!)
This patch resolves Trac bugs #1709, #1794, #1763, I believe.
|