| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
| |
This fixes Trac #4930. See Note [Bottom alternatives] in Simplify.lhs
|
| |
Trac #4908 identified a case where SpecConstr wasn't "seeing" a
specialisation it should easily get. The solution was simple: see
Note [Add scrutinee to ValueEnv too] in SpecConstr.
Then it turned out that there was an exactly analogous infelicity in
the mighty Simplifier too; see Note [Add unfolding for scrutinee] in
Simplify. This fix is good for Simplify even in the absence of the
SpecConstr change. (It arose when I moved the binder-swap stuff to
OccAnal, not realising that it *remains* valuable to record info
about the scrutinee of a case expression. The Note says why.)
Together these two changes are unconditionally good. Better
simplification, better specialisation. Thank you Max.
|
| |
The main change here is to do with dropping redundant seqs.
See Note [exprOkForSpeculation: case expressions] in CoreUtils.
|
| |
This patch finally deals with the super-delicate question of
superclasses in possibly-recursive dictionaries. The key idea
is the DFun Superclass Invariant (see TcInstDcls):
    In the body of a DFun, every superclass argument to the
    returned dictionary is
        either * one of the arguments of the DFun,
        or     * constant, bound at top level
To establish the invariant, we add new "silent" superclass
argument(s) to each dfun, so that the dfun does not do superclass
selection internally. There's a bit of hoo-ha to make sure that
we don't print those silent arguments in error messages; a knock
on effect was a change in interface-file format.
A second change is that instead of the complex and fragile
"self dictionary binding" in TcInstDcls and TcClassDcl, we now
use the same mechanism as for existential pattern bindings.
See Note [Subtle interaction of recursion and overlap] in TcInstDcls
and Note [Binding when looking up instances] in InstEnv.
Main notes are here:
* Note [Silent Superclass Arguments] in TcInstDcls,
including the DFun Superclass Invariant
Main code changes are:
* The code for MkId.mkDictFunId and mkDictFunTy
* DFunUnfoldings get a little more complicated;
their arguments are a new type DFunArg (in CoreSyn)
* No "self" argument in tcInstanceMethod
* No special tcSimplifySuperClasses
* No "dependents" argument to EvDFunApp
IMPORTANT
It turns out that it's quite tricky to generate the right
DFunUnfolding for a specialised dfun, when you use SPECIALISE
INSTANCE. For now I've just commented it out (in DsBinds) but
that'll lose some optimisation, and I need to get back to
this.
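
The DFun Superclass Invariant above can be illustrated with a hand-written dictionary-passing sketch. This is not the code GHC generates: the record types EqD and OrdD and the functions dfEqList and dfOrdList are hypothetical stand-ins, chosen only to show a dfun that receives its superclass dictionary as an extra ("silent") argument instead of selecting it internally.

```haskell
module Main (main) where

-- Hypothetical dictionary types standing in for class dictionaries.
data EqD a = EqD { eqOp :: a -> a -> Bool }

data OrdD a = OrdD
  { ordSuper :: EqD a            -- superclass field of the dictionary
  , leOp     :: a -> a -> Bool
  }

-- Sketch of: instance Eq a => Eq [a]
dfEqList :: EqD a -> EqD [a]
dfEqList d = EqD go
  where
    go (x:xs) (y:ys) = eqOp d x y && go xs ys
    go []     []     = True
    go _      _      = False

-- Sketch of: instance Ord a => Ord [a]
-- The first argument plays the role of the "silent" superclass argument:
-- the caller supplies the Eq [a] dictionary, so the superclass field of the
-- returned dictionary is simply one of the dfun's own arguments.
dfOrdList :: EqD [a] -> OrdD a -> OrdD [a]
dfOrdList silentEq d = OrdD { ordSuper = silentEq, leOp = go }
  where
    go []     _      = True
    go _      []     = False
    go (x:xs) (y:ys) = leOp d x y && (not (leOp d y x) || go xs ys)

eqInt :: EqD Int
eqInt = EqD (==)

ordInt :: OrdD Int
ordInt = OrdD { ordSuper = eqInt, leOp = (<=) }

main :: IO ()
main = do
  let d = dfOrdList (dfEqList eqInt) ordInt
  print (leOp d [1, 2] [1, 3])
  print (eqOp (ordSuper d) [1, 2] [1, 2])
```

In this sketch the call site, not the dfun body, is responsible for building the superclass dictionary, which is the property the invariant asks for.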
|
| |
See Note [Case elimination: lifted case].
Thanks to Roman for identifying this case.
|
| |
Now, -ddump-rule-firings only shows the names of the rules that fired (it would
show "before" and "after" with -dverbose-core2core previously) and
-ddump-rule-rewrites always shows the "before" and "after" bits, even without
-dverbose-core2core.
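
A small module for trying the two flags out might look like the sketch below. The functions double and quadruple and the rule name are made up for illustration. Compiling it with optimisation and -ddump-rule-firings should, per the description above, list only the names of fired rules, while -ddump-rule-rewrites should also show the "before" and "after" expressions. Whether the rule actually fires depends on the compiler version and optimisation settings.

```haskell
module Main (main) where

-- Hypothetical rewrite rule, only here to give the dump flags something to report.
{-# RULES
"double/double" forall x. double (double x) = quadruple x
  #-}

double :: Int -> Int
double x = x + x
{-# NOINLINE double #-}

quadruple :: Int -> Int
quadruple x = 4 * x
{-# NOINLINE quadruple #-}

main :: IO ()
main = print (double (double 5))
```

The rule preserves meaning, so the program prints the same result whether or not it fires.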
|
| |
Principally, the SimplifierMode now carries several (currently
four) flags in *all* phases, not just the "Gentle" phase.
This makes things simpler and more uniform.
As usual I did more refactoring than I had intended.
This stuff should go into 7.0.2 in due course, once
we've checked it solves the DPH performance problems.
|
| |
1. Do eta-expansion at let-bindings, not lambdas.
I have wanted to do this for a long time.
See Note [Eta-expanding at let bindings] in SimplUtils
2. Simplify the rather subtle way in which InlineRules (the
template captured by an INLINE pragma) was simplified.
Now, these templates are always simplified in "gentle"
mode only, and only INLINE things inline inside them.
See Note [Gentle mode], Note [Inlining in gentle mode]
and Note [RULEs enabled in SimplGently] in SimplUtils
|
| |
Debugged thanks to lots of help from Simon PJ: we weren't updating the
UnfoldingGuidance when the unfolding changed.
Also, a bit of refactoring and additional comments.
|
| |
When adding specialisation for imported Ids, I noticed that the
Glorious Simplifier was repeatedly (and fruitlessly) simplifying the
same term. It turned out to be easy to fix this, because I already
had a flag in the ApplyTo and Select constructors of SimplUtils.SimplCont.
See Note [Avoid redundant simplification]
|
| |
Implements Trac #4299. Documentation to come.
|
| |
and adjust imports accordingly
|
| |
This major patch implements the new OutsideIn constraint solving
algorithm in the typechecker, following our JFP paper "Modular type
inference with local assumptions".
Done with major help from Dimitrios Vytiniotis and Brent Yorgey.
|
| |
See Note [DFun unfoldings] in CoreSyn. The issue here is that
you can't tell how many dictionary arguments a DFun needs just
from looking at the Arity of the DFun Id: if the dictionary is
represented by a newtype the arity might include the dictionary
and value arguments of the (single) method.
So we need to record the number of arguments needed by the DFun
in the DFunUnfolding itself. Details in
Note [DFun unfoldings] in CoreSyn
|
| |
This is a long-standing lurking bug. See Note [Lamba-bound unfoldings]
in DmdAnal.
I'm still not really happy with this lambda-bound-unfolding stuff.
|
| |
The simplifier is taking more iterations than it should, because we
were fruitlessly ANF-ing a top-level declaration of form
x = Ptr "foo"#
to get
x = let v = "foo"# in Ptr v
and then inlining v again. This patch makes Simplify.makeTrivial
top-level aware, so that it doesn't ANF if it's going to be undone.
|
| |
See the long Note [INLINE and default methods].
This patch changes a couple of data types, with a knock-on effect on
the format of interface files. A lot of files get touched, but it is a
relatively minor change. The main tiresome bit is the extra plumbing
to communicate default methods between the type checker and the
desugarer.
|
| |
* I was debugging so I added some call-site info
(that touches a lot of code)
* I used substExpr a bit less in Simplify, hoping to
make the simplifier a little faster and cleaner
|
| |
The main purpose of this patch is to add a bunch of new rules
to the coercion optimiser. They are documented in the (revised)
Appendix of the System FC paper.
Some code has moved about:
- OptCoercion is now a separate module, mainly because it
now uses tcMatchTy, which is defined in Unify, so OptCoercion
must live higher up in the hierarchy
- Functions that manipulate Kinds have moved from
Type.lhs to Coercion.lhs. Reason: the function typeKind
now needs to call coercionKind. And in any case, a Kind is
a flavour of Type, so it builds on top of Type; indeed Coercions
and Kinds are both flavours of Type.
This change required fiddling with a number of imports, hence
the one-line changes to otherwise-unrelated modules
- The representation of CoTyCons in TyCon has changed. Instead of
an extensional representation (a kind checker) there is now an
intensional representation (namely TyCon.CoTyConDesc). This was
needed for one of the new coercion optimisations.
|
| |
This patch moves a lot of code around, but has zero functionality change.
The idea is that the types
CoreToDo
SimplifierSwitch
SimplifierMode
FloatOutSwitches
and
the main core-to-core pipeline construction
belong in simplCore/, and *not* in DynFlags.
|
| |
By default, these two now print *one line* per inlining or rule-firing.
If you want the previous (voluminous) behaviour, use -dverbose-core2core.
|
| |
InlineRules
This patch does two main things:
1. Adjusts the way we set the Activation for
a) The wrappers generated by the strictness analyser
See Note [Wrapper activation] in WorkWrap
b) The RULEs generated by Specialise and SpecConstr
See Note [Auto-specialisation and RULES] in Specialise
Note [Transfer activation] in SpecConstr
2. Refines how we set the phase when simplifying the right
hand side of an InlineRule. See
Note [Simplifying inside InlineRules] in SimplUtils.
Most of the extra lines are comments!
The merit of (2) is that a bit more stuff happens inside InlineRules,
and that in turn allows better dead-code elimination.
|
| |
* Fix a bug that meant that
(right (inst (forall tv.co) ty))
wasn't getting optimised. This showed up in the
compiled code for ByteCodeItbls
* Add a substitution to optCoercion, so that it simultaneously
substitutes and optimises. Both call sites wanted this, and
optCoercion itself can use it, so it seems a win all round.
|
| |
The idea is to float out bottoming expressions to top level,
abstracting them over any variables they mention, if necessary. This
is good because it makes functions smaller (and more likely to
inline), by keeping error code out of line.
See Note [Bottoming floats] in SetLevels.
On the way, this fixes the HPC failures for cg059 and friends.
I've been meaning to do this for some time. See Maessen's paper 1999
"Bottom extraction: factoring error handling out of functional
programs" (unpublished I think).
Here are the nofib results:
        Program           Size    Allocs   Runtime   Elapsed
--------------------------------------------------------------------------------
            Min          +0.1%     -7.8%    -14.4%    -32.5%
            Max          +0.5%     +0.2%     +1.6%    +13.8%
 Geometric Mean          +0.4%     -0.2%     -4.9%     -6.7%
Module sizes
        -1 s.d.          -----     -2.6%
        +1 s.d.          -----     +2.3%
        Average          -----     -0.2%
Compile times:
        -1 s.d.          -----    -11.4%
        +1 s.d.          -----     +4.3%
        Average          -----     -3.8%
I think program sizes have crept up because the base library
is bigger -- module sizes in nofib decrease very slightly. In turn
I think that may be because the floating generates a call where
there was no call before. Anyway I think it's acceptable.
The main changes are:
* SetLevels floats out things that exprBotStrictness_maybe
identifies as bottom. Make sure to pin on the right
strictness info to the newly created Ids, so that the
info ends up in interface files.
Since FloatOut is run twice, we have to be careful that we
don't treat the function created by the first float-out as
a candidate for the second; this is what worthFloating does.
See SetLevels Note [Bottoming floats]
Note [Bottoming floats: eta expansion]
* Be careful not to inline top-level bottoming functions; this
would just undo what the floating transformation achieves.
See CoreUnfold Note [Do not inline top-level bottoming functions]
Ensuring this requires a bit of extra plumbing, but nothing drastic.
* Similarly pre/postInlineUnconditionally should be
careful not to re-inline top-level bottoming things!
See SimplUtils Note [Top-level botomming Ids]
Note [Top level and postInlineUnconditionally]
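
The effect of floating a bottoming expression to top level can be mimicked by hand at the source level. The sketch below is only an illustration: safeDivBefore, safeDivAfter and divByZeroError are hypothetical names, and the real transformation operates on Core inside SetLevels/FloatOut rather than on source code.

```haskell
module Main (main) where

-- Before floating: the error-building code sits inside the function body.
safeDivBefore :: Int -> Int -> Int
safeDivBefore n d
  | d == 0    = error ("safeDiv: division of " ++ show n ++ " by zero")
  | otherwise = n `div` d

-- After floating (done by hand here): the bottoming expression becomes a
-- top-level function, abstracted over the variable it mentions (n).
-- NOINLINE mirrors the point above that such functions should not be
-- inlined back into their call sites.
divByZeroError :: Int -> a
divByZeroError n = error ("safeDiv: division of " ++ show n ++ " by zero")
{-# NOINLINE divByZeroError #-}

safeDivAfter :: Int -> Int -> Int
safeDivAfter n d
  | d == 0    = divByZeroError n
  | otherwise = n `div` d

main :: IO ()
main = print (safeDivBefore 7 2, safeDivAfter 7 2)
```

The second version keeps the string-building code out of line, so the body of safeDivAfter is smaller and a more plausible inlining candidate.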
|
| |
This patch collects a small raft of related changes
* Arrange that during
(a) rule matching and
(b) uses of exprIsConApp_maybe
we "look through" unfoldings only if they are active
in the phase. Doing this for (a) required a bit of
extra plumbing in the rule matching code, but I think
it's worth it.
One wrinkle is that even if inlining is off (in the 'gentle'
phase of simplification) during rule matching we want to
"look through" things with inlinings.
See SimplUtils.activeUnfInRule.
This fixes a long-standing bug, where things that were
supposed to be (say) NOINLINE, could still be poked into
via exprIsConApp_maybe.
* In the above cases, also check for (non-rule) loop breakers;
we never look through these. This fixes a bug that could make
the simplifier diverge (and did for Roman).
Test = simplCore/should_compile/dfun-loop
* Try harder not to choose a DFun as a loop breaker. This is
just a small adjustment in the OccurAnal scoring function
* In the scoring function in OccurAnal, look at the InlineRule
unfolding (if there is one) not the actual RHS, because the
former is what'll be inlined.
* Make the application of any function to dictionary arguments
CONLIKE. Thus (f d1 d2) is CONLIKE.
Encapsulated in CoreUtils.isExpandableApp
Reason: see Note [Expandable overloadings] in CoreUtils
* Make case expressions seem slightly smaller in CoreUnfold.
This reverses an unexpected consequence of charging for
alternatives.
Refactorings
~~~~~~~~~~~~
* Significantly refactor the data type for Unfolding (again).
The result is much nicer.
* Add type synonym BasicTypes.CompilerPhase = Int
and use it
Many of the files touched by this patch are simply knock-on
consequences of these two refactorings.
|
| |
I finally got tired of the #ifdef OLD_STRICTNESS stuff. I had been
keeping it around in the hope of doing old-to-new comparisons, but
have failed to do so for many years, so I don't think it's going to
happen. This patch deletes the clutter.
|
| |
The -fexpose-all-unfoldings flag arranges to put unfoldings for *everything*
in the interface file. Of course, this makes the file a lot bigger, but
it also makes it complete, and that's great for supercompilation; or indeed
any whole-program work.
Consequences:
* Interface files need to record loop-breaker-hood. (Previously,
loop breakers were never exposed, so that info wasn't necessary.)
Hence a small interface file format change.
* When inlining, must check loop-breaker-hood. (Previously, loop
breakers didn't have an unfolding at all, so no need to check.)
* Ditto in exprIsConApp_maybe. Roman actually tripped this bug,
because a DFun, which had an unfolding, was also a loop breaker
* TidyPgm.tidyIdInfo must be careful to preserve loop-breaker-hood
So Id.idUnfolding checks for loop-breaker-hood and returns NoUnfolding
if so. When you want the unfolding regardless of loop-breaker-hood,
use Id.realIdUnfolding.
I have not documented the flag yet, because it's experimental. Nor
have I tested it thoroughly. But with the flag off (the normal case)
everything should work.
|
| |
These two optimisations were originally done by SimplUtils.mkCase
*after* all the pieces have been simplified. Some while ago I
moved them *before*, so they were done by SimplUtils.prepareAlts.
I think the reason was that I couldn't rely on the dead-binder
information on OutIds, and that info is useful in these optimisations.
However,
(a) Other changes (notably moving case-binder-swap to OccurAnal)
have meant that dead-binder information is accurate in
OutIds
(b) When there is a cascade of case-merges, they happen in
one sweep if you do it after, but in many sweeps if you
do it before. Reason: doing it after means you are looking
at nice simplified Core.
|
| |
See Note [RULEs apply to simplified arguments] in Simplify.lhs
A knock-on effect is that rules apply *after* we try inlining
(which uses un-simplified arguments), but that seems fine.
|
| |
The main change is that SimplUtils.updModeForInlineRules
doesn't overwrite the current setting; it just augments it.
|
| |
See Note [Preserve strictness when floating coercions]
|
| |
This change helps to break the mutual recursion generated by
an instance declaration.
See Note [Gentle mode] in SimplUtils
|
| |
See Note [RHS of lets] in CoreUnfold
|
| |
I found that a compulsory unfolding was getting dropped on the floor,
so I took that as a hint to tidy up the data type so that it won't
happen again. No big change in functionality.
|
| |
The patch adds this rule:
seq (x `cast` co) y = seq x y
This is subject to the usual treatment of seq rules. It also makes them
match more often: it will rewrite
seq (f x `cast` co) y = seq (f x) y
and allow a seq rule for f to match.
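
The rewrite can be sketched on a toy expression type. The Expr type and dropCastInSeq below are hypothetical and far simpler than GHC's Core (the coercion is just an opaque name), so this only shows the shape of the rule, not how the simplifier implements it.

```haskell
module Main (main) where

-- A toy expression language, assumed purely for illustration.
data Expr
  = Var String
  | App Expr Expr
  | Cast Expr String   -- e `cast` co, with the coercion kept as an opaque name
  | Seq Expr Expr      -- seq e1 e2
  deriving (Eq, Show)

-- Apply   seq (x `cast` co) y  =  seq x y   at the root of an expression,
-- repeating while the scrutinised expression is still wrapped in a cast.
dropCastInSeq :: Expr -> Expr
dropCastInSeq (Seq (Cast e _) y) = dropCastInSeq (Seq e y)
dropCastInSeq e                  = e

main :: IO ()
main = print (dropCastInSeq (Seq (Cast (App (Var "f") (Var "x")) "co") (Var "y")))
```

After the cast is dropped the first argument of seq is the bare application (f x), which is the form a seq rule for f would be written against.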
|
| |
* Remove trace from optCoercion
* Use simplCoercion for type arguments in the Simplifier
(because they might be coercions)
|
| |
This patch has been a long time in gestation and has, as a
result, accumulated some extra bits and bobs that are only
loosely related. I separated the bits that are easy to split
off, but the rest comes as one big patch, I'm afraid.
Note that:
* It comes together with a patch to the 'base' library
* Interface file formats change slightly, so you need to
recompile all libraries
The patch is mainly a giant tidy-up, driven in part by the
particular stresses of the Data Parallel Haskell project. I don't
expect a big performance win for random programs. Still, here are the
nofib results, relative to the state of affairs without the patch
        Program           Size    Allocs   Runtime   Elapsed
--------------------------------------------------------------------------------
            Min         -12.7%    -14.5%    -17.5%    -17.8%
            Max          +4.7%    +10.9%     +9.1%     +8.4%
 Geometric Mean          +0.9%     -0.1%     -5.6%     -7.3%
The +10.9% allocation outlier is rewrite, which happens to have a
very delicate optimisation opportunity involving an interaction
of CSE and inlining (see nofib/Simon-nofib-notes). The fact that
the 'before' case found the optimisation is somewhat accidental.
Runtimes seem to go down, but I never know whether to really trust
this number. Binary sizes wobble a bit, but nothing drastic.
The Main Ideas are as follows.
InlineRules
~~~~~~~~~~~
When you say
{-# INLINE f #-}
f x = <rhs>
you intend that calls (f e) are replaced by <rhs>[e/x]. So we
should capture (\x.<rhs>) in the Unfolding of 'f', and never meddle
with it. Meanwhile, we can optimise <rhs> to our heart's content,
leaving the original unfolding intact in Unfolding of 'f'.
So the representation of an Unfolding has changed quite a bit
(see CoreSyn). An INLINE pragma gives rise to an InlineRule
unfolding.
Moreover, it's only used when 'f' is applied to the
specified number of arguments; that is, the number of arguments on
the LHS of the '=' sign in the original source definition.
For example, (.) is now defined in the libraries like this
{-# INLINE (.) #-}
(.) f g = \x -> f (g x)
so that it'll inline when applied to two arguments. If 'x' appeared
on the left, thus
(.) f g x = f (g x)
it'd only inline when applied to three arguments. This slightly-experimental
change was requested by Roman, but it seems to make sense.
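
The two styles of definition can be put side by side. In this sketch compose2 and compose3 are hypothetical names standing in for the two ways of writing (.); the comments restate the inlining behaviour described above and are not verified against any particular compiler version.

```haskell
module Main (main) where

-- Two arguments on the left of '=': per the description above, the
-- InlineRule applies once the function is applied to two arguments.
compose2 :: (b -> c) -> (a -> b) -> a -> c
compose2 f g = \x -> f (g x)
{-# INLINE compose2 #-}

-- Three arguments on the left of '=': the InlineRule would need all three.
compose3 :: (b -> c) -> (a -> b) -> a -> c
compose3 f g x = f (g x)
{-# INLINE compose3 #-}

main :: IO ()
main = do
  -- Both calls pass only two arguments; only compose2 is saturated with
  -- respect to its source definition.
  print (map (compose2 (+ 1) (* 2)) [1, 2, 3 :: Int])
  print (map (compose3 (+ 1) (* 2)) [1, 2, 3 :: Int])
```

Both definitions denote the same function; the difference lies only in when the unfolding is eligible for use.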
Other associated changes
* Moving the deck chairs in DsBinds, which processes the INLINE pragmas
* In the old system an INLINE pragma made the RHS look like
(Note InlineMe <rhs>)
The Note switched off optimisation in <rhs>. But it was quite
fragile in corner cases. The new system is more robust, I believe.
In any case, the InlineMe note has disappeared
* The workerInfo of an Id has also been combined into its Unfolding,
so it's no longer a separate field of the IdInfo.
* Many changes in CoreUnfold, esp in callSiteInline, which is the critical
function that decides which function to inline. Lots of comments added!
* exprIsConApp_maybe has moved to CoreUnfold, since it's so strongly
associated with "does this expression unfold to a constructor application".
It can now do some limited beta reduction too, which Roman found
was important.
Instance declarations
~~~~~~~~~~~~~~~~~~~~~
It's always been tricky to get the dfuns generated from instance
declarations to work out well. This is particularly important in
the Data Parallel Haskell project, and I'm now on my fourth attempt,
more or less.
There is a detailed description in TcInstDcls, particularly in
Note [How instance declarations are translated]. Roughly speaking
we now generate a top-level helper function for every method definition
in an instance declaration, so that the dfun takes a particularly
stylised form:
dfun a d1 d2 = MkD (op1 a d1 d2) (op2 a d1 d2) ...etc...
In fact, it's *so* stylised that we never need to unfold a dfun.
Instead ClassOps have a special rewrite rule that allows us to
short-cut dictionary selection. Suppose dfun :: Ord a -> Ord [a]
d :: Ord a
Then
compare (dfun a d) --> compare_list a d
in one rewrite, without first inlining the 'compare' selector
and the body of the dfun.
To support this
a) ClassOps have a BuiltInRule (see MkId.dictSelRule)
b) DFuns have a special form of unfolding (CoreSyn.DFunUnfolding)
which is exploited in CoreUnfold.exprIsConApp_maybe
Implementing all this required a root-and-branch rework of TcInstDcls
and bits of TcClassDcl.
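
The stylised dfun shape and the selection short-cut can be mimicked with explicit dictionaries. Everything in this sketch (OrdD, cmpList, leList, dfOrdList) is a hypothetical hand-written model; it only demonstrates that selecting a method from the stylised dfun gives the same function as calling the top-level helper directly, which is the equation the ClassOp rule exploits.

```haskell
module Main (main) where

-- Hypothetical dictionary for a two-method Ord-like class.
data OrdD a = MkOrdD
  { cmp :: a -> a -> Ordering
  , le  :: a -> a -> Bool
  }

-- One top-level helper per method of the (sketched) instance Ord [a].
cmpList :: OrdD a -> [a] -> [a] -> Ordering
cmpList _ []     []     = EQ
cmpList _ []     (_:_)  = LT
cmpList _ (_:_)  []     = GT
cmpList d (x:xs) (y:ys) =
  case cmp d x y of
    EQ    -> cmpList d xs ys
    other -> other

leList :: OrdD a -> [a] -> [a] -> Bool
leList d xs ys = cmpList d xs ys /= GT

-- The stylised dfun: a constructor applied to the helpers, nothing else.
dfOrdList :: OrdD a -> OrdD [a]
dfOrdList d = MkOrdD (cmpList d) (leList d)

ordInt :: OrdD Int
ordInt = MkOrdD compare (<=)

main :: IO ()
main = do
  -- Selecting from the dfun versus calling the helper directly.
  print (cmp (dfOrdList ordInt) [1, 2] [1, 3])
  print (cmpList ordInt [1, 2] [1, 3])
```

Because the dfun body is just a constructor application, the selection can be resolved without inlining either the selector or the dfun.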
Default methods
~~~~~~~~~~~~~~~
If you give an INLINE pragma to a default method, it should be just
as if you'd written out that code in each instance declaration, including
the INLINE pragma. I think that it now *is* so. As a result, library
code can be simpler; less duplication.
The CONLIKE pragma
~~~~~~~~~~~~~~~~~~
In the DPH project, Roman found cases where he had
p n k = let x = replicate n k
in ...(f x)...(g x)....
{-# RULE f (replicate x) = f_rep x #-}
Normally the RULE would not fire, because doing so involves
(in effect) duplicating the redex (replicate n k). A new
experimental modifier to the INLINE pragma, {-# INLINE CONLIKE
replicate #-}, allows you to tell GHC to be prepared to duplicate
a call of this function if it allows a RULE to fire.
See Note [CONLIKE pragma] in BasicTypes
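
A self-contained variant of that situation is sketched below. The names replicate', sumRep and sumRepFast are hypothetical, and the phase annotation [1] follows the form shown in the GHC user guide. Whether the rule fires depends on the compiler version and flags; the program prints the same result either way because the rule preserves meaning.

```haskell
module Main (main) where

-- A wrapper marked CONLIKE, so that GHC may duplicate a call to it when
-- doing so lets a RULE match.
replicate' :: Int -> a -> [a]
replicate' n k = replicate n k
{-# INLINE CONLIKE [1] replicate' #-}

sumRep :: [Int] -> Int
sumRep = sum
{-# NOINLINE sumRep #-}

-- Closed form for summing n copies of k (n clamped at zero, as replicate does).
sumRepFast :: Int -> Int -> Int
sumRepFast n k = max 0 n * k
{-# NOINLINE sumRepFast #-}

{-# RULES
"sumRep/replicate'" forall n k. sumRep (replicate' n k) = sumRepFast n k
  #-}

-- x is shared between two uses, so matching the rule means looking through
-- the let-binding and, in effect, duplicating the call to replicate'.
p :: Int -> Int -> (Int, Int)
p n k = let x = replicate' n k
        in (sumRep x, length x)

main :: IO ()
main = print (p 3 4)
```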
Join points
~~~~~~~~~~~
See Note [Case binders and join points] in Simplify
Other refactoring
~~~~~~~~~~~~~~~~~
* I moved endPass from CoreLint to CoreMonad, with associated jigglings
* Better pretty-printing of Core
* The top-level RULES (ones that are not rules for locally-defined things)
are now substituted on every simplifier iteration. I'm not sure how
we got away without doing this before. This entails a bit more plumbing
in SimplCore.
* The necessary stuff to serialise and deserialise the new
info across interface files.
* Something about bottoming floats in SetLevels
Note [Bottoming floats]
* substUnfolding has moved from SimplEnv to CoreSubst, where it belongs
--------------------------------------------------------------------------------
Program Size Allocs Runtime Elapsed
--------------------------------------------------------------------------------
anna +2.4% -0.5% 0.16 0.17
ansi +2.6% -0.1% 0.00 0.00
atom -3.8% -0.0% -1.0% -2.5%
awards +3.0% +0.7% 0.00 0.00
banner +3.3% -0.0% 0.00 0.00
bernouilli +2.7% +0.0% -4.6% -6.9%
boyer +2.6% +0.0% 0.06 0.07
boyer2 +4.4% +0.2% 0.01 0.01
bspt +3.2% +9.6% 0.02 0.02
cacheprof +1.4% -1.0% -12.2% -13.6%
calendar +2.7% -1.7% 0.00 0.00
cichelli +3.7% -0.0% 0.13 0.14
circsim +3.3% +0.0% -2.3% -9.9%
clausify +2.7% +0.0% 0.05 0.06
comp_lab_zift +2.6% -0.3% -7.2% -7.9%
compress +3.3% +0.0% -8.5% -9.6%
compress2 +3.6% +0.0% -15.1% -17.8%
constraints +2.7% -0.6% -10.0% -10.7%
cryptarithm1 +4.5% +0.0% -4.7% -5.7%
cryptarithm2 +4.3% -14.5% 0.02 0.02
cse +4.4% -0.0% 0.00 0.00
eliza +2.8% -0.1% 0.00 0.00
event +2.6% -0.0% -4.9% -4.4%
exp3_8 +2.8% +0.0% -4.5% -9.5%
expert +2.7% +0.3% 0.00 0.00
fem -2.0% +0.6% 0.04 0.04
fft -6.0% +1.8% 0.05 0.06
fft2 -4.8% +2.7% 0.13 0.14
fibheaps +2.6% -0.6% 0.05 0.05
fish +4.1% +0.0% 0.03 0.04
fluid -2.1% -0.2% 0.01 0.01
fulsom -4.8% +9.2% +9.1% +8.4%
gamteb -7.1% -1.3% 0.10 0.11
gcd +2.7% +0.0% 0.05 0.05
gen_regexps +3.9% -0.0% 0.00 0.00
genfft +2.7% -0.1% 0.05 0.06
gg -2.7% -0.1% 0.02 0.02
grep +3.2% -0.0% 0.00 0.00
hidden -0.5% +0.0% -11.9% -13.3%
hpg -3.0% -1.8% +0.0% -2.4%
ida +2.6% -1.2% 0.17 -9.0%
infer +1.7% -0.8% 0.08 0.09
integer +2.5% -0.0% -2.6% -2.2%
integrate -5.0% +0.0% -1.3% -2.9%
knights +4.3% -1.5% 0.01 0.01
lcss +2.5% -0.1% -7.5% -9.4%
life +4.2% +0.0% -3.1% -3.3%
lift +2.4% -3.2% 0.00 0.00
listcompr +4.0% -1.6% 0.16 0.17
listcopy +4.0% -1.4% 0.17 0.18
maillist +4.1% +0.1% 0.09 0.14
mandel +2.9% +0.0% 0.11 0.12
mandel2 +4.7% +0.0% 0.01 0.01
minimax +3.8% -0.0% 0.00 0.00
mkhprog +3.2% -4.2% 0.00 0.00
multiplier +2.5% -0.4% +0.7% -1.3%
nucleic2 -9.3% +0.0% 0.10 0.10
para +2.9% +0.1% -0.7% -1.2%
paraffins -10.4% +0.0% 0.20 -1.9%
parser +3.1% -0.0% 0.05 0.05
parstof +1.9% -0.0% 0.00 0.01
pic -2.8% -0.8% 0.01 0.02
power +2.1% +0.1% -8.5% -9.0%
pretty -12.7% +0.1% 0.00 0.00
primes +2.8% +0.0% 0.11 0.11
primetest +2.5% -0.0% -2.1% -3.1%
prolog +3.2% -7.2% 0.00 0.00
puzzle +4.1% +0.0% -3.5% -8.0%
queens +2.8% +0.0% 0.03 0.03
reptile +2.2% -2.2% 0.02 0.02
rewrite +3.1% +10.9% 0.03 0.03
rfib -5.2% +0.2% 0.03 0.03
rsa +2.6% +0.0% 0.05 0.06
scc +4.6% +0.4% 0.00 0.00
sched +2.7% +0.1% 0.03 0.03
scs -2.6% -0.9% -9.6% -11.6%
simple -4.0% +0.4% -14.6% -14.9%
solid -5.6% -0.6% -9.3% -14.3%
sorting +3.8% +0.0% 0.00 0.00
sphere -3.6% +8.5% 0.15 0.16
symalg -1.3% +0.2% 0.03 0.03
tak +2.7% +0.0% 0.02 0.02
transform +2.0% -2.9% -8.0% -8.8%
treejoin +3.1% +0.0% -17.5% -17.8%
typecheck +2.9% -0.3% -4.6% -6.6%
veritas +3.9% -0.3% 0.00 0.00
wang -6.2% +0.0% 0.18 -9.8%
wave4main -10.3% +2.6% -2.1% -2.3%
wheel-sieve1 +2.7% -0.0% +0.3% -0.6%
wheel-sieve2 +2.7% +0.0% -3.7% -7.5%
x2n1 -4.1% +0.1% 0.03 0.04
--------------------------------------------------------------------------------
Min -12.7% -14.5% -17.5% -17.8%
Max +4.7% +10.9% +9.1% +8.4%
Geometric Mean +0.9% -0.1% -5.6% -7.3%
|
| |
Coercion terms can get big (see Trac #2859 for example), so this
patch puts the infrastructure in place to optimise them:
* Adds Coercion.optCoercion :: Coercion -> Coercion
* Calls optCoercion in Simplify.lhs
The optimiser doesn't work right at the moment, so it is
commented out, but Tom is going to work on it.
|
| |
This patch fixes test failures for the profiling way for drv001.
The problem was that the arity of a function was decreasing during
"optimisation" because of interaction with SCC annotations.
In particular
f = /\a. scc "f" (h x) -- where h had arity 2
and h gets inlined, led to
f = /\a. scc "f" let v = scc "f" x in \y. <blah>
Two main changes:
1. exprIsTrivial now says True for (scc "f" x)
See Note [SCCs are trivial] in CoreUtils
2. The simplifier eliminates nested pushing of the same cost centre:
scc "f" (...(scc "f" e)...)
==> scc "f" (...e...)
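
The second change can be sketched as a rewrite over a toy expression type. The Expr type and dropNestedScc are hypothetical, and the real simplifier works on Core and may apply further conditions; the sketch only shows that an inner scc is dropped when the scc currently in force names the same cost centre.

```haskell
module Main (main) where

-- A toy expression language, assumed purely for illustration.
data Expr
  = Var String
  | App Expr Expr
  | Scc String Expr    -- scc "cc" e
  deriving (Eq, Show)

-- Remove an inner (scc cc ...) when the nearest enclosing scc has the same
-- cost centre. A different scc in between resets the tracking, so
-- scc "f" (scc "g" (scc "f" e)) is left alone.
dropNestedScc :: Expr -> Expr
dropNestedScc = go Nothing
  where
    go enclosing (Scc cc e)
      | enclosing == Just cc = go enclosing e
      | otherwise            = Scc cc (go (Just cc) e)
    go enclosing (App f a)   = App (go enclosing f) (go enclosing a)
    go _         v@(Var _)   = v

main :: IO ()
main = print (dropNestedScc (Scc "f" (App (Var "h") (Scc "f" (Var "x")))))
```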
|