| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
| |
The "InlineRule" is gone now, so this is just making the comments line
up with the code. A function does change its name though:
updModeForInlineRules --> updModeForStableUnfoldings
|
| |
|
|
|
|
|
|
|
|
|
| |
It's not obvious why the simplifier generates code that correctly satisfies
the let/app invariant. This patch does some minor refactoring, but the main
point is to document pre-conditions to key functions, namely that the rhs
passed in satisfies the let/app invariant.
There shouldn't be any change in behaviour.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In some cases, the layout of the LANGUAGE/OPTIONS_GHC lines has been
reorganized, while following the convention, to
- place `{-# LANGUAGE #-}` pragmas at the top of the source file, before
any `{-# OPTIONS_GHC #-}`-lines.
- Moreover, if the list of language extensions fit into a single
`{-# LANGUAGE ... -#}`-line (shorter than 80 characters), keep it on one
line. Otherwise split into `{-# LANGUAGE ... -#}`-lines for each
individual language extension. In both cases, try to keep the
enumeration alphabetically ordered.
(The latter layout is preferable as it's more diff-friendly)
While at it, this also replaces obsolete `{-# OPTIONS ... #-}` pragma
occurences by `{-# OPTIONS_GHC ... #-}` pragmas.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
See Note [Do not eta-expand PAPs] in SimplUtils. This has a tremendously
good effect on compile times for some simple benchmarks.
The test is now where it belongs, in perf/compiler/T9020 (instead of simpl015).
I did a nofib run and got essentially zero change except for cacheprof which
gets 4% more allocation. I investigated. Turns out that we have
instance PP Reg where
pp ppm ST_0 = "%st"
pp ppm ST_1 = "%st(1)"
pp ppm ST_2 = "%st(2)"
pp ppm ST_3 = "%st(3)"
pp ppm ST_4 = "%st(4)"
pp ppm ST_5 = "%st(5)"
pp ppm ST_6 = "%st(6)"
pp ppm ST_7 = "%st(7)"
pp ppm r = "%" ++ map toLower (show r)
That (map toLower (show r) does a lot of map/toLowers. But if we inline show
we get something like
pp ppm ST_0 = "%st"
pp ppm ST_1 = "%st(1)"
pp ppm ST_2 = "%st(2)"
pp ppm ST_3 = "%st(3)"
pp ppm ST_4 = "%st(4)"
pp ppm ST_5 = "%st(5)"
pp ppm ST_6 = "%st(6)"
pp ppm ST_7 = "%st(7)"
pp ppm EAX = map toLower (show EAX)
pp ppm EBX = map toLower (show EBX)
...etc...
and all those map/toLower calls can now be floated to top level.
This gives a 4% decrease in allocation. But it depends on inlining
a pretty big 'show' function.
With this new patch we get slightly better eta-expansion, which makes
a function look slightly bigger, which just stops it being inlined.
The previous behaviour was luck, so I'm not going to worry about
losing it.
I've added some notes to nofib/Simon-nofib-notes
|
|
|
|
|
|
|
|
|
|
| |
See Note [Demand analysis for trivial right-hand sides] in DmdAnal.
This allows a function with arity 2 to have a DmdSig with 3 args;
which in turn had a knock-on effect, which showed up in the test for
Trac #8963.
In fact it seems entirely reasonable, so this patch removes the
WARN and CoreLint checks that were complaining.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This analysis finds out if a let-bound expression with lower manifest
arity than type arity is always called with more arguments, as in that
case eta-expansion is allowed and often viable. The analysis is very
much tailored towards the code generated when foldl is implemented via
foldr; without this analysis doing so would be a very bad idea!
There are other ways to improve foldr/builder-fusion to cope with foldl,
if any of these are implemented then this step can probably be moved to
-O2 to save some compilation times. The current impact of adding this
phase is just below +2% (measured running GHC's "make").
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The presenting issue was that we were never eta-expanding
f (\x -> case x of (a,b) -> \s -> blah)
and that meant we were allocating two lambdas instead of one.
See Note [Eta expanding lambdas] in SimplUtils.
However I didn't want to eta expand the lambda, and then try all over
again for tryEtaExpandRhs. Yet the latter is important in the context
of a let-binding it can do simple arity analysis. So I ended up
refactoring CallCtxt so that it tells when we are on the RHS of a let.
I also moved findRhsArity from SimplUtils to CoreArity.
Performance increases nicely. Here are the ones where allocation improved
by more than 0.5%. Notice the nice decrease in binary size too.
--------------------------------------------------------------------------------
Program Size Allocs Runtime Elapsed TotalMem
--------------------------------------------------------------------------------
ansi -2.3% -0.9% 0.00 0.00 +0.0%
bspt -2.1% -9.7% 0.01 0.01 -33.3%
fasta -1.8% -11.7% -3.4% -3.6% +0.0%
fft -1.9% -1.3% 0.06 0.06 +11.1%
reverse-complem -1.9% -18.1% -1.9% -2.8% +0.0%
sphere -1.8% -4.5% 0.09 0.09 +0.0%
transform -1.8% -2.3% -4.6% -3.1% +0.0%
--------------------------------------------------------------------------------
Min -3.0% -18.1% -13.9% -14.6% -35.7%
Max -1.3% +0.0% +7.7% +7.7% +50.0%
Geometric Mean -1.9% -0.6% -2.1% -2.1% -0.2%
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Roles are a solution to the GeneralizedNewtypeDeriving type-safety
problem.
Roles were first described in the "Generative type abstraction" paper,
by Stephanie Weirich, Dimitrios Vytiniotis, Simon PJ, and Steve Zdancewic.
The implementation is a little different than that paper. For a quick
primer, check out Note [Roles] in Coercion. Also see
http://ghc.haskell.org/trac/ghc/wiki/Roles
and
http://ghc.haskell.org/trac/ghc/wiki/RolesImplementation
For a more formal treatment, check out docs/core-spec/core-spec.pdf.
This fixes Trac #1496, #4846, #7148.
|
|
|
|
|
|
|
| |
E.g. (f e1 |> g) e2
If f is strict in two aguments, we want to see that in e2
Hence ArgSpec in SimplUtils
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a long-standing regression (Trac #7797), which meant that in
particular the Eq [Char] instance does not get specialised.
(The *methods* do, but the dictionary itself doesn't.) So when you
call a function
f :: Eq a => blah
on a string type (ie a=[Char]), 7.6 passes a dictionary of un-specialised
methods.
This only matters when calling an overloaded function from a
specialised context, but that does matter in some programs. I
remember (though I cannot find the details) that Nick Frisby discovered
this to be the source of some pretty solid performanc regresisons.
Anyway it works now. The key change is that a DFunUnfolding now takes
a form that is both simpler than before (the DFunArg type is eliminated)
and more general:
data Unfolding
= ...
| DFunUnfolding { -- The Unfolding of a DFunId
-- See Note [DFun unfoldings]
-- df = /\a1..am. \d1..dn. MkD t1 .. tk
-- (op1 a1..am d1..dn)
-- (op2 a1..am d1..dn)
df_bndrs :: [Var], -- The bound variables [a1..m],[d1..dn]
df_con :: DataCon, -- The dictionary data constructor (never a newtype datacon)
df_args :: [CoreExpr] -- Args of the data con: types, superclasses and methods,
} -- in positional order
That in turn allowed me to re-enable the DFunUnfolding specialisation in
DsBinds. Lots of details here in TcInstDcls:
Note [SPECIALISE instance pragmas]
I also did some refactoring, in particular to pass the InScopeSet to
exprIsConApp_maybe (which in turn means it has to go to a RuleFun).
NB: Interface file format has changed!
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Move the "combine indentical alternatives" transformation *before*
simplifying the alternatives. For example
case x of y
[] -> length y
(_:_) -> length y }
If we look *post* simplification, since 'y' is used in the
alterantives, the case binders *might* be (see the keep_occ_info test
in Simplify.simplAlt); and hence the combination of the two
alteranatives does not happen. But if we do it *pre* simplification
there is no problem.
This fixes Trac #7360.
|
|
|
|
|
| |
Mostly d -> g (matching DynFlag -> GeneralFlag).
Also renamed if* to when*, matching the Haskell if/when names
|
| |
|
| |
|
|
|
|
|
|
|
|
| |
I also removed the default values from the "Discounts and thresholds"
note: most of them were no longer up-to-date.
Along the way I added FloatSuffix to the argument parser, analogous to
IntSuffix.
|
| |
|
|
|
|
|
|
|
| |
* Put the result type in the Stop continuation
* No need for the alts type in Select
The result is a modest but useful simplification
|
|
|
|
| |
compiler/simplCore/SimplUtils.lhs:1668:5-25: Irrefutable pattern failed for pattern ((_, _, rhs1) : _)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch allows, for the first time, case expressions with an empty
list of alternatives. Max suggested the idea, and Trac #6067 showed
that it is really quite important.
So I've implemented the idea, fixing #6067. Main changes
* See Note [Empty case alternatives] in CoreSyn
* Various foldr1's become foldrs
* IfaceCase does not record the type of the alternatives.
I added IfaceECase for empty-alternative cases.
* Core Lint does not complain about empty cases
* MkCore.castBottomExpr constructs an empty-alternative case
expression (case e of ty {})
* CoreToStg converts '(case e of {})' to just 'e'
|
|
|
|
|
|
| |
This reverts commit e3f8557c2aca04cf64eec6a1aacde6e01c0944ff.
Sigh. Seg fault.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In the previous commit about "aggressive primops" I wanted a new
function CoreUtils.exprCertainlyTerminates. In doing this I ended up
with a significant refactoring in CoreUtils. The new structure has
quite a lot of nice sharing:
exprIsCheap = exprIsCheap' isHNFApp
exprIsExpandable = exprIsCheap' isConLikeApp
exprIsHNF = exprIsHNFlike isHNFApp
exprIsConLike = exprIsHNFlike isConLikeApp
exprCertainlyTerminates = exprIsHNFlike isTerminatingApp
This patch also does some renaming
CheapAppFun --> FunAppAnalyser
isCheapApp --> isHNFApp
isExpandableApp --> isConLikeApp
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Makes the following changes:
1. Generalises the type signatures of some functions relating to alternatives
so that the type of "variables" and "expression" is not specified
2. Puts the bulk of the alternative-filtering code into a new function filterAlts
(in CoreUtils) that can be used outside of the SimplM monad
3. Allows prepareAlts to return a null alternatives list if none are applicable -
it turns out that this case was already handled by the caller (in the simplifier).
This should result in a modest optimisation improvement in some cases.
Conflicts:
compiler/coreSyn/CoreUtils.lhs
compiler/simplCore/SimplUtils.lhs
|
| |
|
| |
|
|
|
|
| |
This reverts commit 601c983dd0bada6b49bdadd8f172fd4eacac4b0c.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I'm experimenting with making GHC a bit more aggressive about
a) dropping case expressions if the result is unused
Simplify.rebuildCase, CaseElim equation
b) floating case expressions inwards
FloatIn.fiExpr, AnnCase
In both cases the new behaviour is gotten with a static (debug)
flag -faggressive-primops. The extra "aggression" is to allow
discarding and floating in for side-effecting operations. See
the new, extensive Note [PrimOp can_fail and has_side_effects]
in PrimoOp.
When discarding a case with unused binders, in the lifted-type
case it's definitely ok if the scrutinee terminates; previously
we were checking exprOkForSpeculation, which is significantly
worse.
So I wanted a new function CoreUtils.exprCertainlyTerminates.
In doing this I ended up with a significant refactoring in
CoreUtils. The new structure has quite a lot of nice sharing:
exprIsCheap = exprIsCheap' isHNFApp
exprIsExpandable = exprIsCheap' isConLikeApp
exprIsHNF = exprIsHNFlike isHNFApp
exprIsConLike = exprIsHNFlike isConLikeApp
exprCertainlyTerminates = exprIsHNFlike isTerminatingApp
|
| |
|
|
|
|
|
|
| |
I did a bit of refactoring (of course) at the same time.
See the discussion in Trac #5587. Most of the real change
is in CoreArity.
|
|
|
|
|
|
| |
In fixing one bug I'd introduced another;
case x of { T -> T; F -> F }
wasn't getting optmised! Trivial to fix.
|
|
|
|
|
|
|
|
|
| |
This big patch implements a kind-polymorphic core for GHC. The current
implementation focuses on making sure that all kind-monomorphic programs still
work in the new core; it is not yet guaranteed that kind-polymorphic programs
(using the new -XPolyKinds flag) will work.
For more information, see http://haskell.org/haskellwiki/GHC/Kinds
|
|
|
|
|
|
|
|
|
| |
As usual it was to do with the handling of bottoms,
but this time it wasn't terribly subtle; I was using
andArityType (which is designed for case branches) as
a cheap short cut for the arity trimming needed with
a cast. That did the Wrong Thing for bottoming
expressions. Sigh.
|
|
|
|
|
| |
We only use it for "compiler" sources, i.e. not for libraries.
Many modules have a -fno-warn-tabs kludge for now.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
User visible changes
====================
Profilng
--------
Flags renamed (the old ones are still accepted for now):
OLD NEW
--------- ------------
-auto-all -fprof-auto
-auto -fprof-exported
-caf-all -fprof-cafs
New flags:
-fprof-auto Annotates all bindings (not just top-level
ones) with SCCs
-fprof-top Annotates just top-level bindings with SCCs
-fprof-exported Annotates just exported bindings with SCCs
-fprof-no-count-entries Do not maintain entry counts when profiling
(can make profiled code go faster; useful with
heap profiling where entry counts are not used)
Cost-centre stacks have a new semantics, which should in most cases
result in more useful and intuitive profiles. If you find this not to
be the case, please let me know. This is the area where I have been
experimenting most, and the current solution is probably not the
final version, however it does address all the outstanding bugs and
seems to be better than GHC 7.2.
Stack traces
------------
+RTS -xc now gives more information. If the exception originates from
a CAF (as is common, because GHC tends to lift exceptions out to the
top-level), then the RTS walks up the stack and reports the stack in
the enclosing update frame(s).
Result: +RTS -xc is much more useful now - but you still have to
compile for profiling to get it. I've played around a little with
adding 'head []' to GHC itself, and +RTS -xc does pinpoint the problem
quite accurately.
I plan to add more facilities for stack tracing (e.g. in GHCi) in the
future.
Coverage (HPC)
--------------
* derived instances are now coloured yellow if they weren't used
* likewise record field names
* entry counts are more accurate (hpc --fun-entry-count)
* tab width is now correct (markup was previously off in source with
tabs)
Internal changes
================
In Core, the Note constructor has been replaced by
Tick (Tickish b) (Expr b)
which is used to represent all the kinds of source annotation we
support: profiling SCCs, HPC ticks, and GHCi breakpoints.
Depending on the properties of the Tickish, different transformations
apply to Tick. See CoreUtils.mkTick for details.
Tickets
=======
This commit closes the following tickets, test cases to follow:
- Close #2552: not a bug, but the behaviour is now more intuitive
(test is T2552)
- Close #680 (test is T680)
- Close #1531 (test is result001)
- Close #949 (test is T949)
- Close #2466: test case has bitrotted (doesn't compile against current
version of vector-space package)
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
When we have
x = \v -> map g
we want to eta-expand to
x = \v y -> map g y
We weren't doing so, and that led to worse code and,
perhaps, #5285. I need to check the latter, but I
was certainly seeing one similar error. Anyway
this looks like a definite improvement
|
|
|
|
|
|
|
|
|
|
| |
I realised that my recently-added cunning stuff about
RULES for imported Ids was simply wrong, so this patch
removes it. See Note [Rules for imported functions],
which explains it all.
This patch also does quite a bit of refactoring in
the treatment of loop breakers.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Fix bugs in the packing and unpacking of data
constructors with equality predicates in their types
* Remove PredCo altogether; instead, coercions between predicated
types (like (Eq a, [a]~b) => blah) are treated as if they
were precisely their underlying representation type
Eq a -> ((~) [a] b) -> blah
in this case
* Similarly, Type.coreView no longer treats equality
predciates specially.
* Implement the cast-of-coercion optimisation in
Simplify.simplCoercionF
Numerous other small bug-fixes and refactorings.
Annoyingly, OptCoercion had Windows line endings, and this
patch switches to Unix, so it looks as if every line has changed.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
See the paper "Practical aspects of evidence based compilation in System FC"
* Coercion becomes a data type, distinct from Type
* Coercions become value-level things, rather than type-level things,
(although the value is zero bits wide, like the State token)
A consequence is that a coerion abstraction increases the arity by 1
(just like a dictionary abstraction)
* There is a new constructor in CoreExpr, namely Coercion, to inject
coercions into terms
|
|
|
|
|
|
| |
I'd failed to propagate the Opt_EnableRewriteRules flag properly,
which meant that -fno-enable-rewrite-rules didn't disable all
rewrites. This patch fixes it.
|
|
|
|
|
|
| |
Under obscure circumstances (actually only shown up when fixing something
else) it was possible for a variable binding to be discarded although
it was still used. See Note [Top level and postInlineUnconditionally]
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I've wanted to do this for ages, but never gotten around to
it. The main notes are in Note [Arity analysis] in SimplUtils.
The motivating example for arity analysis is this:
f = \x. let g = f (x+1)
in \y. ...g...
What arity does f have? Really it should have arity 2, but a naive
look at the RHS won't see that. You need a fixpoint analysis which
says it has arity "infinity" the first time round.
This makes things more robust to the way in which you write code. For
example, see Trac #4474 which is fixed by this change.
Not a huge difference, but worth while:
Program Size Allocs Runtime Elapsed
--------------------------------------------------------------------------------
Min -0.4% -2.2% -10.0% -10.0%
Max +2.7% +0.3% +7.1% +6.9%
Geometric Mean -0.3% -0.2% -2.1% -2.2%
I don't really believe the runtime numbers, because the machine was
busy, but the bottom line is that not much changes, and what does
change reliably (allocation and size) is in the right direction.
|