| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
| |
Signed-off-by: Austin Seipp <austin@well-typed.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Don't export `getUs` and `getUniqueUs`. `UniqSM` has a `MonadUnique` instance:
instance MonadUnique UniqSM where
getUniqueSupplyM = getUs
getUniqueM = getUniqueUs
getUniquesM = getUniquesUs
Commandline-fu used:
git grep -l 'getUs\>' |
grep -v compiler/basicTypes/UniqSupply.lhs |
xargs sed -i 's/getUs/getUniqueSupplyM/g
git grep -l 'getUniqueUs\>' |
grep -v combiler/basicTypes/UniqSupply.lhs |
xargs sed -i 's/getUniqueUs/getUniqueM/g'
Follow up on b522d3a3f970a043397a0d6556ca555648e7a9c3
Reviewed By: austin, hvr
Differential Revision: https://phabricator.haskell.org/D220
|
|
|
|
|
|
|
|
|
|
|
| |
This patch should make no change in behaviour.
* Make RhsInfo into a record
* Include ri_rhs_usg, which previously travelled around separately
* Introduce specRec, specNonRec, and
make them return [OneSpec] rather than SpecInfo
|
|
|
|
|
|
|
|
|
|
| |
This long-standing and egregious bug meant that call information was
being gratuitously copied, leading to an exponential blowup in the
number of calls to be examined when function definitions are deeply
nested. That is what has been causing the blowup in SpecConstr's
running time, not (as I had previously supposed) generating very large code.
See Note [spec_usg includes rhs_usg]
|
|
|
|
|
| |
This is just a small refactoring that makes the code a bit clearer,
using a data type instead of a triple. We get better pretty-printing too.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In some cases, the layout of the LANGUAGE/OPTIONS_GHC lines has been
reorganized, while following the convention, to
- place `{-# LANGUAGE #-}` pragmas at the top of the source file, before
any `{-# OPTIONS_GHC #-}`-lines.
- Moreover, if the list of language extensions fit into a single
`{-# LANGUAGE ... -#}`-line (shorter than 80 characters), keep it on one
line. Otherwise split into `{-# LANGUAGE ... -#}`-lines for each
individual language extension. In both cases, try to keep the
enumeration alphabetically ordered.
(The latter layout is preferable as it's more diff-friendly)
While at it, this also replaces obsolete `{-# OPTIONS ... #-}` pragma
occurences by `{-# OPTIONS_GHC ... #-}` pragmas.
|
| |
|
| |
|
|
|
|
| |
Signed-off-by: Austin Seipp <austin@well-typed.com>
|
|
|
|
|
|
| |
This occurs when doing bootstraps with the 7.7 compiler.
Signed-off-by: Austin Seipp <austin@well-typed.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Joachim and I are committing this onto a branch so that we can share it,
but we expect to do a bit more work before merging it onto head.
Nofib staus:
- Most programs, no change
- A few improve
- A couple get worse (cacheprof, tak, rfib)
Investigating the "get worse" set is what's holding up putting this
on head.
The major issue is this. Consider
map (f g) ys
where f's demand signature looks like
f :: <L,C1(C1(U))> -> <L,U> -> .
So 'f' is not saturated. What demand do we place on g?
Answer
C(C1(U))
That is, the inner C1 should stay, even though f is not saturated.
I found that this made a significant difference in the demand signatures
inferred in GHC.IO, which uses lots of higher-order exception handlers.
I also had to add used-once demand signatures for some of the
'catch' primops, so that we know their handlers are only called once.
|
|
|
|
|
| |
because it is not a top deman (see previous commit), and it is only used
in an argument to mkStrictSig.
|
|
|
|
|
|
|
|
|
|
| |
* Note new SPEC type in release notes.
* Document SPEC in the users guide under the documentation for
-fspec-constr.
* Clean up comments in SpecConstr regarding the forcing of
specialisation (see Note [Forcing specialisation].)
Signed-off-by: Austin Seipp <austin@well-typed.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
SpecConstr has for a while now looked for types with the built
in ForceSpecConstr annotation, in order to know where to be particularly
aggressive.
Unfortunately using an annotation has a number of downsides, the most
prominent two being:
A) ForceSpecConstr is vital for efficiency (even if it's
a hack), but it means users of it must have GHCI - even though
stage2 features are not required for anything but the annotation.
B) Any user who might need it (read: vector) has to duplicate the same
piece of code. In general there are few people actually doing this,
but it's unclear why they should have to.
This patch makes SpecConstr look for functions applied to the new
GHC.Types.SPEC type - a copy of the already-extant 'SPEC' type - as well
as look for annotations, in the stage2 compiler.
In particular, this means `vector` can now be built with a stage1
compiler, since it no longer depends on stage2 for anything else. This
is particularly important for e.g. iOS cross-compilers.
This also means we should be able to build `vector` earlier in the build
process too, but this patch doesn't address that.
This requires an accompanying bump in ghc-prim.
Signed-off-by: Austin Seipp <austin@well-typed.com>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Roles are a solution to the GeneralizedNewtypeDeriving type-safety
problem.
Roles were first described in the "Generative type abstraction" paper,
by Stephanie Weirich, Dimitrios Vytiniotis, Simon PJ, and Steve Zdancewic.
The implementation is a little different than that paper. For a quick
primer, check out Note [Roles] in Coercion. Also see
http://ghc.haskell.org/trac/ghc/wiki/Roles
and
http://ghc.haskell.org/trac/ghc/wiki/RolesImplementation
For a more formal treatment, check out docs/core-spec/core-spec.pdf.
This fixes Trac #1496, #4846, #7148.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When specialising a top-level recursive group, if none of the binders
are exported then we can start specialising based on the later calls to
the functions.
This is instead of creating specialisations based on the RHS of the
bindings.
The main benefit of this is that only specialisations that will actually
be used are created. This saves quite a bit of memory when compiling
stream-fusion and ForceSpecConstr sort of code.
Nofib has an average allocation and runtime of -0.7%, maximum 2%.
There are a few with significant decreases in allocation (10 - 20%)
but, interestingly, those ones seem to have similar runtimes.
One of these does have a significantly reduced total elapsed time
though: -38%.
On average the nofib compilation times are the same, but they do vary
with s.d. of -4 to 4%.
I think this is acceptable because of the fairly major code blowup fixes
this has for fusion-style code.
(In one example, a SpecConstr was previously producing 122,000 term size,
now only produces 28,000 with the same object code)
|
|
|
|
|
|
|
| |
This is a bad bug, if a rare one.
See Note [Work-free values only in environment].
Thanks to Amos Robinson for finding it.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
size_expr now ignores RealWorld lambdas, arguments, and applications.
Worker-wrapper previously removed all lambdas from a function, if they
were all unused. Removing *all* value lambdas is no longer
allowed. Instead (\_ -> E) will become (\_void -> E), where it used to
become E. The previous behavior can be recovered via the new
-ffun-to-thunk flag.
Nofib notables:
----------------------------------------------------------------
Program O2 O2 newly ignoring RealWorld
and not turning function
closures into thunks
----------------------------------------------------------------
Allocations
comp_lab_zift 333090392% -5.0%
reverse-complem 155188304% -3.2%
rewrite 15380888% +4.0%
boyer2 3901064% +7.5%
rewrite previously benefited from fortunate LoopBreaker choice that is
now disrupted.
A function in boyer2 goes from $wonewayunify1 size 700 to size 650,
thus gets inlined into rewritelemmas, thus exposing a parameter
scrutinisation, thus allowing SpecConstr, which unfortunately involves
reboxing.
Run Time
fannkuch-redux 7.89% -15.9%
hpg 0.25% +5.6%
wang 0.21% +5.8%
/shrug
|
|
|
|
|
| |
ForceSpecConstr will now only specialise recursive types a finite number of times.
There is a new option -fspec-constr-recursive, with a default value of 3.
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The main payload of this patch is to extend CPR so that it
detects when a function always returns a result constructed
with the *same* constructor, even if the constructor comes from
a sum type. This doesn't matter very often, but it does improve
some things (results below).
Binary sizes increase a little bit, I think because there are more
wrappers. This with -split-objs. Without split-ojbs binary sizes
increased by 6% even for HelloWorld.hs. It's hard to see exactly why,
but I think it was because System.Posix.Types.o got included in the
linked binary, whereas it didn't before.
Program Size Allocs Runtime Elapsed TotalMem
fluid +1.8% -0.3% 0.01 0.01 +0.0%
tak +2.2% -0.2% 0.02 0.02 +0.0%
ansi +1.7% -0.3% 0.00 0.00 +0.0%
cacheprof +1.6% -0.3% +0.6% +0.5% +1.4%
parstof +1.4% -4.4% 0.00 0.00 +0.0%
reptile +2.0% +0.3% 0.02 0.02 +0.0%
----------------------------------------------------------------------
Min +1.1% -4.4% -4.7% -4.7% -15.0%
Max +2.3% +0.3% +8.3% +9.4% +50.0%
Geometric Mean +1.9% -0.1% +0.6% +0.7% +0.3%
Other things in this commit
~~~~~~~~~~~~~~~~~~~~~~~~~~~
* Got rid of the Lattice class in Demand
* Refactored the way that products and newtypes are
decomposed (no change in functionality)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch is the result of Ilya Sergey's internship at MSR. It
constitutes a thorough overhaul and simplification of the demand
analyser. It makes a solid foundation on which we can now build.
Main changes are
* Instead of having one combined type for Demand, a Demand is
now a pair (JointDmd) of
- a StrDmd and
- an AbsDmd.
This allows strictness and absence to be though about quite
orthogonally, and greatly reduces brain melt-down.
* Similarly in the DmdResult type, it's a pair of
- a PureResult (indicating only divergence/non-divergence)
- a CPRResult (which deals only with the CPR property
* In IdInfo, the
strictnessInfo field contains a StrictSig, not a Maybe StrictSig
demandInfo field contains a Demand, not a Maybe Demand
We don't need Nothing (to indicate no strictness/demand info)
any more; topSig/topDmd will do.
* Remove "boxity" analysis entirely. This was an attempt to
avoid "reboxing", but it added complexity, is extremely
ad-hoc, and makes very little difference in practice.
* Remove the "unboxing strategy" computation. This was an an
attempt to ensure that a worker didn't get zillions of
arguments by unboxing big tuples. But in fact removing it
DRAMATICALLY reduces allocation in an inner loop of the
I/O library (where the threshold argument-count had been
set just too low). It's exceptional to have a zillion arguments
and I don't think it's worth the complexity, especially since
it turned out to have a serious performance hit.
* Remove quite a bit of ad-hoc cruft
* Move worthSplittingFun, worthSplittingThunk from WorkWrap to
Demand. This allows JointDmd to be fully abstract, examined
only inside Demand.
Everything else really follows from these changes.
All of this is really just refactoring, so we don't expect
big performance changes, but acutally the numbers look quite
good. Here is a full nofib run with some highlights identified:
Program Size Allocs Runtime Elapsed TotalMem
--------------------------------------------------------------------------------
expert -2.6% -15.5% 0.00 0.00 +0.0%
fluid -2.4% -7.1% 0.01 0.01 +0.0%
gg -2.5% -28.9% 0.02 0.02 -33.3%
integrate -2.6% +3.2% +2.6% +2.6% +0.0%
mandel2 -2.6% +4.2% 0.01 0.01 +0.0%
nucleic2 -2.0% -16.3% 0.11 0.11 +0.0%
para -2.6% -20.0% -11.8% -11.7% +0.0%
parser -2.5% -17.9% 0.05 0.05 +0.0%
prolog -2.6% -13.0% 0.00 0.00 +0.0%
puzzle -2.6% +2.2% +0.8% +0.8% +0.0%
sorting -2.6% -35.9% 0.00 0.00 +0.0%
treejoin -2.6% -52.2% -9.8% -9.9% +0.0%
--------------------------------------------------------------------------------
Min -2.7% -52.2% -11.8% -11.7% -33.3%
Max -1.8% +4.2% +10.5% +10.5% +7.7%
Geometric Mean -2.5% -2.8% -0.4% -0.5% -0.4%
Things to note
* Binary sizes are smaller. I don't know why, but it's good.
* Allocation is sometiemes a *lot* smaller. I believe that all the big numbers
(I checked treejoin, gg, sorting) arise from one place, namely a function
GHC.IO.Encoding.UTF8.utf8_decode, which is strict in two Buffers both of
which have several arugments. Not w/w'ing both arguments (which is what
we did before) has a big effect. So the big win in actually somewhat
accidental, gained by removing the "unboxing strategy" code.
* A couple of benchmarks allocate slightly more. This turns out
to be due to reboxing (integrate). But the biggest increase is
mandel2, and *that* turned out also to be a somewhat accidental
loss of CSE, and pointed the way to doing better CSE: see Trac
#7596.
* Runtimes are never very reliable, but seem to improve very slightly.
All in all, a good piece of work. Thank you Ilya!
|
|
|
|
|
| |
There's no need to have the uniq in the rule, but its presence can
cause spurious ABI changes.
|
| |
|
|
|
|
|
|
|
|
| |
I also removed the default values from the "Discounts and thresholds"
note: most of them were no longer up-to-date.
Along the way I added FloatSuffix to the argument parser, analogous to
IntSuffix.
|
|
|
|
|
|
|
| |
First, make Rules.match_co able to deal wit some modest coercions
Second, make SpecConstr use wild-card for coercion arguments
This is the rest of the fix for Trac #7165
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch allows, for the first time, case expressions with an empty
list of alternatives. Max suggested the idea, and Trac #6067 showed
that it is really quite important.
So I've implemented the idea, fixing #6067. Main changes
* See Note [Empty case alternatives] in CoreSyn
* Various foldr1's become foldrs
* IfaceCase does not record the type of the alternatives.
I added IfaceECase for empty-alternative cases.
* Core Lint does not complain about empty cases
* MkCore.castBottomExpr constructs an empty-alternative case
expression (case e of ty {})
* CoreToStg converts '(case e of {})' to just 'e'
|
|
|
|
|
|
|
| |
...and make sure it is, esp in the call to findAlt in
the mighty Simplifier. Failing to check this led to
searching a bunch of DataAlts for a LitAlt Integer.
Naughty. See Trac #5603 for a case in point.
|
| |
|
|
|
|
|
| |
We only use it for "compiler" sources, i.e. not for libraries.
Many modules have a -fno-warn-tabs kludge for now.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
User visible changes
====================
Profilng
--------
Flags renamed (the old ones are still accepted for now):
OLD NEW
--------- ------------
-auto-all -fprof-auto
-auto -fprof-exported
-caf-all -fprof-cafs
New flags:
-fprof-auto Annotates all bindings (not just top-level
ones) with SCCs
-fprof-top Annotates just top-level bindings with SCCs
-fprof-exported Annotates just exported bindings with SCCs
-fprof-no-count-entries Do not maintain entry counts when profiling
(can make profiled code go faster; useful with
heap profiling where entry counts are not used)
Cost-centre stacks have a new semantics, which should in most cases
result in more useful and intuitive profiles. If you find this not to
be the case, please let me know. This is the area where I have been
experimenting most, and the current solution is probably not the
final version, however it does address all the outstanding bugs and
seems to be better than GHC 7.2.
Stack traces
------------
+RTS -xc now gives more information. If the exception originates from
a CAF (as is common, because GHC tends to lift exceptions out to the
top-level), then the RTS walks up the stack and reports the stack in
the enclosing update frame(s).
Result: +RTS -xc is much more useful now - but you still have to
compile for profiling to get it. I've played around a little with
adding 'head []' to GHC itself, and +RTS -xc does pinpoint the problem
quite accurately.
I plan to add more facilities for stack tracing (e.g. in GHCi) in the
future.
Coverage (HPC)
--------------
* derived instances are now coloured yellow if they weren't used
* likewise record field names
* entry counts are more accurate (hpc --fun-entry-count)
* tab width is now correct (markup was previously off in source with
tabs)
Internal changes
================
In Core, the Note constructor has been replaced by
Tick (Tickish b) (Expr b)
which is used to represent all the kinds of source annotation we
support: profiling SCCs, HPC ticks, and GHCi breakpoints.
Depending on the properties of the Tickish, different transformations
apply to Tick. See CoreUtils.mkTick for details.
Tickets
=======
This commit closes the following tickets, test cases to follow:
- Close #2552: not a bug, but the behaviour is now more intuitive
(test is T2552)
- Close #680 (test is T680)
- Close #1531 (test is result001)
- Close #949 (test is T949)
- Close #2466: test case has bitrotted (doesn't compile against current
version of vector-space package)
|
|\
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Conflicts:
compiler/iface/BuildTyCl.lhs
compiler/iface/MkIface.lhs
compiler/iface/TcIface.lhs
compiler/typecheck/TcTyClsDecls.lhs
compiler/types/Class.lhs
compiler/utils/Util.lhs
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Basically as documented in http://hackage.haskell.org/trac/ghc/wiki/KindFact,
this patch adds a new kind Constraint such that:
Show :: * -> Constraint
(?x::Int) :: Constraint
(Int ~ a) :: Constraint
And you can write *any* type with kind Constraint to the left of (=>):
even if that type is a type synonym, type variable, indexed type or so on.
The following (somewhat related) changes are also made:
1. We now box equality evidence. This is required because we want
to give (Int ~ a) the *lifted* kind Constraint
2. For similar reasons, implicit parameters can now only be of
a lifted kind. (?x::Int#) => ty is now ruled out
3. Implicit parameter constraints are now allowed in superclasses
and instance contexts (this just falls out as OK with the new
constraint solver)
Internally the following major changes were made:
1. There is now no PredTy in the Type data type. Instead
GHC checks the kind of a type to figure out if it is a predicate
2. There is now no AClass TyThing: we represent classes as TyThings
just as a ATyCon (classes had TyCons anyway)
3. What used to be (~) is now pretty-printed as (~#). The box
constructor EqBox :: (a ~# b) -> (a ~ b)
4. The type LCoercion is used internally in the constraint solver
and type checker to represent coercions with free variables
of type (a ~ b) rather than (a ~# b)
|
|/ |
|
| |
|
|
|
|
|
|
|
|
|
|
| |
This was making Text.PrettyPrint.HughesPJ give a lint-bug
when the libraries were compiled with -O2.
It's all caused by phantom type synonyms (which are, generally
speaking, a royal pain). The fix is simple, but a bit brutal.
See Note [Free type variables of the qvar types].
|
|
|
|
|
|
| |
These turn out to be a useful special case of splitTyConApp_maybe.
A refactoring only; no change in behaviour
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is the offending message:
SpecConstr
Function `$wks2{v s2dJ} [lid]'
has one call pattern, but the limit is 0
Use -fspec-constr-count=n to set the bound
Use -dppr-debug to see specialisations
The message isn't very good, and is for experts only. So now it
comes out only
if you build with -DDEBUG
or you specify -dppr-debug at runtime
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
See the paper "Practical aspects of evidence based compilation in System FC"
* Coercion becomes a data type, distinct from Type
* Coercions become value-level things, rather than type-level things,
(although the value is zero bits wide, like the State token)
A consequence is that a coerion abstraction increases the arity by 1
(just like a dictionary abstraction)
* There is a new constructor in CoreExpr, namely Coercion, to inject
coercions into terms
|
|
|
|
|
|
|
| |
Well, more a plain bug really, which led to SpecConstr
missing some obvious opportunities for specialisation.
Thanks to Max Bolingbroke for spotting this.
|
|
|
|
|
|
|
|
|
|
|
| |
There was a terrible typo in this patch; I wrote "env"
instead of "env1".
Mon Jan 31 11:35:29 GMT 2011 simonpj@microsoft.com
* Improve Simplifier and SpecConstr behaviour
Anyway, this fix is essential to make it work properly.
Thanks to Max for spotting the problem (again).
|
|
|
|
|
|
|
|
| |
This was originally to improve the case when SpecConstr generated a
function with an unused argument (see Trac #4941), but I ended up
giving up on that. But the refactoring is still an improvement.
In particular I got rid of BothOcc, which was unused.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Trac #4908 identified a case where SpecConstr wasn't "seeing" a
specialisation it should easily get. The solution was simple: see
Note [Add scrutinee to ValueEnv too] in SpecConstr.
Then it turned out that there was an exactly analogous infelicity in
the mighty Simplifer too; see Note [Add unfolding for scrutinee] in
Simplify. This fix is good for Simplify even in the absence of the
SpecConstr change. (It arose when I moved the binder- swap stuff to
OccAnall, not realising that it *remains* valuable to record info
about the scrutinee of a case expression. The Note says why.
Together these two changes are unconditionally good. Better
simplification, better specialisation. Thank you Max.
|
|
|
|
|
| |
This makes sure that join points are fully specialised in loops which are
marked as ForceSpecConstr.
|
| |
|
| |
|
|
|
|
| |
scrutinised
|