diff options
author | Simon Peyton Jones <simonpj@microsoft.com> | 2013-01-17 10:54:07 +0000 |
---|---|---|
committer | Simon Peyton Jones <simonpj@microsoft.com> | 2013-01-17 10:54:07 +0000 |
commit | 0831a12ea2fc73c33652eeec1adc79fa19700578 (patch) | |
tree | 6382f3cd4cb7070d101e22d7de2876aa8cbbbc39 /compiler/prelude | |
parent | aef38d130b0ff74b0a5f2478be985e96b40c0f97 (diff) | |
download | haskell-0831a12ea2fc73c33652eeec1adc79fa19700578.tar.gz |
Major patch to implement the new Demand Analyser
This patch is the result of Ilya Sergey's internship at MSR. It
constitutes a thorough overhaul and simplification of the demand
analyser. It makes a solid foundation on which we can now build.
Main changes are
* Instead of having one combined type for Demand, a Demand is
now a pair (JointDmd) of
- a StrDmd and
- an AbsDmd.
This allows strictness and absence to be though about quite
orthogonally, and greatly reduces brain melt-down.
* Similarly in the DmdResult type, it's a pair of
- a PureResult (indicating only divergence/non-divergence)
- a CPRResult (which deals only with the CPR property
* In IdInfo, the
strictnessInfo field contains a StrictSig, not a Maybe StrictSig
demandInfo field contains a Demand, not a Maybe Demand
We don't need Nothing (to indicate no strictness/demand info)
any more; topSig/topDmd will do.
* Remove "boxity" analysis entirely. This was an attempt to
avoid "reboxing", but it added complexity, is extremely
ad-hoc, and makes very little difference in practice.
* Remove the "unboxing strategy" computation. This was an an
attempt to ensure that a worker didn't get zillions of
arguments by unboxing big tuples. But in fact removing it
DRAMATICALLY reduces allocation in an inner loop of the
I/O library (where the threshold argument-count had been
set just too low). It's exceptional to have a zillion arguments
and I don't think it's worth the complexity, especially since
it turned out to have a serious performance hit.
* Remove quite a bit of ad-hoc cruft
* Move worthSplittingFun, worthSplittingThunk from WorkWrap to
Demand. This allows JointDmd to be fully abstract, examined
only inside Demand.
Everything else really follows from these changes.
All of this is really just refactoring, so we don't expect
big performance changes, but acutally the numbers look quite
good. Here is a full nofib run with some highlights identified:
Program Size Allocs Runtime Elapsed TotalMem
--------------------------------------------------------------------------------
expert -2.6% -15.5% 0.00 0.00 +0.0%
fluid -2.4% -7.1% 0.01 0.01 +0.0%
gg -2.5% -28.9% 0.02 0.02 -33.3%
integrate -2.6% +3.2% +2.6% +2.6% +0.0%
mandel2 -2.6% +4.2% 0.01 0.01 +0.0%
nucleic2 -2.0% -16.3% 0.11 0.11 +0.0%
para -2.6% -20.0% -11.8% -11.7% +0.0%
parser -2.5% -17.9% 0.05 0.05 +0.0%
prolog -2.6% -13.0% 0.00 0.00 +0.0%
puzzle -2.6% +2.2% +0.8% +0.8% +0.0%
sorting -2.6% -35.9% 0.00 0.00 +0.0%
treejoin -2.6% -52.2% -9.8% -9.9% +0.0%
--------------------------------------------------------------------------------
Min -2.7% -52.2% -11.8% -11.7% -33.3%
Max -1.8% +4.2% +10.5% +10.5% +7.7%
Geometric Mean -2.5% -2.8% -0.4% -0.5% -0.4%
Things to note
* Binary sizes are smaller. I don't know why, but it's good.
* Allocation is sometiemes a *lot* smaller. I believe that all the big numbers
(I checked treejoin, gg, sorting) arise from one place, namely a function
GHC.IO.Encoding.UTF8.utf8_decode, which is strict in two Buffers both of
which have several arugments. Not w/w'ing both arguments (which is what
we did before) has a big effect. So the big win in actually somewhat
accidental, gained by removing the "unboxing strategy" code.
* A couple of benchmarks allocate slightly more. This turns out
to be due to reboxing (integrate). But the biggest increase is
mandel2, and *that* turned out also to be a somewhat accidental
loss of CSE, and pointed the way to doing better CSE: see Trac
#7596.
* Runtimes are never very reliable, but seem to improve very slightly.
All in all, a good piece of work. Thank you Ilya!
Diffstat (limited to 'compiler/prelude')
-rw-r--r-- | compiler/prelude/primops.txt.pp | 10 |
1 files changed, 5 insertions, 5 deletions
diff --git a/compiler/prelude/primops.txt.pp b/compiler/prelude/primops.txt.pp index 77236a1727..6d551d90e5 100644 --- a/compiler/prelude/primops.txt.pp +++ b/compiler/prelude/primops.txt.pp @@ -45,10 +45,9 @@ defaults can_fail = False -- See Note Note [PrimOp can_fail and has_side_effects] in PrimOp commutable = False code_size = { primOpCodeSizeDefault } - strictness = { \ arity -> mkStrictSig (mkTopDmdType (replicate arity lazyDmd) TopRes) } + strictness = { \ arity -> mkStrictSig (mkTopDmdType (replicate arity topDmd) topRes) } fixity = Nothing - -- Currently, documentation is produced using latex, so contents of -- description fields should be legal latex. Descriptions can contain -- matched pairs of embedded curly brackets. @@ -1530,7 +1529,7 @@ primop CatchOp "catch#" GenPrimOp primop RaiseOp "raise#" GenPrimOp a -> b with - strictness = { \ _arity -> mkStrictSig (mkTopDmdType [lazyDmd] BotRes) } + strictness = { \ _arity -> mkStrictSig (mkTopDmdType [topDmd] botRes) } -- NB: result is bottom out_of_line = True @@ -1547,7 +1546,7 @@ primop RaiseOp "raise#" GenPrimOp primop RaiseIOOp "raiseIO#" GenPrimOp a -> State# RealWorld -> (# State# RealWorld, b #) with - strictness = { \ _arity -> mkStrictSig (mkTopDmdType [lazyDmd,lazyDmd] BotRes) } + strictness = { \ _arity -> mkStrictSig (mkTopDmdType [topDmd, topDmd] botRes) } out_of_line = True has_side_effects = True @@ -2028,7 +2027,8 @@ section "Tag to enum stuff" primop DataToTagOp "dataToTag#" GenPrimOp a -> Int# with - strictness = { \ _arity -> mkStrictSig (mkTopDmdType [seqDmd] TopRes) } + strictness = { \ _arity -> mkStrictSig (mkTopDmdType [evalDmd] topRes) } + -- dataToTag# must have an evaluated argument primop TagToEnumOp "tagToEnum#" GenPrimOp |