| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
| |
The 'id' type is now determined by the pass, using the XTickishId
type family.
|
|
|
|
|
|
|
|
| |
GHCi needs to know the types of all breakpoints, but it's
not possible to get the exprType of any expression in STG.
This is preparation for the upcoming change to make GHCi
bytecode from STG instead of Core.
|
| |
|
|
|
|
|
|
|
|
| |
When we use `withTiming` we need to force the results of each timed pass
to better represent the time spent in each phase. This patch forces
some results that weren't before.
It also retrieve timings for the CoreToStg and WriteIface passes.
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Before this patch, the only way to override GHC's default logging
behavior was to set `log_action`, `dump_action` and `trace_action`
fields in DynFlags. This patch introduces a new Logger abstraction and
stores it in HscEnv instead.
This is part of #17957 (avoid storing state in DynFlags). DynFlags are
duplicated and updated per-module (because of OPTIONS_GHC pragma), so
we shouldn't store global state in them.
This patch also fixes a race in parallel "--make" mode which updated
the `generatedDumps` IORef concurrently.
Bump haddock submodule
The increase in MultilayerModules is tracked in #19293.
Metric Increase:
MultiLayerModules
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
From `Note [Speculative evaluation]`:
Since call-by-value is much cheaper than call-by-need, we case-bind
arguments that are either
1. Strictly evaluated anyway, according to the StrictSig of the
callee, or
2. ok-for-spec, according to 'exprOkForSpeculation'
While (1) is a no-brainer and always beneficial, (2) is a bit
more subtle, as the careful haddock for 'exprOkForSpeculation'
points out. Still, by case-binding the argument we don't need
to allocate a thunk for it, whose closure must be retained as
long as the callee might evaluate it. And if it is evaluated on
most code paths anyway, we get to turn the unknown eval in the
callee into a known call at the call site.
NoFib Results:
```
--------------------------------------------------------------------------------
Program Allocs Instrs
--------------------------------------------------------------------------------
ansi -9.4% -10.4%
maillist -0.1% -0.1%
paraffins -0.7% -0.5%
scc -0.0% +0.1%
treejoin -0.0% -0.1%
--------------------------------------------------------------------------------
Min -9.4% -10.4%
Max 0.0% +0.1%
Geometric Mean -0.1% -0.1%
```
Fixes #19224.
|
|
|
|
|
|
| |
Alt, AnnAlt and IfaceAlt were using triples. This patch makes them use
dedicated types so that we can try to make some fields strict (for
example) in the future.
|
|
|
|
|
|
|
|
|
|
| |
This patch makes the desugarer rewrite
noinline (f d) --> noinline f d
This makes 'noinline' much more reliable: see #18995
It's explained in the improved Note [noinlineId magic]
in GHC.Types.Id.Make
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As outlined in #18903, interleaving usage and strictness demands not
only means a more compact demand representation, but also allows us to
express demands that we weren't easily able to express before.
Call demands are *relative* in the sense that a call demand `Cn(cd)`
on `g` says "`g` is called `n` times. *Whenever `g` is called*, the
result is used according to `cd`". Example from #18903:
```hs
h :: Int -> Int
h m =
let g :: Int -> (Int,Int)
g 1 = (m, 0)
g n = (2 * n, 2 `div` n)
{-# NOINLINE g #-}
in case m of
1 -> 0
2 -> snd (g m)
_ -> uncurry (+) (g m)
```
Without the interleaved representation, we would just get `L` for the
strictness demand on `g`. Now we are able to express that whenever
`g` is called, its second component is used strictly in denoting `g`
by `1C1(P(1P(U),SP(U)))`. This would allow Nested CPR to unbox the
division, for example.
Fixes #18903.
While fixing regressions, I also discovered and fixed #18957.
Metric Decrease:
T13253-spj
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I was working on making DynFlags stateless (#17957), especially by
storing loaded plugins into HscEnv instead of DynFlags. It turned out to
be complicated because HscEnv is in GHC.Driver.Types but LoadedPlugin
isn't: it is in GHC.Driver.Plugins which depends on GHC.Driver.Types. I
didn't feel like introducing yet another hs-boot file to break the loop.
Additionally I remember that while we introduced the module hierarchy
(#13009) we talked about splitting GHC.Driver.Types because it contained
various unrelated types and functions, but we never executed. I didn't
feel like making GHC.Driver.Types bigger with more unrelated Plugins
related types, so finally I bit the bullet and split GHC.Driver.Types.
As a consequence this patch moves a lot of things. I've tried to put
them into appropriate modules but nothing is set in stone.
Several other things moved to avoid loops.
* Removed Binary instances from GHC.Utils.Binary for random compiler
things
* Moved Typeable Binary instances into GHC.Utils.Binary.Typeable: they
import a lot of things that users of GHC.Utils.Binary don't want to
depend on.
* put everything related to Units/Modules under GHC.Unit:
GHC.Unit.Finder, GHC.Unit.Module.{ModGuts,ModIface,Deps,etc.}
* Created several modules under GHC.Types: GHC.Types.Fixity, SourceText,
etc.
* Split GHC.Utils.Error (into GHC.Types.Error)
* Finally removed GHC.Driver.Types
Note that this patch doesn't put loaded plugins into HscEnv. It's left
for another patch.
Bump haddock submodule
|
| |
|
|
|
|
|
| |
Instead of recreating the HomeUnit from the DynFlags every time we need
it, we store it in the HscEnv.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously we had a very aggressive Core Lint check which caught
unsaturated applications of runRW#. However, there is nothing
wrong with such applications and they may naturally arise in desugared
Core. For instance, the desugared Core of Data.Primitive.Array.runArray#
from the `primitive` package contains:
case ($) (runRW# @_ @_) (\s -> ...) of ...
In this case it's almost certain that ($) will be inlined, turning the
application into a saturated application. However, even if this weren't
the case there isn't a problem: CorePrep (after deleting an unnecessary
case) can simply generate code in its usual way, resulting in a call to
the Haskell definition of runRW#.
Fixes #18291.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Since Backpack the "home unit" is much more involved than what it was
before (just an identifier obtained with `-this-unit-id`). Now it is
used in conjunction with `-component-id` and `-instantiated-with` to
configure module instantiations and to detect if we are type-checking an
indefinite unit or compiling a definite one.
This patch introduces a new HomeUnit datatype which is much easier to
understand. Moreover to make GHC support several packages in the same
instances, we will need to handle several HomeUnits so having a
dedicated (documented) type is helpful.
Finally in #14335 we will also need to handle the case where we have no
HomeUnit at all because we are only loading existing interfaces for
plugins which live in a different space compared to units used to
produce target code. Several functions will have to be refactored to
accept "Maybe HomeUnit" parameters instead of implicitly querying the
HomeUnit fields in DynFlags. Having a dedicated type will make this
easier.
Bump haddock submodule
|
|
|
|
|
|
|
|
|
| |
- put panic related functions into GHC.Utils.Panic
- put trace related functions using DynFlags in GHC.Driver.Ppr
One step closer making Outputable fully independent of DynFlags.
Bump haddock submodule
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Thanks to ghc-bignum, the compiler can be simplified:
* Types and constructors of Integer and Natural can be wired-in. It
means that we don't have to query them from interfaces. It also means
that numeric literals don't have to carry their type with them.
* The same code is used whatever ghc-bignum backend is enabled. In
particular, conversion of bignum literals into final Core expressions
is now much more straightforward. Bignum closure inspection too.
* GHC itself doesn't depend on any integer-* package anymore
* The `integerLibrary` setting is gone.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This implements several general performance improvements to GHC,
to offset the effect of the linear types change.
General optimisations:
- Add a `coreFullView` function which iterates `coreView` on the
head. This avoids making function recursive solely because the
iterate `coreView` themselves. As a consequence, this functions can
be inlined, and trigger case-of-known constructor (_e.g._
`kindRep_maybe`, `isLiftedRuntimeRep`, `isMultiplicityTy`,
`getTyVar_maybe`, `splitAppTy_maybe`, `splitFunType_maybe`,
`tyConAppTyCon_maybe`). The common pattern about all these functions
is that they are almost always used as views, and immediately
consumed by a case expression. This commit also mark them asx `INLINE`.
- In `subst_ty` add a special case for nullary `TyConApp`, which avoid
allocations altogether.
- Use `mkTyConApp` in `subst_ty` for the general `TyConApp`. This
required quite a bit of module shuffling.
case. `myTyConApp` enforces crucial sharing, which was lost during
substitution. See also !2952 .
- Make `subst_ty` stricter.
- In `eqType` (specifically, in `nonDetCmpType`), add a special case,
tested first, for the very common case of nullary `TyConApp`.
`nonDetCmpType` has been made `INLINE` otherwise it is actually a
regression. This is similar to the optimisations in !2952.
Linear-type specific optimisations:
- Use `tyConAppTyCon_maybe` instead of the more complex `eqType` in
the definition of the pattern synonyms `One` and `Many`.
- Break the `hs-boot` cycles between `Multiplicity.hs` and `Type.hs`:
`Multiplicity` now import `Type` normally, rather than from the
`hs-boot`. This way `tyConAppTyCon_maybe` can inline properly in the
`One` and `Many` pattern synonyms.
- Make `updateIdTypeAndMult` strict in its type and multiplicity
- The `scaleIdBy` gets a specialised definition rather than being an
alias to `scaleVarBy`
- `splitFunTy_maybe` is given the type `Type -> Maybe (Mult, Type,
Type)` instead of `Type -> Maybe (Scaled Type, Type)`
- Remove the `MultMul` pattern synonym in favour of a view `isMultMul`
because pattern synonyms appear not to inline well.
- in `eqType`, in a `FunTy`, compare multiplicities last: they are
almost always both `Many`, so it helps failing faster.
- Cache `manyDataConTy` in `mkTyConApp`, to make sure that all the
instances of `TyConApp ManyDataConTy []` are physically the same.
This commit has been authored by
* Richard Eisenberg
* Krzysztof Gogolewski
* Arnaud Spiwack
Metric Decrease:
haddock.base
T12227
T12545
T12990
T1969
T3064
T5030
T9872b
Metric Increase:
haddock.base
haddock.Cabal
haddock.compiler
T12150
T12234
T12425
T12707
T13035
T13056
T15164
T16190
T18304
T1969
T3064
T3294
T5631
T5642
T5837
T6048
T9020
T9233
T9675
T9872a
T9961
WWRec
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is the first step towards implementation of the linear types proposal
(https://github.com/ghc-proposals/ghc-proposals/pull/111).
It features
* A language extension -XLinearTypes
* Syntax for linear functions in the surface language
* Linearity checking in Core Lint, enabled with -dlinear-core-lint
* Core-to-core passes are mostly compatible with linearity
* Fields in a data type can be linear or unrestricted; linear fields
have multiplicity-polymorphic constructors.
If -XLinearTypes is disabled, the GADT syntax defaults to linear fields
The following items are not yet supported:
* a # m -> b syntax (only prefix FUN is supported for now)
* Full multiplicity inference (multiplicities are really only checked)
* Decent linearity error messages
* Linear let, where, and case expressions in the surface language
(each of these currently introduce the unrestricted variant)
* Multiplicity-parametric fields
* Syntax for annotating lambda-bound or let-bound with a multiplicity
* Syntax for non-linear/multiple-field-multiplicity records
* Linear projections for records with a single linear field
* Linear pattern synonyms
* Multiplicity coercions (test LinearPolyType)
A high-level description can be found at
https://ghc.haskell.org/trac/ghc/wiki/LinearTypes/Implementation
Following the link above you will find a description of the changes made to Core.
This commit has been authored by
* Richard Eisenberg
* Krzysztof Gogolewski
* Matthew Pickering
* Arnaud Spiwack
With contributions from:
* Mark Barbone
* Alexander Vershilov
Updates haddock submodule.
|
|
|
|
|
|
|
|
| |
* use UnitId instead of String to identify wired-in units
* use UnitId instead of Unit in the backend (Unit are only use by
Backpack to produce type-checked interfaces, not real code)
* rename lookup functions for consistency
* documentation
|
|
|
|
|
| |
* rename thisPackage into homeUnit
* document and refactor several Backpack things
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The cast worker/wrapper transformation transforms
x = e |> co
into
y = e
x = y |> co
This is done by the simplifier, but we were being
careless about transferring IdInfo from x to y,
and about what to do if x is a NOINLNE function.
This resulted in a series of bugs:
#17673, #18093, #18078.
This patch fixes all that:
* Main change is in GHC.Core.Opt.Simplify, and
the new prepareBinding function, which does this
cast worker/wrapper transform.
See Note [Cast worker/wrappers].
* There is quite a bit of refactoring around
prepareRhs, makeTrivial etc. It's nicer now.
* Some wrappers from strictness and cast w/w, notably those for
a function with a NOINLINE, should inline very late. There
wasn't really a mechanism for that, which was an existing bug
really; so I invented a new finalPhase = Phase (-1). It's used
for all simplifier runs after the user-visible phase 2,1,0 have
run. (No new runs of the simplifier are introduced thereby.)
See new Note [Compiler phases] in GHC.Types.Basic;
the main changes are in GHC.Core.Opt.Driver
* Doing this made me trip over two places where the AnonArgFlag on a
FunTy was being lost so we could end up with (Num a -> ty)
rather than (Num a => ty)
- In coercionLKind/coercionRKind
- In contHoleType in the Simplifier
I fixed the former by defining mkFunctionType and using it in
coercionLKind/RKind.
I could have done the same for the latter, but the information
is almost to hand. So I fixed the latter by
- adding sc_hole_ty to ApplyToVal (like ApplyToTy),
- adding as_hole_ty to ValArg (like TyArg)
- adding sc_fun_ty to StrictArg
Turned out I could then remove ai_type from ArgInfo. This is
just moving the deck chairs around, but it worked out nicely.
See the new Note [AnonArgFlag] in GHC.Types.Var
* When looking at the 'arity decrease' thing (#18093) I discovered
that stable unfoldings had a much lower arity than the actual
optimised function. That's what led to the arity-decrease
message. Simple solution: eta-expand.
It's described in Note [Eta-expand stable unfoldings]
in GHC.Core.Opt.Simplify
* I also discovered that unsafeCoerce wasn't being inlined if
the context was boring. So (\x. f (unsafeCoerce x)) would
create a thunk -- yikes! I fixed that by making inlineBoringOK
a bit cleverer: see Note [Inline unsafeCoerce] in GHC.Core.Unfold.
I also found that unsafeCoerceName was unused, so I removed it.
I made a test case for #18078, and a very similar one for #17673.
The net effect of all this on nofib is very modest, but positive:
--------------------------------------------------------------------------------
Program Size Allocs Runtime Elapsed TotalMem
--------------------------------------------------------------------------------
anna -0.4% -0.1% -3.1% -3.1% 0.0%
fannkuch-redux -0.4% -0.3% -0.1% -0.1% 0.0%
maillist -0.4% -0.1% -7.8% -1.0% -14.3%
primetest -0.4% -15.6% -7.1% -6.6% 0.0%
--------------------------------------------------------------------------------
Min -0.9% -15.6% -13.3% -14.2% -14.3%
Max -0.3% 0.0% +12.1% +12.4% 0.0%
Geometric Mean -0.4% -0.2% -2.3% -2.2% -0.1%
All following metric decreases are compile-time allocation decreases
between -1% and -3%:
Metric Decrease:
T5631
T13701
T14697
T15164
|
|
|
|
|
|
|
|
|
|
|
| |
Now since we no longer try to predict CAFfyness we have no need for the
solution to #16846. Eta expanding unsaturated primop applications is
conceptually simpler, especially in the presence of levity polymorphism.
This essentially reverts cac8dc9f51e31e4c0a6cd9bc302f7e1bc7c03beb,
as suggested in #18079.
Closes #18079.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Because runRW# inlines so late, we were previously able to do very
little simplification across it. For instance, given even a simple
program like
case runRW# (\s -> let n = I# 42# in n) of
I# n# -> f n#
we previously had no way to avoid the allocation of the I#.
This patch allows the simplifier to push strict contexts into the
continuation of a runRW# application, as explained in
in Note [Simplification of runRW#] in GHC.CoreToStg.Prep.
Fixes #15127.
Metric Increase:
T9961
Metric Decrease:
ManyConstructors
Co-Authored-By: Simon Peyton-Jone <simonpj@microsoft.com>
|
| |
|
| |
|
|
|
|
|
|
|
| |
Introduce GHC.Unit.* hierarchy for everything concerning units, packages
and modules.
Update Haddock submodule
|
|
|
|
|
|
|
| |
Update Haddock submodule
Metric Increase:
haddock.compiler
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* SysTools
* Parser
* GHC.Builtin
* GHC.Iface.Recomp
* Settings
Update Haddock submodule
Metric Decrease:
Naperian
parsing001
|
|
|
|
|
|
|
|
|
|
|
| |
* GHC.Core.Op => GHC.Core.Opt
* GHC.Core.Opt.Simplify.Driver => GHC.Core.Opt.Driver
* GHC.Core.Opt.Tidy => GHC.Core.Tidy
* GHC.Core.Opt.WorkWrap.Lib => GHC.Core.Opt.WorkWrap.Utils
As discussed in:
* https://mail.haskell.org/pipermail/ghc-devs/2020-April/018758.html
* https://gitlab.haskell.org/ghc/ghc/issues/13009#note_264650
|
|
|
|
| |
Update Haddock submodule
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The binder-swap transformation is implemented by the occurrence
analyser -- see Note [Binder swap] in OccurAnal. However it had
a very nasty corner in it, for the case where the case scrutinee
was a GlobalId. This led to trouble and hacks, and ultimately
to #16296.
This patch re-engineers how the occurrence analyser implements
the binder-swap, by actually carrying out a substitution rather
than by adding a let-binding. It's all described in
Note [The binder-swap substitution].
I did a few other things along the way
* Fix a bug in StgCse, which could allow a loop breaker to be CSE'd
away. See Note [Care with loop breakers] in StgCse. I think it can
only show up if occurrence analyser sets up bad loop breakers, but
still.
* Better commenting in SimplUtils.prepareAlts
* A little refactoring in CoreUnfold; nothing significant
e.g. rename CoreUnfold.mkTopUnfolding to mkFinalUnfolding
* Renamed CoreSyn.isFragileUnfolding to hasCoreUnfolding
* Move mkRuleInfo to CoreFVs
We observed respectively 4.6% and 5.9% allocation decreases for the following
tests:
Metric Decrease:
T9961
haddock.base
|
|
|
|
|
|
|
| |
Update Haddock submodule
Metric Increase:
haddock.compiler
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Use Platform instead of DynFlags when possible:
* `tARGET_MIN_INT` et al. replaced with `platformMinInt` et al.
* no more DynFlags in PreRules: added a new `RuleOpts` datatype
* don't use `wORD_SIZE` in the compiler
* make `wordAlignment` use `Platform`
* make `dOUBLE_SIZE` a constant
Metric Decrease:
T13035
T1969
|
| |
|
|
|
|
| |
Update submodule: haddock
|
|
|
|
|
|
|
|
| |
Should make `member` queries faster and avoid messing up with missing
`nubSort`.
Metric Increase:
hie002
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When GHC is parsing a file generated by a tool, e.g. by the C preprocessor, the
tool may insert #line pragmas to adjust the locations reported to the user.
As the result, the locations recorded in RealSrcLoc are not monotonic. Elements
that appear later in the StringBuffer are not guaranteed to have a higher
line/column number.
In fact, there are no guarantees whatsoever, as #line pragmas can arbitrarily
modify locations. This lack of guarantees makes ideas such as #17544
infeasible.
This patch adds an additional bit of information to every SrcLoc:
newtype BufPos = BufPos { bufPos :: Int }
A BufPos represents the location in the StringBuffer, unaffected by any
pragmas.
Updates haddock submodule.
Metric Increase:
haddock.Cabal
haddock.base
haddock.compiler
MultiLayerModules
Naperian
parsing001
T12150
|
|
|
|
| |
Update haddock submodule
|
|
|
|
| |
submodule updates: nofib, haddock
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
(Commit message written by Omer, most of the code is written by Simon
and Richard)
See Note [Implementing unsafeCoerce] for how unsafe equality proofs and
the new unsafeCoerce# are implemented.
New notes added:
- [Checking for levity polymorphism] in CoreLint.hs
- [Implementing unsafeCoerce] in base/Unsafe/Coerce.hs
- [Patching magic definitions] in Desugar.hs
- [Wiring in unsafeCoerce#] in Desugar.hs
Only breaking change in this patch is unsafeCoerce# is not exported from
GHC.Exts, instead of GHC.Prim.
Fixes #17443
Fixes #16893
NoFib
-----
--------------------------------------------------------------------------------
Program Size Allocs Instrs Reads Writes
--------------------------------------------------------------------------------
CS -0.1% 0.0% -0.0% -0.0% -0.0%
CSD -0.1% 0.0% -0.0% -0.0% -0.0%
FS -0.1% 0.0% -0.0% -0.0% -0.0%
S -0.1% 0.0% -0.0% -0.0% -0.0%
VS -0.1% 0.0% -0.0% -0.0% -0.0%
VSD -0.1% 0.0% -0.0% -0.0% -0.1%
VSM -0.1% 0.0% -0.0% -0.0% -0.0%
anna -0.0% 0.0% -0.0% -0.0% -0.0%
ansi -0.1% 0.0% -0.0% -0.0% -0.0%
atom -0.1% 0.0% -0.0% -0.0% -0.0%
awards -0.1% 0.0% -0.0% -0.0% -0.0%
banner -0.1% 0.0% -0.0% -0.0% -0.0%
bernouilli -0.1% 0.0% -0.0% -0.0% -0.0%
binary-trees -0.1% 0.0% -0.0% -0.0% -0.0%
boyer -0.1% 0.0% -0.0% -0.0% -0.0%
boyer2 -0.1% 0.0% -0.0% -0.0% -0.0%
bspt -0.1% 0.0% -0.0% -0.0% -0.0%
cacheprof -0.1% 0.0% -0.0% -0.0% -0.0%
calendar -0.1% 0.0% -0.0% -0.0% -0.0%
cichelli -0.1% 0.0% -0.0% -0.0% -0.0%
circsim -0.1% 0.0% -0.0% -0.0% -0.0%
clausify -0.1% 0.0% -0.0% -0.0% -0.0%
comp_lab_zift -0.1% 0.0% -0.0% -0.0% -0.0%
compress -0.1% 0.0% -0.0% -0.0% -0.0%
compress2 -0.1% 0.0% -0.0% -0.0% -0.0%
constraints -0.1% 0.0% -0.0% -0.0% -0.0%
cryptarithm1 -0.1% 0.0% -0.0% -0.0% -0.0%
cryptarithm2 -0.1% 0.0% -0.0% -0.0% -0.0%
cse -0.1% 0.0% -0.0% -0.0% -0.0%
digits-of-e1 -0.1% 0.0% -0.0% -0.0% -0.0%
digits-of-e2 -0.1% 0.0% -0.0% -0.0% -0.0%
dom-lt -0.1% 0.0% -0.0% -0.0% -0.0%
eliza -0.1% 0.0% -0.0% -0.0% -0.0%
event -0.1% 0.0% -0.0% -0.0% -0.0%
exact-reals -0.1% 0.0% -0.0% -0.0% -0.0%
exp3_8 -0.1% 0.0% -0.0% -0.0% -0.0%
expert -0.1% 0.0% -0.0% -0.0% -0.0%
fannkuch-redux -0.1% 0.0% -0.0% -0.0% -0.0%
fasta -0.1% 0.0% -0.5% -0.3% -0.4%
fem -0.1% 0.0% -0.0% -0.0% -0.0%
fft -0.1% 0.0% -0.0% -0.0% -0.0%
fft2 -0.1% 0.0% -0.0% -0.0% -0.0%
fibheaps -0.1% 0.0% -0.0% -0.0% -0.0%
fish -0.1% 0.0% -0.0% -0.0% -0.0%
fluid -0.1% 0.0% -0.0% -0.0% -0.0%
fulsom -0.1% 0.0% +0.0% +0.0% +0.0%
gamteb -0.1% 0.0% -0.0% -0.0% -0.0%
gcd -0.1% 0.0% -0.0% -0.0% -0.0%
gen_regexps -0.1% 0.0% -0.0% -0.0% -0.0%
genfft -0.1% 0.0% -0.0% -0.0% -0.0%
gg -0.1% 0.0% -0.0% -0.0% -0.0%
grep -0.1% 0.0% -0.0% -0.0% -0.0%
hidden -0.1% 0.0% -0.0% -0.0% -0.0%
hpg -0.1% 0.0% -0.0% -0.0% -0.0%
ida -0.1% 0.0% -0.0% -0.0% -0.0%
infer -0.1% 0.0% -0.0% -0.0% -0.0%
integer -0.1% 0.0% -0.0% -0.0% -0.0%
integrate -0.1% 0.0% -0.0% -0.0% -0.0%
k-nucleotide -0.1% 0.0% -0.0% -0.0% -0.0%
kahan -0.1% 0.0% -0.0% -0.0% -0.0%
knights -0.1% 0.0% -0.0% -0.0% -0.0%
lambda -0.1% 0.0% -0.0% -0.0% -0.0%
last-piece -0.1% 0.0% -0.0% -0.0% -0.0%
lcss -0.1% 0.0% -0.0% -0.0% -0.0%
life -0.1% 0.0% -0.0% -0.0% -0.0%
lift -0.1% 0.0% -0.0% -0.0% -0.0%
linear -0.1% 0.0% -0.0% -0.0% -0.0%
listcompr -0.1% 0.0% -0.0% -0.0% -0.0%
listcopy -0.1% 0.0% -0.0% -0.0% -0.0%
maillist -0.1% 0.0% -0.0% -0.0% -0.0%
mandel -0.1% 0.0% -0.0% -0.0% -0.0%
mandel2 -0.1% 0.0% -0.0% -0.0% -0.0%
mate -0.1% 0.0% -0.0% -0.0% -0.0%
minimax -0.1% 0.0% -0.0% -0.0% -0.0%
mkhprog -0.1% 0.0% -0.0% -0.0% -0.0%
multiplier -0.1% 0.0% -0.0% -0.0% -0.0%
n-body -0.1% 0.0% -0.0% -0.0% -0.0%
nucleic2 -0.1% 0.0% -0.0% -0.0% -0.0%
para -0.1% 0.0% -0.0% -0.0% -0.0%
paraffins -0.1% 0.0% -0.0% -0.0% -0.0%
parser -0.1% 0.0% -0.0% -0.0% -0.0%
parstof -0.1% 0.0% -0.0% -0.0% -0.0%
pic -0.1% 0.0% -0.0% -0.0% -0.0%
pidigits -0.1% 0.0% -0.0% -0.0% -0.0%
power -0.1% 0.0% -0.0% -0.0% -0.0%
pretty -0.1% 0.0% -0.1% -0.1% -0.1%
primes -0.1% 0.0% -0.0% -0.0% -0.0%
primetest -0.1% 0.0% -0.0% -0.0% -0.0%
prolog -0.1% 0.0% -0.0% -0.0% -0.0%
puzzle -0.1% 0.0% -0.0% -0.0% -0.0%
queens -0.1% 0.0% -0.0% -0.0% -0.0%
reptile -0.1% 0.0% -0.0% -0.0% -0.0%
reverse-complem -0.1% 0.0% -0.0% -0.0% -0.0%
rewrite -0.1% 0.0% -0.0% -0.0% -0.0%
rfib -0.1% 0.0% -0.0% -0.0% -0.0%
rsa -0.1% 0.0% -0.0% -0.0% -0.0%
scc -0.1% 0.0% -0.1% -0.1% -0.1%
sched -0.1% 0.0% -0.0% -0.0% -0.0%
scs -0.1% 0.0% -0.0% -0.0% -0.0%
simple -0.1% 0.0% -0.0% -0.0% -0.0%
solid -0.1% 0.0% -0.0% -0.0% -0.0%
sorting -0.1% 0.0% -0.0% -0.0% -0.0%
spectral-norm -0.1% 0.0% -0.0% -0.0% -0.0%
sphere -0.1% 0.0% -0.0% -0.0% -0.0%
symalg -0.1% 0.0% -0.0% -0.0% -0.0%
tak -0.1% 0.0% -0.0% -0.0% -0.0%
transform -0.1% 0.0% -0.0% -0.0% -0.0%
treejoin -0.1% 0.0% -0.0% -0.0% -0.0%
typecheck -0.1% 0.0% -0.0% -0.0% -0.0%
veritas -0.0% 0.0% -0.0% -0.0% -0.0%
wang -0.1% 0.0% -0.0% -0.0% -0.0%
wave4main -0.1% 0.0% -0.0% -0.0% -0.0%
wheel-sieve1 -0.1% 0.0% -0.0% -0.0% -0.0%
wheel-sieve2 -0.1% 0.0% -0.0% -0.0% -0.0%
x2n1 -0.1% 0.0% -0.0% -0.0% -0.0%
--------------------------------------------------------------------------------
Min -0.1% 0.0% -0.5% -0.3% -0.4%
Max -0.0% 0.0% +0.0% +0.0% +0.0%
Geometric Mean -0.1% -0.0% -0.0% -0.0% -0.0%
Test changes
------------
- break006 is marked as broken, see #17833
- The compiler allocates less when building T14683 (an unsafeCoerce#-
heavy happy-generated code) on 64-platforms. Allocates more on 32-bit
platforms.
- Rest of the increases are tiny amounts (still enough to pass the
threshold) in micro-benchmarks. I briefly looked at each one in a
profiling build: most of the increased allocations seem to be because
of random changes in the generated code.
Metric Decrease:
T14683
Metric Increase:
T12150
T12234
T12425
T13035
T14683
T5837
T6048
Co-Authored-By: Richard Eisenberg <rae@cs.brynmawr.edu>
Co-Authored-By: Ömer Sinan Ağacan <omeragacan@gmail.com>
|
|
|
|
| |
Update haddock submodule
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch removes all CafInfo predictions and various hacks to preserve
predicted CafInfos from the compiler and assigns final CafInfos to
interface Ids after code generation. SRT analysis is extended to support
static data, and Cmm generator is modified to allow generating
static_link fields after SRT analysis.
This also fixes `-fcatch-bottoms`, which introduces error calls in case
expressions in CorePrep, which runs *after* CoreTidy (which is where we
decide on CafInfos) and turns previously non-CAFFY things into CAFFY.
Fixes #17648
Fixes #9718
Evaluation
==========
NoFib
-----
Boot with: `make boot mode=fast`
Run: `make mode=fast EXTRA_RUNTEST_OPTS="-cachegrind" NoFibRuns=1`
--------------------------------------------------------------------------------
Program Size Allocs Instrs Reads Writes
--------------------------------------------------------------------------------
CS -0.0% 0.0% -0.0% -0.0% -0.0%
CSD -0.0% 0.0% -0.0% -0.0% -0.0%
FS -0.0% 0.0% -0.0% -0.0% -0.0%
S -0.0% 0.0% -0.0% -0.0% -0.0%
VS -0.0% 0.0% -0.0% -0.0% -0.0%
VSD -0.0% 0.0% -0.0% -0.0% -0.5%
VSM -0.0% 0.0% -0.0% -0.0% -0.0%
anna -0.1% 0.0% -0.0% -0.0% -0.0%
ansi -0.0% 0.0% -0.0% -0.0% -0.0%
atom -0.0% 0.0% -0.0% -0.0% -0.0%
awards -0.0% 0.0% -0.0% -0.0% -0.0%
banner -0.0% 0.0% -0.0% -0.0% -0.0%
bernouilli -0.0% 0.0% -0.0% -0.0% -0.0%
binary-trees -0.0% 0.0% -0.0% -0.0% -0.0%
boyer -0.0% 0.0% -0.0% -0.0% -0.0%
boyer2 -0.0% 0.0% -0.0% -0.0% -0.0%
bspt -0.0% 0.0% -0.0% -0.0% -0.0%
cacheprof -0.0% 0.0% -0.0% -0.0% -0.0%
calendar -0.0% 0.0% -0.0% -0.0% -0.0%
cichelli -0.0% 0.0% -0.0% -0.0% -0.0%
circsim -0.0% 0.0% -0.0% -0.0% -0.0%
clausify -0.0% 0.0% -0.0% -0.0% -0.0%
comp_lab_zift -0.0% 0.0% -0.0% -0.0% -0.0%
compress -0.0% 0.0% -0.0% -0.0% -0.0%
compress2 -0.0% 0.0% -0.0% -0.0% -0.0%
constraints -0.0% 0.0% -0.0% -0.0% -0.0%
cryptarithm1 -0.0% 0.0% -0.0% -0.0% -0.0%
cryptarithm2 -0.0% 0.0% -0.0% -0.0% -0.0%
cse -0.0% 0.0% -0.0% -0.0% -0.0%
digits-of-e1 -0.0% 0.0% -0.0% -0.0% -0.0%
digits-of-e2 -0.0% 0.0% -0.0% -0.0% -0.0%
dom-lt -0.0% 0.0% -0.0% -0.0% -0.0%
eliza -0.0% 0.0% -0.0% -0.0% -0.0%
event -0.0% 0.0% -0.0% -0.0% -0.0%
exact-reals -0.0% 0.0% -0.0% -0.0% -0.0%
exp3_8 -0.0% 0.0% -0.0% -0.0% -0.0%
expert -0.0% 0.0% -0.0% -0.0% -0.0%
fannkuch-redux -0.0% 0.0% -0.0% -0.0% -0.0%
fasta -0.0% 0.0% -0.0% -0.0% -0.0%
fem -0.0% 0.0% -0.0% -0.0% -0.0%
fft -0.0% 0.0% -0.0% -0.0% -0.0%
fft2 -0.0% 0.0% -0.0% -0.0% -0.0%
fibheaps -0.0% 0.0% -0.0% -0.0% -0.0%
fish -0.0% 0.0% -0.0% -0.0% -0.0%
fluid -0.1% 0.0% -0.0% -0.0% -0.0%
fulsom -0.0% 0.0% -0.0% -0.0% -0.0%
gamteb -0.0% 0.0% -0.0% -0.0% -0.0%
gcd -0.0% 0.0% -0.0% -0.0% -0.0%
gen_regexps -0.0% 0.0% -0.0% -0.0% -0.0%
genfft -0.0% 0.0% -0.0% -0.0% -0.0%
gg -0.0% 0.0% -0.0% -0.0% -0.0%
grep -0.0% 0.0% -0.0% -0.0% -0.0%
hidden -0.0% 0.0% -0.0% -0.0% -0.0%
hpg -0.1% 0.0% -0.0% -0.0% -0.0%
ida -0.0% 0.0% -0.0% -0.0% -0.0%
infer -0.0% 0.0% -0.0% -0.0% -0.0%
integer -0.0% 0.0% -0.0% -0.0% -0.0%
integrate -0.0% 0.0% -0.0% -0.0% -0.0%
k-nucleotide -0.0% 0.0% -0.0% -0.0% -0.0%
kahan -0.0% 0.0% -0.0% -0.0% -0.0%
knights -0.0% 0.0% -0.0% -0.0% -0.0%
lambda -0.0% 0.0% -0.0% -0.0% -0.0%
last-piece -0.0% 0.0% -0.0% -0.0% -0.0%
lcss -0.0% 0.0% -0.0% -0.0% -0.0%
life -0.0% 0.0% -0.0% -0.0% -0.0%
lift -0.0% 0.0% -0.0% -0.0% -0.0%
linear -0.1% 0.0% -0.0% -0.0% -0.0%
listcompr -0.0% 0.0% -0.0% -0.0% -0.0%
listcopy -0.0% 0.0% -0.0% -0.0% -0.0%
maillist -0.0% 0.0% -0.0% -0.0% -0.0%
mandel -0.0% 0.0% -0.0% -0.0% -0.0%
mandel2 -0.0% 0.0% -0.0% -0.0% -0.0%
mate -0.0% 0.0% -0.0% -0.0% -0.0%
minimax -0.0% 0.0% -0.0% -0.0% -0.0%
mkhprog -0.0% 0.0% -0.0% -0.0% -0.0%
multiplier -0.0% 0.0% -0.0% -0.0% -0.0%
n-body -0.0% 0.0% -0.0% -0.0% -0.0%
nucleic2 -0.0% 0.0% -0.0% -0.0% -0.0%
para -0.0% 0.0% -0.0% -0.0% -0.0%
paraffins -0.0% 0.0% -0.0% -0.0% -0.0%
parser -0.1% 0.0% -0.0% -0.0% -0.0%
parstof -0.1% 0.0% -0.0% -0.0% -0.0%
pic -0.0% 0.0% -0.0% -0.0% -0.0%
pidigits -0.0% 0.0% -0.0% -0.0% -0.0%
power -0.0% 0.0% -0.0% -0.0% -0.0%
pretty -0.0% 0.0% -0.3% -0.4% -0.4%
primes -0.0% 0.0% -0.0% -0.0% -0.0%
primetest -0.0% 0.0% -0.0% -0.0% -0.0%
prolog -0.0% 0.0% -0.0% -0.0% -0.0%
puzzle -0.0% 0.0% -0.0% -0.0% -0.0%
queens -0.0% 0.0% -0.0% -0.0% -0.0%
reptile -0.0% 0.0% -0.0% -0.0% -0.0%
reverse-complem -0.0% 0.0% -0.0% -0.0% -0.0%
rewrite -0.0% 0.0% -0.0% -0.0% -0.0%
rfib -0.0% 0.0% -0.0% -0.0% -0.0%
rsa -0.0% 0.0% -0.0% -0.0% -0.0%
scc -0.0% 0.0% -0.3% -0.5% -0.4%
sched -0.0% 0.0% -0.0% -0.0% -0.0%
scs -0.0% 0.0% -0.0% -0.0% -0.0%
simple -0.1% 0.0% -0.0% -0.0% -0.0%
solid -0.0% 0.0% -0.0% -0.0% -0.0%
sorting -0.0% 0.0% -0.0% -0.0% -0.0%
spectral-norm -0.0% 0.0% -0.0% -0.0% -0.0%
sphere -0.0% 0.0% -0.0% -0.0% -0.0%
symalg -0.0% 0.0% -0.0% -0.0% -0.0%
tak -0.0% 0.0% -0.0% -0.0% -0.0%
transform -0.0% 0.0% -0.0% -0.0% -0.0%
treejoin -0.0% 0.0% -0.0% -0.0% -0.0%
typecheck -0.0% 0.0% -0.0% -0.0% -0.0%
veritas -0.0% 0.0% -0.0% -0.0% -0.0%
wang -0.0% 0.0% -0.0% -0.0% -0.0%
wave4main -0.0% 0.0% -0.0% -0.0% -0.0%
wheel-sieve1 -0.0% 0.0% -0.0% -0.0% -0.0%
wheel-sieve2 -0.0% 0.0% -0.0% -0.0% -0.0%
x2n1 -0.0% 0.0% -0.0% -0.0% -0.0%
--------------------------------------------------------------------------------
Min -0.1% 0.0% -0.3% -0.5% -0.5%
Max -0.0% 0.0% -0.0% -0.0% -0.0%
Geometric Mean -0.0% -0.0% -0.0% -0.0% -0.0%
--------------------------------------------------------------------------------
Program Size Allocs Instrs Reads Writes
--------------------------------------------------------------------------------
circsim -0.1% 0.0% -0.0% -0.0% -0.0%
constraints -0.0% 0.0% -0.0% -0.0% -0.0%
fibheaps -0.0% 0.0% -0.0% -0.0% -0.0%
gc_bench -0.0% 0.0% -0.0% -0.0% -0.0%
hash -0.0% 0.0% -0.0% -0.0% -0.0%
lcss -0.0% 0.0% -0.0% -0.0% -0.0%
power -0.0% 0.0% -0.0% -0.0% -0.0%
spellcheck -0.0% 0.0% -0.0% -0.0% -0.0%
--------------------------------------------------------------------------------
Min -0.1% 0.0% -0.0% -0.0% -0.0%
Max -0.0% 0.0% -0.0% -0.0% -0.0%
Geometric Mean -0.0% +0.0% -0.0% -0.0% -0.0%
Manual inspection of programs in testsuite/tests/programs
---------------------------------------------------------
I built these programs with a bunch of dump flags and `-O` and compared
STG, Cmm, and Asm dumps and file sizes.
(Below the numbers in parenthesis show number of modules in the program)
These programs have identical compiler (same .hi and .o sizes, STG, and
Cmm and Asm dumps):
- Queens (1), andre_monad (1), cholewo-eval (2), cvh_unboxing (3),
andy_cherry (7), fun_insts (1), hs-boot (4), fast2haskell (2),
jl_defaults (1), jq_readsPrec (1), jules_xref (1), jtod_circint (4),
jules_xref2 (1), lennart_range (1), lex (1), life_space_leak (1),
bargon-mangler-bug (7), record_upd (1), rittri (1), sanders_array (1),
strict_anns (1), thurston-module-arith (2), okeefe_neural (1),
joao-circular (6), 10queens (1)
Programs with different compiler outputs:
- jl_defaults (1): For some reason GHC HEAD marks a lot of top-level
`[Int]` closures as CAFFY for no reason. With this patch we no longer
make them CAFFY and generate less SRT entries. For some reason Main.o
is slightly larger with this patch (1.3%) and the executable sizes are
the same. (I'd expect both to be smaller)
- launchbury (1): Same as jl_defaults: top-level `[Int]` closures marked
as CAFFY for no reason. Similarly `Main.o` is 1.4% larger but the
executable sizes are the same.
- galois_raytrace (13): Differences are in the Parse module. There are a
lot, but some of the changes are caused by the fact that for some
reason (I think a bug) GHC HEAD marks the dictionary for `Functor
Identity` as CAFFY. Parse.o is 0.4% larger, the executable size is the
same.
- north_array: We now generate less SRT entries because some of array
primops used in this program like `NewArrayOp` get eliminated during
Stg-to-Cmm and turn some CAFFY things into non-CAFFY. Main.o gets 24%
larger (9224 bytes from 9000 bytes), executable sizes are the same.
- seward-space-leak: Difference in this program is better shown by this
smaller example:
module Lib where
data CDS
= Case [CDS] [(Int, CDS)]
| Call CDS CDS
instance Eq CDS where
Case sels1 rets1 == Case sels2 rets2 =
sels1 == sels2 && rets1 == rets2
Call a1 b1 == Call a2 b2 =
a1 == a2 && b1 == b2
_ == _ =
False
In this program GHC HEAD builds a new SRT for the recursive group of
`(==)`, `(/=)` and the dictionary closure. Then `/=` points to `==`
in its SRT field, and `==` uses the SRT object as its SRT. With this
patch we use the closure for `/=` as the SRT and add `==` there. Then
`/=` gets an empty SRT field and `==` points to `/=` in its SRT
field.
This change looks fine to me.
Main.o gets 0.07% larger, executable sizes are identical.
head.hackage
------------
head.hackage's CI script builds 428 packages from Hackage using this
patch with no failures.
Compiler performance
--------------------
The compiler perf tests report that the compiler allocates slightly more
(worst case observed so far is 4%). However most programs in the test
suite are small, single file programs. To benchmark compiler performance
on something more realistic I build Cabal (the library, 236 modules)
with different optimisation levels. For the "max residency" row I run
GHC with `+RTS -s -A100k -i0 -h` for more accurate numbers. Other rows
are generated with just `-s`. (This is because `-i0` causes running GC
much more frequently and as a result "bytes copied" gets inflated by
more than 25x in some cases)
* -O0
| | GHC HEAD | This MR | Diff |
| --------------- | -------------- | -------------- | ------ |
| Bytes allocated | 54,413,350,872 | 54,701,099,464 | +0.52% |
| Bytes copied | 4,926,037,184 | 4,990,638,760 | +1.31% |
| Max residency | 421,225,624 | 424,324,264 | +0.73% |
* -O1
| | GHC HEAD | This MR | Diff |
| --------------- | --------------- | --------------- | ------ |
| Bytes allocated | 245,849,209,992 | 246,562,088,672 | +0.28% |
| Bytes copied | 26,943,452,560 | 27,089,972,296 | +0.54% |
| Max residency | 982,643,440 | 991,663,432 | +0.91% |
* -O2
| | GHC HEAD | This MR | Diff |
| --------------- | --------------- | --------------- | ------ |
| Bytes allocated | 291,044,511,408 | 291,863,910,912 | +0.28% |
| Bytes copied | 37,044,237,616 | 36,121,690,472 | -2.49% |
| Max residency | 1,071,600,328 | 1,086,396,256 | +1.38% |
Extra compiler allocations
--------------------------
Runtime allocations of programs are as reported above (NoFib section).
The compiler now allocates more than before. Main source of allocation
in this patch compared to base commit is the new SRT algorithm
(GHC.Cmm.Info.Build). Below is some of the extra work we do with this
patch, numbers generated by profiled stage 2 compiler when building a
pathological case (the test 'ManyConstructors') with '-O2':
- We now sort the final STG for a module, which means traversing the
entire program, generating free variable set for each top-level
binding, doing SCC analysis, and re-ordering the program. In
ManyConstructors this step allocates 97,889,952 bytes.
- We now do SRT analysis on static data, which in a program like
ManyConstructors causes analysing 10,000 bindings that we would
previously just skip. This step allocates 70,898,352 bytes.
- We now maintain an SRT map for the entire module as we compile Cmm
groups:
data ModuleSRTInfo = ModuleSRTInfo
{ ...
, moduleSRTMap :: SRTMap
}
(SRTMap is just a strict Map from the 'containers' library)
This map gets an entry for most bindings in a module (exceptions are
THUNKs and CAFFY static functions). For ManyConstructors this map
gets 50015 entries.
- Once we're done with code generation we generate a NameSet from SRTMap
for the non-CAFFY names in the current module. This set gets the same
number of entries as the SRTMap.
- Finally we update CafInfos in ModDetails for the non-CAFFY Ids, using
the NameSet generated in the previous step. This usually does the
least amount of allocation among the work listed here.
Only place with this patch where we do less work in the CAF analysis in
the tidying pass (CoreTidy). However that doesn't save us much, as the
pass still needs to traverse the whole program and update IdInfos for
other reasons. Only thing we don't here do is the `hasCafRefs` pass over
the RHS of bindings, which is a stateless pass that returns a boolean
value, so it doesn't allocate much.
(Metric changes blow are all increased allocations)
Metric changes
--------------
Metric Increase:
ManyAlternatives
ManyConstructors
T13035
T14683
T1969
T9961
|
|
|
|
|
|
| |
incomplete-uni-patterns and incomplete-record-updates will be in -Wall at a
future date, so prepare for that by disabling those warnings on files that
trigger them.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch implements overloaded quotation brackets which generalise the
desugaring of all quotation forms in terms of a new minimal interface.
The main change is that a quotation, for example, [e| 5 |], will now
have type `Quote m => m Exp` rather than `Q Exp`. The `Quote` typeclass
contains a single method for generating new names which is used when
desugaring binding structures.
The return type of functions from the `Lift` type class, `lift` and `liftTyped` have
been restricted to `forall m . Quote m => m Exp` rather than returning a
result in a Q monad.
More details about the feature can be read in the GHC proposal.
https://github.com/ghc-proposals/ghc-proposals/blob/master/proposals/0246-overloaded-bracket.rst
|
| |
|
|
|