| Commit message | Author | Age | Files | Lines |
|
|
|
| |
This distills the essence of the Sigs.hs program found in the ticket.
|
| |
In #19644, we discovered that the ClassOp/DFun rules from
Note [ClassOp/DFun selection] inhibit transitive specialisation in a scenario
like
```
class C a where m :: Show b => a -> b -> ...; n :: ...
instance C Int where m = ... -- $cm :: Show b => Int -> b -> ...
f :: forall a b. (C a, Show b) => ...
f $dC $dShow = ... m @a $dC @b $dShow ...
main = ... f @Int @Bool ...
```
After we specialise `f` for `Int`, we'll see `m @a $dC @b $dShow` in the body of
`$sf`. But before this patch, Specialise didn't apply the ClassOp/DFun rule to
rewrite to a call of the instance method for `C Int`, e.g., `$cm @Bool $dShow`.
As a result, Specialise couldn't further specialise `$cm` for `Bool`.
There's a better example in `Note [Specialisation modulo dictionary selectors]`.
This patch enables proper Specialisation, as follows:
1. In the App case of `specExpr`, try to apply the ClassOp/DictSel rule on the
head of the application
2. Attach an unfolding to freshly-bound dictionary ids such as `$dC` and
`$dShow` in `bindAuxiliaryDict`
NB: Without (2), (1) would be pointless, because `lookupRule` wouldn't be able
to look into the RHS of `$dC` to see the DFun.
(2) triggered #21332, because the Specialiser floats around dictionaries without
accounting for them in the `SpecEnv`'s `InScopeSet`, triggering a panic when
rewriting dictionary unfoldings.
Fixes #19644 and #21332.
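As a rough sketch (the Core below and the `$s`/`$c` names follow the example
above and are illustrative, not taken verbatim from the patch), the effect on
the specialised body is:
```
-- Before: the class-op application survives in $sf, so $cm is never
-- exposed and cannot be specialised for Bool.
$sf $dShow = ... m @Int $dCInt @Bool $dShow ...

-- After: the ClassOp/DictSel rule rewrites the selector applied to the
-- known dictionary to the instance method, which the Specialiser can
-- then specialise for Bool in turn.
$sf $dShow = ... $cm @Bool $dShow ...
```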
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit implements proposal 302: \cases - Multi-way lambda
expressions.
This adds a new expression heralded by \cases, which works exactly like
\case, but can match multiple apats instead of a single pat.
Updates submodule haddock to support the ITlcases token.
Closes #20768
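A small usage example (illustrative only; `\cases` is enabled by the LambdaCase
extension):
```hs
{-# LANGUAGE LambdaCase #-}

-- \cases is like \case, but matches several argument patterns at once.
safeDiv :: Int -> Int -> Maybe Int
safeDiv = \cases
  _ 0 -> Nothing
  x y -> Just (x `div` y)
```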
|
|
|
|
|
|
|
|
|
|
|
| |
- Remove unused functions exprToCoercion_maybe, applyTypeToArg,
typeMonoPrimRep_maybe, runtimeRepMonoPrimRep_maybe.
- Replace orValid with a simpler check
- Use splitAtList in applyTysX
- Remove calls to extra_clean in the testsuite; it does not do anything.
Metric Decrease:
T18223
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
To finish MR https://gitlab.haskell.org/ghc/ghc/-/merge_requests/4782:
* Rewrite the critical notes and fix outdated ones.
* Use `HsQuote GhcRn` (in `HsBracketTc`) for desugaring, regardless of
  whether the bracket is typed or untyped.
* Remove the unused `EpAnn` from `Hs*Bracket GhcRn`.
* Factor out the common bracket code in zonkExpr and ppr_expr.
* Fix tests.
-------------------------
Metric Decrease:
hard_hole_fits
-------------------------
|
|
|
|
|
|
|
|
|
|
| |
* Users can define their own (~) type operator
* Haddock can display documentation for the built-in (~)
* New transitional warnings implemented:
-Wtype-equality-out-of-scope
-Wtype-equality-requires-operators
Updates the haddock submodule.
|
| |
| |
This patch introduces a new kind of metavariable, by adding the
constructor `ConcreteTv` to `MetaInfo`. A metavariable with
`ConcreteTv` `MetaInfo`, henceforth a concrete metavariable, can only
be unified with a type that is concrete (that is, a type that answers
`True` to `GHC.Core.Type.isConcrete`).
This solves the problem of dangling metavariables in `Concrete#`
constraints: instead of emitting `Concrete# ty`, which contains a
secret existential metavariable, we simply emit a primitive equality
constraint `ty ~# concrete_tv` where `concrete_tv` is a fresh concrete
metavariable.
This means we can avoid all the complexity of canonicalising
`Concrete#` constraints, as we can just re-use the existing machinery
for `~#`.
To finish things up, this patch then removes the `Concrete#` special
predicate, and instead introduces the special predicate `IsRefl#`
which enforces that a coercion is reflexive.
Such a constraint is needed because the canonicaliser is quite happy
to rewrite an equality constraint such as `ty ~# concrete_tv`, but
such a rewriting is not handled by the rest of the compiler currently,
as we need to make use of the resulting coercion, as outlined in the
FixedRuntimeRep plan.
The big upside of this approach (on top of simplifying the code)
is that we can now selectively implement PHASE 2 of FixedRuntimeRep,
by changing individual calls of `hasFixedRuntimeRep_MustBeRefl` to
`hasFixedRuntimeRep` and making use of the obtained coercion.
|
|
|
|
|
|
|
|
|
| |
This test was taking too long to run, so this patch makes it smaller.
-------------------------
Metric Decrease:
LargeRecord
-------------------------
|
|
|
|
|
|
|
|
| |
This patch adds some performance tests for programs that create
large coercions. This is useful because the existing test coverage
is not very representative of real-world situations. In particular,
this adds a test involving an extensible records library, a common
pain-point for users.
|
| |
Here we introduce a new data structure, RoughMap, inspired by the
previous `RoughTc` matching mechanism for checking instance matches.
This allows [Fam]InstEnv to be implemented as a trie indexed by these
RoughTc signatures, reducing the complexity of instance lookup and
FamInstEnv merging (done during the family instance conflict test)
from O(n) to O(log n).
The critical performance improvement currently realised by this patch is
in instance matching. In particular the RoughMap mechanism allows us to
discount many potential instances which will never match for constraints
involving type variables (see Note [Matching a RoughMap]). In realistic
code bases matchInstEnv was accounting for 50% of typechecker time due
to redundant work checking instances when simplifying instance contexts
when deriving instances. With this patch the cost is significantly
reduced.
The larger constants in InstEnv creation do mean that a few small
tests regress in allocations slightly. However, the runtime of T19703 is
reduced by a factor of 4. Moreover, the compilation time of the Cabal
library is slightly improved.
A couple of test cases are included which demonstrate significant
improvements in compile time with this patch.
This unfortunately does not fix the test case provided in #19703, but it does
fix #20933.
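To give an idea of the data structure, here is a simplified standalone sketch
(the types and function names are illustrative, not GHC's actual `RoughMap`
API): instances are filed under the head type constructors of their arguments,
and a lookup with a known head only inspects the matching bucket plus the
"unknown" bucket, so non-matching instances are discounted without being
examined.
```hs
module RoughMapSketch where

import qualified Data.Map.Strict as Map
import Data.Maybe (fromMaybe)

-- One key segment: the head TyCon of an instance argument, or "unknown"
-- (e.g. a type variable), which can match any head.
data RoughKey = KnownTc String | OtherTc

-- A trie over rough-match keys, one level per key segment.
data RoughMap a = RM
  { rm_known   :: Map.Map String (RoughMap a) -- children with a known head TyCon
  , rm_other   :: Maybe (RoughMap a)          -- children whose segment is unknown
  , rm_entries :: [a]                         -- entries whose key ends here
  }

emptyRM :: RoughMap a
emptyRM = RM Map.empty Nothing []

insertRM :: [RoughKey] -> a -> RoughMap a -> RoughMap a
insertRM []               x rm = rm { rm_entries = x : rm_entries rm }
insertRM (KnownTc n : ks) x rm =
  rm { rm_known = Map.alter (Just . insertRM ks x . fromMaybe emptyRM) n (rm_known rm) }
insertRM (OtherTc   : ks) x rm =
  rm { rm_other = Just (insertRM ks x (fromMaybe emptyRM (rm_other rm))) }

-- A KnownTc segment needs only its own bucket plus the "unknown" bucket;
-- an OtherTc segment (a type variable in the query) must consider every bucket.
lookupRM :: [RoughKey] -> RoughMap a -> [a]
lookupRM []               rm = allRM rm
lookupRM (KnownTc n : ks) rm =
     maybe [] (lookupRM ks) (Map.lookup n (rm_known rm))
  ++ maybe [] (lookupRM ks) (rm_other rm)
lookupRM (OtherTc   : ks) rm =
     concatMap (lookupRM ks) (Map.elems (rm_known rm))
  ++ maybe [] (lookupRM ks) (rm_other rm)

-- Everything stored at or below a node.
allRM :: RoughMap a -> [a]
allRM rm = rm_entries rm
        ++ concatMap allRM (Map.elems (rm_known rm))
        ++ maybe [] allRM (rm_other rm)
```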
-------------------------
Metric Decrease:
T12425
Metric Increase:
T13719
T9872a
T9872d
hard_hole_fits
-------------------------
Co-authored-by: Matthew Pickering <matthewtpickering@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This test has been the scourge of contributors for a long time.
It has caused many failed CI runs and wasted hours debugging a test
which barely does anything. The fact that it does so little is the reason for
the flakiness: it is very sensitive to small changes in initialisation costs,
and in particular adding wired-in things can cause this test to fluctuate
quite a bit.
Therefore we admit defeat and just bump the threshold up to 10% to catch
very large regressions but otherwise don't care what this test does.
Fixes #19414
|
|
|
|
|
|
|
|
| |
The test now passes on OpenBSD instead of generating broken source
which was rejected by GHC with:

    ManyAlternatives.hs:5:1: error:
        The type signature for ‘f’ lacks an accompanying binding
|
| |
Multiple home units allows you to load different packages which may depend on
each other into one GHC session. This will allow both GHCi and HLS to support
multi component projects more naturally.
Public Interface
~~~~~~~~~~~~~~~~
In order to specify multiple units, the -unit @⟨filename⟩ flag
is given multiple times with a response file containing the arguments for each unit.
The response file contains a newline separated list of arguments.
```
ghc -unit @unitLibCore -unit @unitLib
```
where the `unitLibCore` response file contains the normal arguments that cabal would pass to `--make` mode.
```
-this-unit-id lib-core-0.1.0.0
-i
-isrc
LibCore.Utils
LibCore.Types
```
The response file for lib can specify a dependency on lib-core, so that modules in lib can use modules from lib-core.
```
-this-unit-id lib-0.1.0.0
-package-id lib-core-0.1.0.0
-i
-isrc
Lib.Parse
Lib.Render
```
Then when the compiler starts in --make mode it will compile both units lib and lib-core.
There is also very basic support for multiple home units in GHCi: at the
moment you can start a GHCi session with multiple units, but only the
:reload command is supported. Most commands in GHCi assume a single home unit,
and so it is additional work to work out how to modify the interface to
support multiple loaded home units.
Options used when working with Multiple Home Units
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
There are a few extra flags which have been introduced specifically for
working with multiple home units. The flags allow a home unit to pretend
it’s more like an installed package, for example, specifying the package
name, module visibility and reexported modules.
-working-dir ⟨dir⟩
It is common to assume that a package is compiled in the directory
where its cabal file resides. Thus, all paths used in the compiler
are assumed to be relative to this directory. When there are
multiple home units the compiler is often not operating in the
standard directory, but instead in the directory where the cabal.project
file is located. In this case the -working-dir option can be passed, which
specifies the path from the current directory to the directory the
unit assumes to be its root, normally the directory which contains
the cabal file.
When the flag is passed, any relative paths used by the compiler are
offset by the working directory. Notably this includes -i and
-I⟨dir⟩ flags.
-this-package-name ⟨name⟩
This flag papers over the awkward interaction of the PackageImports
and multiple home units. When using PackageImports you can specify
the name of the package in an import to disambiguate between modules
which appear in multiple packages with the same name.
This flag allows a home unit to be given a package name so that you
can also disambiguate between multiple home units which provide
modules with the same name.
-hidden-module ⟨module name⟩
This flag can be supplied multiple times in order to specify which
modules in a home unit should not be visible outside of the unit it
belongs to.
The main use of this flag is to be able to recreate the difference
between an exposed and hidden module for installed packages.
-reexported-module ⟨module name⟩
This flag can be supplied multiple times in order to specify which
modules are not defined in a unit but should be reexported. The
effect is that other units will see this module as if it was defined
in this unit.
The use of this flag is to be able to replicate the reexported
modules feature of packages with multiple home units.
Offsetting Paths in Template Haskell splices
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
When using Template Haskell to embed files into your program,
traditionally the paths have been interpreted relative to the directory
where the .cabal file resides. This causes problems for multiple home
units as we are compiling many different libraries at once which have
.cabal files in different directories.
For this purpose we have added to the Template Haskell API a way to query
the value of the -working-dir flag. By using this function we
can implement a makeRelativeToProject function which offsets a path
which is relative to the original project root by the value of
-working-dir.
```
import Language.Haskell.TH.Syntax ( makeRelativeToProject )
foo = $(makeRelativeToProject "./relative/path" >>= embedFile)
```
> If you write a relative path in a Template Haskell splice you should use the makeRelativeToProject function so that your library works correctly with multiple home units.
A similar function already exists in the file-embed library. The
function in template-haskell implements this function in a more robust
manner by honouring the -working-dir flag rather than searching the file
system.
Closure Property for Home Units
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
For tools or libraries using the API there is one very important closure
property which must be adhered to:
> Any dependency which is not a home unit must not (transitively) depend
on a home unit.
For example, if you have three packages p, q and r, then if p depends on
q which depends on r then it is illegal to load both p and r as home
units but not q, because q is a dependency of the home unit p which
depends on another home unit r.
If you are using GHC from the command line then this property is checked,
but if you are using the API then you need to check this property
yourself. If you get it wrong you will probably get some very confusing
errors about overlapping instances.
Limitations of Multiple Home Units
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
There are a few limitations of the initial implementation which will be smoothed out on user demand.
* Package thinning/renaming syntax is not supported
* More complicated reexports/renaming are not yet supported.
* It’s more common to run into existing linker bugs when loading a
large number of packages in a session (for example #20674, #20689)
* Backpack is not yet supported when using multiple home units.
* Dependency chasing can be quite slow with a large number of
modules and packages.
* Loading wired-in packages as home units is currently not supported
(this only really affects GHC developers attempting to load
template-haskell).
* Barely any normal GHCi features are supported; it would be good to
support enough for ghcid to work correctly.
Despite these limitations, the implementation works already for nearly
all packages. It has been tested on large dependency closures,
including the whole of head.hackage which is a total of 4784 modules
from 452 packages.
Internal Changes
~~~~~~~~~~~~~~~~
* The biggest change is that the HomePackageTable is replaced with the
HomeUnitGraph. The HomeUnitGraph is a map from UnitId to HomeUnitEnv,
which contains information specific to each home unit.
* The HomeUnitEnv contains:
- A unit state, each home unit can have different package db flags
- A set of dynflags, each home unit can have different flags
- A HomePackageTable
* LinkNode: A new node type is added to the ModuleGraph; it is used to
place the linking step into the build plan so linking can proceed in
parallel with other packages being built.
* New invariant: Dependencies of a ModuleGraphNode can be completely
determined by looking at the value of the node. In order to achieve
this, downsweep now performs a more complete job of downsweeping and
then the dependencies are recorded forever in the node rather than
being computed again from the ModSummary.
* Some transitive module calculations are rewritten to use the
ModuleGraph which is more efficient.
* There is always an active home unit, which simplifies modifying a lot
of the existing API code which is unit agnostic (for example, in the
driver).
The road may be bumpy for a little while after this change but the
basics are well-tested.
There is one small metric increase, which we accept, and also a submodule
update to haddock which removes ExtendedModSummary.
Closes #10827
-------------------------
Metric Increase:
MultiLayerModules
-------------------------
Co-authored-by: Fendor <power.walross@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The reqlib modifier was supposed to indicate that a test needed a certain
library in order to work. If the library happened to be installed then
the test would run as normal.
However, CI has never run these tests as the packages have not been
installed and we don't want our tests to depend on things which might
get externally broken by updating the compiler.
The new strategy is to run these tests in head.hackage, where the tests
have been cabalised as well as possible. Some tests couldn't be
transferred into the normal style testsuite but it's better than never
running any of the reqlib tests. https://gitlab.haskell.org/ghc/head.hackage/-/merge_requests/169
A few submodules also had reqlib tests and have been updated to remove
it.
Closes #16264 #20032 #17764 #16561
|
|
|
|
| |
We use the parser generated by stack to ensure reproducibility
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit bddecda1a4c96da21e3f5211743ce5e4c78793a2.
This implements the first step in the plan formulated in #20025 to
improve the communication and migration strategy for the proposed
changes to Data.List.
Requires changing the haddock submodule to update the test output.
|
|
|
|
| |
Fixes #20621
|
|
|
|
|
|
| |
This test triggers the bad code path identified by #20509 where an entry
into the EPS caused by importing Control.Applicative will retain a stale
HomePackageTable.
|
| |
| |
This patch enables worker/wrapper for nested constructed products, as described
in `Note [Nested CPR]`. The machinery for expressing Nested CPR was already
there, since !5054. Worker/wrapper is equipped to exploit Nested CPR annotations
since !5338. CPR analysis already handles applications in batches since !5753.
This patch just needs to flip a few more switches:
1. In `cprTransformDataConWork`, we need to look at the field expressions
and their `CprType`s to see whether the evaluation of the expressions
terminates quickly (= is in HNF) or if they are put in strict fields.
If that is the case, then we retain their CPR info and may unbox nestedly
later on. More details in `Note [Nested CPR]`.
2. Enable nested `ConCPR` signatures in `GHC.Types.Cpr`.
3. In the `asConCpr` call in `GHC.Core.Opt.WorkWrap.Utils`, pass CPR info of
fields to the `Unbox`.
4. Instead of giving CPR signatures to DataCon workers and wrappers, we now have
`cprTransformDataConWork` for workers and treat wrappers by analysing their
unfolding. As a result, the code from GHC.Types.Id.Make went away completely.
5. I deactivated worker/wrappering for recursive DataCons and wrote a function
`isRecDataCon` to detect them. We really don't want to give `repeat` or
`replicate` the Nested CPR property.
See Note [CPR for recursive data structures] for which kind of recursive
DataCons we target.
6. Fix a couple of tests and their outputs.
I also documented that CPR can destroy sharing and lead to asymptotic increase
in allocations (which is tracked by #13331/#19326) in
`Note [CPR for data structures can destroy sharing]`.
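For illustration (an example of my own, not from the patch), the shape of
function that can now be unboxed nestedly is one whose result contains a
constructed product in strict fields, so that the field CPR info is retained
per point 1 above:
```hs
data P = P !Int !Int   -- strict fields, so the fields' CPR info is retained

f :: Int -> Int -> (Int, P)
f x y = (x + y, P (x * y) (x - y))
{-# NOINLINE f #-}
```
With Nested CPR, worker/wrapper can return not only the components of the
outer pair unboxed, but also the two `Int` fields of the inner `P`.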
Nofib results:
```
--------------------------------------------------------------------------------
Program Allocs Instrs
--------------------------------------------------------------------------------
ben-raytrace -3.1% -0.4%
binary-trees +0.8% -2.9%
digits-of-e2 +5.8% +1.2%
event +0.8% -2.1%
fannkuch-redux +0.0% -1.4%
fish 0.0% -1.5%
gamteb -1.4% -0.3%
mkhprog +1.4% +0.8%
multiplier +0.0% -1.9%
pic -0.6% -0.1%
reptile -20.9% -17.8%
wave4main +4.8% +0.4%
x2n1 -100.0% -7.6%
--------------------------------------------------------------------------------
Min -95.0% -17.8%
Max +5.8% +1.2%
Geometric Mean -2.9% -0.4%
```
The huge wins in x2n1 (loopy list) and reptile (see #19970) are due to
refraining from unboxing (:). Other benchmarks like digits-of-e2 or wave4main
regress because of that. Ultimately there are no great improvements due to
Nested CPR alone, but at least it's a win.
Binary sizes decrease by 0.6%.
There are a significant number of metric decreases. The most notable ones (>1%):
```
ManyAlternatives(normal) ghc/alloc 771656002.7 762187472.0 -1.2%
ManyConstructors(normal) ghc/alloc 4191073418.7 4114369216.0 -1.8%
MultiLayerModules(normal) ghc/alloc 3095678333.3 3128720704.0 +1.1%
PmSeriesG(normal) ghc/alloc 50096429.3 51495664.0 +2.8%
PmSeriesS(normal) ghc/alloc 63512989.3 64681600.0 +1.8%
PmSeriesV(normal) ghc/alloc 62575424.0 63767208.0 +1.9%
T10547(normal) ghc/alloc 29347469.3 29944240.0 +2.0%
T11303b(normal) ghc/alloc 46018752.0 47367576.0 +2.9%
T12150(optasm) ghc/alloc 81660890.7 82547696.0 +1.1%
T12234(optasm) ghc/alloc 59451253.3 60357952.0 +1.5%
T12545(normal) ghc/alloc 1705216250.7 1751278952.0 +2.7%
T12707(normal) ghc/alloc 981000472.0 968489800.0 -1.3% GOOD
T13056(optasm) ghc/alloc 389322664.0 372495160.0 -4.3% GOOD
T13253(normal) ghc/alloc 337174229.3 341954576.0 +1.4%
T13701(normal) ghc/alloc 2381455173.3 2439790328.0 +2.4% BAD
T14052(ghci) ghc/alloc 2162530642.7 2139108784.0 -1.1%
T14683(normal) ghc/alloc 3049744728.0 2977535064.0 -2.4% GOOD
T14697(normal) ghc/alloc 362980213.3 369304512.0 +1.7%
T15164(normal) ghc/alloc 1323102752.0 1307480600.0 -1.2%
T15304(normal) ghc/alloc 1304607429.3 1291024568.0 -1.0%
T16190(normal) ghc/alloc 281450410.7 284878048.0 +1.2%
T16577(normal) ghc/alloc 7984960789.3 7811668768.0 -2.2% GOOD
T17516(normal) ghc/alloc 1171051192.0 1153649664.0 -1.5%
T17836(normal) ghc/alloc 1115569746.7 1098197592.0 -1.6%
T17836b(normal) ghc/alloc 54322597.3 55518216.0 +2.2%
T17977(normal) ghc/alloc 47071754.7 48403408.0 +2.8%
T17977b(normal) ghc/alloc 42579133.3 43977392.0 +3.3%
T18923(normal) ghc/alloc 71764237.3 72566240.0 +1.1%
T1969(normal) ghc/alloc 784821002.7 773971776.0 -1.4% GOOD
T3294(normal) ghc/alloc 1634913973.3 1614323584.0 -1.3% GOOD
T4801(normal) ghc/alloc 295619648.0 292776440.0 -1.0%
T5321FD(normal) ghc/alloc 278827858.7 276067280.0 -1.0%
T5631(normal) ghc/alloc 586618202.7 577579960.0 -1.5%
T5642(normal) ghc/alloc 494923048.0 487927208.0 -1.4%
T5837(normal) ghc/alloc 37758061.3 39261608.0 +4.0%
T9020(optasm) ghc/alloc 257362077.3 254672416.0 -1.0%
T9198(normal) ghc/alloc 49313365.3 50603936.0 +2.6% BAD
T9233(normal) ghc/alloc 704944258.7 685692712.0 -2.7% GOOD
T9630(normal) ghc/alloc 1476621560.0 1455192784.0 -1.5%
T9675(optasm) ghc/alloc 443183173.3 433859696.0 -2.1% GOOD
T9872a(normal) ghc/alloc 1720926653.3 1693190072.0 -1.6% GOOD
T9872b(normal) ghc/alloc 2185618061.3 2162277568.0 -1.1% GOOD
T9872c(normal) ghc/alloc 1765842405.3 1733618088.0 -1.8% GOOD
TcPlugin_RewritePerf(normal) ghc/alloc 2388882730.7 2365504696.0 -1.0%
WWRec(normal) ghc/alloc 607073186.7 597512216.0 -1.6%
T9203(normal) run/alloc 107284064.0 102881832.0 -4.1%
haddock.Cabal(normal) run/alloc 24025329589.3 23768382560.0 -1.1%
haddock.base(normal) run/alloc 25660521653.3 25370321824.0 -1.1%
haddock.compiler(normal) run/alloc 74064171706.7 73358712280.0 -1.0%
```
The biggest exception to the rule is T13701 which seems to fluctuate as usual
(not unlike T12545). T14697 has a similar quality, being a generated
multi-module test. T5837 is small enough that it similarly doesn't measure
anything significant besides module loading overhead.
T13253 simply does one additional round of Simplification due to Nested CPR.
There are also some apparent regressions in T9198, T12234 and PmSeriesG that we
(@mpickering and I) were simply unable to reproduce locally. @mpickering tried
to run the CI script in a local Docker container and actually found that T9198
and PmSeriesG *improved*. In MRs that were rebased on top of this one, like !4229,
I did not experience such increases. Let's not get hung up on these regression
tests, they were meant to test for asymptotic regressions.
The build-cabal test improves by 1.2% in -O0.
Metric Increase:
T10421
T12234
T12545
T13035
T13056
T13701
T14697
T18923
T5837
T9198
Metric Decrease:
ManyConstructors
T12545
T12707
T13056
T14683
T16577
T18223
T1969
T3294
T9203
T9233
T9675
T9872a
T9872b
T9872c
T9961
TcPlugin_RewritePerf
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
While investigating #20106, I made a few refactorings to the pattern-match
checker that I don't want to lose. Here are the changes:
* Some key functions of the checker now have SCC annotations
* Better `-ddump-ec-trace` diagnostics for easier debugging. I added
'traceWhenFailPm' to see *why* a particular `MaybeT` computation fails and
made use of it in `instCon`.
I also increased the acceptance threshold of T11545, which seems to fail
randomly lately due to ghc/max flukes.
|
| |
This patch, provoked by regressions in the text package
(#19557), improves sharing of join points. This also fixes
the terrible behaviour in #20049.
See Note [Duplicating join points] in GHC.Core.Opt.Simplify.
* In the StrictArg case of mkDupableContWithDmds, don't
use Plan A for data constructors
* In postInlineUnconditionally, don't inline JoinIds
Avoids inlining
    join $j x = Just x
    in case blah of
      A -> $j x1
      B -> $j x2
      C -> $j x3
* In mkDupableStrictBind and mkDupableStrictAlt, create
join points (much) more often: exprIsTrivial rather than
exprIsDupable. This may be too much, but we'll see.
Metric Decrease:
T12545
T13253-spj
T13719
T18140
T18282
T18304
T18698a
T18698b
Metric Increase:
T16577
T18923
T9961
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In a sequel of #19414, I wrote a script that measures min and max allocation
bounds of T12545 based on randomly modifying -dunique-increment. I got a spread
of as much as 4.8%. But instead of widening the acceptance window further (to
5%), I committed the script as part of this commit, so that false positive
increases can easily be diagnosed by comparing min and max bounds to HEAD.
Indeed, for !5814 we have seen T12545 go from -0.3% to 3.3% after a rebase.
I made sure that the min and max bounds actually stayed the same.
In the future, this kind of check can very easily be done in a matter of a
minute. Maybe we should increase the acceptance threshold if we need to check
often (leave a comment on #19414 if you had to check), but I've not been bitten
by it for half a year, which seems OK.
Metric Increase:
T12545
|
| |
This patch comprises four different but closely related ideas. The
net result is fixing a large number of open issues with the driver
whilst making it simpler to understand.
1. Use the hash of the source file to determine whether the source file
has changed or not. This makes the recompilation checking more robust to
modern build systems which are liable to copy files around changing
their modification times.
2. Remove the concept of a "stable module". A stable module was one
whose object file was newer than its source file and whose transitive
dependencies were all stable. Now that we no longer rely on the
modification time of the source file, the notion of stability is moot.
3. Fix TH/plugin recompilation after the removal of stable modules. The
TH recompilation check used to rely on stable modules. Now there is a
uniform and simple way: we directly track the linkables which were
loaded into the interpreter whilst compiling a module. This is an
over-approximation but more robust wrt package dependencies changing.
4. Fix recompilation checking for dynamic object files. Now we actually
check if the dynamic object file exists when compiling with -dynamic-too
Fixes #19774 #19771 #19758 #17434 #11556 #9121 #8211 #16495 #7277 #16093
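The essence of (1), as a tiny standalone sketch (the stored-hash plumbing is
illustrative, not the driver's actual code; `getFileHash` comes from
`GHC.Fingerprint` in base):
```hs
import GHC.Fingerprint (Fingerprint, getFileHash)

-- Recompile only if the content hash of the source differs from the hash
-- recorded when the module was last compiled. Copying files around (which
-- changes modification times but not contents) no longer forces recompilation.
sourceChanged :: FilePath -> Maybe Fingerprint -> IO Bool
sourceChanged src storedHash = do
  newHash <- getFileHash src
  pure (Just newHash /= storedHash)
```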
|
|
|
|
|
| |
This makes it more robust to people running it with `quick` flavour and
so on.
|
|
|
|
|
| |
The test's maximum memory usage improves dramatically with the fixes to
memory usage in the demand analyser from #15455.
|
|
|
|
| |
Fixes #11545
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
| |
As #19293 realises, this one keeps on flip-flopping by 2.5%
depending on how many modules there are within the GHC package.
We should revert this once we have figured out how to fix what's going on.
|
|
|
|
|
|
|
|
|
| |
This test flip-flops by +-1% in arbitrary changes in CI.
While playing around with `-dunique-increment`, I could reproduce
variations of 3% in compiler allocations, so I set the acceptance window
accordingly.
Fixes #19414.
|
|
|
|
| |
This reverts commit 4a9d856d21c67b3328e26aa68a071ec9a824a7bb.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit also consolidates documentation in the user
manual around UndecidableSuperClasses, UndecidableInstances,
and FlexibleContexts.
Close #19186.
Close #19187.
Test case: typecheck/should_compile/T19186,
typecheck/should_fail/T19187{,a}
|
|
|
|
|
|
|
| |
Instead of producing auxiliary con2tag bindings we now rely on
dataToTag#, eliminating a fair bit of generated code.
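Roughly, derived code can now use the primop directly instead of going through
a generated con2tag binding. A hand-written illustration (not the actual
deriving output):
```hs
{-# LANGUAGE MagicHash #-}
import GHC.Exts (Int (I#), dataToTag#)

data Colour = Red | Green | Blue

-- Instead of a generated con2tag_Colour binding, read the tag directly.
tagOf :: Colour -> Int
tagOf c = I# (dataToTag# c)
```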
Co-Authored-By: Ben Gamari <ben@well-typed.com>
|
| |
|
| |
| |
As outlined in #18903, interleaving usage and strictness demands not
only means a more compact demand representation, but also allows us to
express demands that we weren't easily able to express before.
Call demands are *relative* in the sense that a call demand `Cn(cd)`
on `g` says "`g` is called `n` times. *Whenever `g` is called*, the
result is used according to `cd`". Example from #18903:
```hs
h :: Int -> Int
h m =
let g :: Int -> (Int,Int)
g 1 = (m, 0)
g n = (2 * n, 2 `div` n)
{-# NOINLINE g #-}
in case m of
1 -> 0
2 -> snd (g m)
_ -> uncurry (+) (g m)
```
Without the interleaved representation, we would just get `L` for the
strictness demand on `g`. Now we can express that whenever `g` is
called, the second component of its result is used strictly, by giving
`g` the demand `1C1(P(1P(U),SP(U)))`. This would allow Nested CPR to unbox the
division, for example.
Fixes #18903.
While fixing regressions, I also discovered and fixed #18957.
Metric Decrease:
T13253-spj
|
|\ |
|
| |
| |
| |
| | |
These all have a maximum residency of over 2 GB.
|
| | |
|
|/
|
|
|
|
|
| |
Progress towards #18842. As @sgraf812 points out, widening the window is
dangerous until the exponential described in #17658 is fixed. But this
test has caused enough misery and is low stakes enough that we and
@bgamari think it's worth it in this one case for the time being.
|
|
|
|
|
|
|
|
| |
Makes it possible for GHC to optimize away the intermediate Generic representation
for more types.
Metric Increase:
T12227
|
|
|
|
|
|
|
|
|
| |
-------------------------
Metric Decrease:
T12425
Metric Increase:
T17516
-------------------------
|
| |
| |
This patch fixes #18223, which made GHC generate an exponential
amount of code. There are three quite separate changes in here
1. Re-engineer eta-expansion (again). The eta-expander was
generating lots of intermediate stuff, which could be optimised
away, but which choked the simplifier meanwhile. Relatively
easy to kill it off at source.
See Note [The EtaInfo mechanism] in GHC.Core.Opt.Arity.
The main new thing is the use of pushCoArg in getArg_maybe.
2. Stop Specialise specialising DFuns. This is the cause of a huge
(and utterly unnecessary) blowup in program size in #18223.
See Note [Do not specialise DFuns] in GHC.Core.Opt.Specialise.
I also refactored the Specialise monad a bit... it was silly,
because it passed on unchanging values as if they were mutable
state.
3. Do an extra Simplifier run, after SpecConstr and before
late-Specialise. I found (investigating perf/compiler/T16473)
that failing to do this was crippling *both* SpecConstr *and*
Specialise. See Note [Simplify after SpecConstr] in
GHC.Core.Opt.Pipeline.
This change does mean an extra run of the Simplifier, but only
with -O2, and I think that's acceptable.
T16473 allocates *three* times less with this change. (I changed
it to check runtime rather than compile time.)
Some smaller consequences
* I moved pushCoercion, pushCoArg and friends from SimpleOpt
to Arity, because it was needed by the new etaInfoApp.
And pushCoValArg now returns a MCoercion rather than Coercion for
the argument Coercion.
* A minor, incidental improvement to Core pretty-printing
This does fix #18223 (which was otherwise uncompilable). Hooray. But
there is still a big intermediate blow-up because there are some very deeply
nested types in that program.
Modest reductions in compile-time allocation on a couple of benchmarks
T12425 -2.0%
T13253 -10.3%
Metric increase with -O2, due to extra simplifier run
T9233 +5.8%
T12227 +1.8%
T15630 +5.0%
There is a spurious apparent increase in heap residency on T9630,
on some architectures at least. I tried it with -G1 and the residency
is essentially unchanged.
Metric Increase:
T9233
T12227
T9630
Metric Decrease:
T12425
T13253
|
|
|
|
|
|
|
| |
Previously it collected everything, including "max bytes used". This is
problematic since the test makes no attempt to control for deviations in
GC timing, resulting in high variability. Fix this by only collecting
"bytes allocated".
|