summaryrefslogtreecommitdiff
path: root/compiler/GHC/Data
Commit message (Collapse)AuthorAgeFilesLines
* Indent closing "#-}" to silence HLintwip/T21623Simon Peyton Jones2022-11-111-5/+5
|
* Type vs Constraint: finally nailedSimon Peyton Jones2022-11-111-7/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This big patch addresses the rats-nest of issues that have plagued us for years, about the relationship between Type and Constraint. See #11715/#21623. The main payload of the patch is: * To introduce CONSTRAINT :: RuntimeRep -> Type * To make TYPE and CONSTRAINT distinct throughout the compiler Two overview Notes in GHC.Builtin.Types.Prim * Note [TYPE and CONSTRAINT] * Note [Type and Constraint are not apart] This is the main complication. The specifics * New primitive types (GHC.Builtin.Types.Prim) - CONSTRAINT - ctArrowTyCon (=>) - tcArrowTyCon (-=>) - ccArrowTyCon (==>) - funTyCon FUN -- Not new See Note [Function type constructors and FunTy] and Note [TYPE and CONSTRAINT] * GHC.Builtin.Types: - New type Constraint = CONSTRAINT LiftedRep - I also stopped nonEmptyTyCon being built-in; it only needs to be wired-in * Exploit the fact that Type and Constraint are distinct throughout GHC - Get rid of tcView in favour of coreView. - Many tcXX functions become XX functions. e.g. tcGetCastedTyVar --> getCastedTyVar * Kill off Note [ForAllTy and typechecker equality], in (old) GHC.Tc.Solver.Canonical. It said that typechecker-equality should ignore the specified/inferred distinction when comparein two ForAllTys. But that wsa only weakly supported and (worse) implies that we need a separate typechecker equality, different from core equality. No no no. * GHC.Core.TyCon: kill off FunTyCon in data TyCon. There was no need for it, and anyway now we have four of them! * GHC.Core.TyCo.Rep: add two FunTyFlags to FunCo See Note [FunCo] in that module. * GHC.Core.Type. Lots and lots of changes driven by adding CONSTRAINT. The key new function is sORTKind_maybe; most other changes are built on top of that. See also `funTyConAppTy_maybe` and `tyConAppFun_maybe`. * Fix a longstanding bug in GHC.Core.Type.typeKind, and Core Lint, in kinding ForAllTys. See new tules (FORALL1) and (FORALL2) in GHC.Core.Type. (The bug was that before (forall (cv::t1 ~# t2). blah), where blah::TYPE IntRep, would get kind (TYPE IntRep), but it should be (TYPE LiftedRep). See Note [Kinding rules for types] in GHC.Core.Type. * GHC.Core.TyCo.Compare is a new module in which we do eqType and cmpType. Of course, no tcEqType any more. * GHC.Core.TyCo.FVs. I moved some free-var-like function into this module: tyConsOfType, visVarsOfType, and occCheckExpand. Refactoring only. * GHC.Builtin.Types. Compiletely re-engineer boxingDataCon_maybe to have one for each /RuntimeRep/, rather than one for each /Type/. This dramatically widens the range of types we can auto-box. See Note [Boxing constructors] in GHC.Builtin.Types The boxing types themselves are declared in library ghc-prim:GHC.Types. GHC.Core.Make. Re-engineer the treatment of "big" tuples (mkBigCoreVarTup etc) GHC.Core.Make, so that it auto-boxes unboxed values and (crucially) types of kind Constraint. That allows the desugaring for arrows to work; it gathers up free variables (including dictionaries) into tuples. See Note [Big tuples] in GHC.Core.Make. There is still work to do here: #22336. But things are better than before. * GHC.Core.Make. We need two absent-error Ids, aBSENT_ERROR_ID for types of kind Type, and aBSENT_CONSTRAINT_ERROR_ID for vaues of kind Constraint. Ditto noInlineId vs noInlieConstraintId in GHC.Types.Id.Make; see Note [inlineId magic]. * GHC.Core.TyCo.Rep. Completely refactor the NthCo coercion. It is now called SelCo, and its fields are much more descriptive than the single Int we used to have. A great improvement. See Note [SelCo] in GHC.Core.TyCo.Rep. * GHC.Core.RoughMap.roughMatchTyConName. Collapse TYPE and CONSTRAINT to a single TyCon, so that the rough-map does not distinguish them. * GHC.Core.DataCon - Mainly just improve documentation * Some significant renamings: GHC.Core.Multiplicity: Many --> ManyTy (easier to grep for) One --> OneTy GHC.Core.TyCo.Rep TyCoBinder --> GHC.Core.Var.PiTyBinder GHC.Core.Var TyCoVarBinder --> ForAllTyBinder AnonArgFlag --> FunTyFlag ArgFlag --> ForAllTyFlag GHC.Core.TyCon TyConTyCoBinder --> TyConPiTyBinder Many functions are renamed in consequence e.g. isinvisibleArgFlag becomes isInvisibleForAllTyFlag, etc * I refactored FunTyFlag (was AnonArgFlag) into a simple, flat data type data FunTyFlag = FTF_T_T -- (->) Type -> Type | FTF_T_C -- (-=>) Type -> Constraint | FTF_C_T -- (=>) Constraint -> Type | FTF_C_C -- (==>) Constraint -> Constraint * GHC.Tc.Errors.Ppr. Some significant refactoring in the TypeEqMisMatch case of pprMismatchMsg. * I made the tyConUnique field of TyCon strict, because I saw code with lots of silly eval's. That revealed that GHC.Settings.Constants.mAX_SUM_SIZE can only be 63, because we pack the sum tag into a 6-bit field. (Lurking bug squashed.) Fixes * #21530 Updates haddock submodule slightly. Performance changes ~~~~~~~~~~~~~~~~~~~ I was worried that compile times would get worse, but after some careful profiling we are down to a geometric mean 0.1% increase in allocation (in perf/compiler). That seems fine. There is a big runtime improvement in T10359 Metric Decrease: LargeRecord MultiLayerModulesTH_OneShot T13386 T13719 Metric Increase: T8095
* add new modules for reducibility and WebAssembly translationNorman Ramsey2022-11-111-0/+264
|
* add the two key graph modules from Martin Erwig's FGLNorman Ramsey2022-11-113-0/+1020
| | | | | | | | | | | | | | | | | | | | | | | | | | Martin Erwig's FGL (Functional Graph Library) provides an "inductive" representation of graphs. A general graph has labeled nodes and labeled edges. The key operation on a graph is to decompose it by removing one node, together with the edges that connect the node to the rest of the graph. There is also an inverse composition operation. The decomposition and composition operations make this representation of graphs exceptionally well suited to implement graph algorithms in which the graph is continually changing, as alluded to in #21259. This commit adds `GHC.Data.Graph.Inductive.Graph`, which defines the interface, and `GHC.Data.Graph.Inductive.PatriciaTree`, which provides an implementation. Both modules are taken from `fgl-5.7.0.3` on Hackage, with these changes: - Copyright and license text have been copied into the files themselves, not stored separately. - Some calls to `error` have been replaced with calls to `panic`. - Conditional-compilation support for older versions of GHC, `containers`, and `base` has been removed.
* Define `Infinite` list and use where appropriate.M Farkas-Dyck2022-11-081-0/+194
| | | | | | | | Also add perf test for infinite list fusion. In particular, in `GHC.Core`, often we deal with infinite lists of roles. Also in a few locations we deal with infinite lists of names. Thanks to simonpj for helping to write the Note [Fusion for `Infinite` lists].
* Export pprTrace and friends from GHC.Prelude.Andreas Klebinger2022-11-034-4/+3
| | | | | Introduces GHC.Prelude.Basic which can be used in modules which are a dependency of the ppr code.
* Remove source location information from interface filesOwen Shepherd2022-10-271-8/+8
| | | | | | | | | | | | | This change aims to minimize source location information leaking into interface files, which makes ABI hashes dependent on the build location. The `Binary (Located a)` instance has been removed completely. It seems that the HIE interface still needs the ability to serialize SrcSpans, but by wrapping the instances, it should be a lot more difficult to inadvertently add source location information.
* Cleanup String/FastString conversionsKrzysztof Gogolewski2022-10-251-32/+2
| | | | | | | Remove unused mkPtrString and isUnderscoreFS. We no longer use mkPtrString since 1d03d8bef96. Remove unnecessary conversions between FastString and String and back.
* Enforce invariant of `ListBag` constructor.M Farkas-Dyck2022-10-191-16/+26
|
* Make `Functor` a superclass of `TrieMap`, which lets us derive the `map` ↵M Farkas-Dyck2022-10-181-19/+38
| | | | functions.
* Enforce internal invariant of OrdList and fix bugs in viewCons / viewSnocBodigrim2022-10-011-13/+23
| | | | | `viewCons` used to ignore `Many` constructor completely, returning `VNothing`. `viewSnoc` violated internal invariant of `Many` being a non-empty list.
* Scrub various partiality involving empty lists.M Farkas-Dyck2022-09-301-2/+3
| | | | Avoids some uses of `head` and `tail`, and some panics when an argument is null.
* Eliminate headFS, use unconsFS insteadBodigrim2022-09-281-6/+0
| | | | | A small step towards #22185 to avoid partial functions + safe implementation of `startsWithUnderscore`.
* Apply some tricks to speed up core lint.Andreas Klebinger2022-09-281-0/+50
| | | | | | | | | | | | | | | | | | | | | | | | | Below are the noteworthy changes and if given their impact on compiler allocations for a type heavy module: * Use the oneShot trick on LintM * Use a unboxed tuple for the result of LintM: ~6% reduction * Avoid a thunk for the result of typeKind in lintType: ~5% reduction * lint_app: Don't allocate the error msg in the hot code path: ~4% reduction * lint_app: Eagerly force the in scope set: ~4% * nonDetCmpType: Try to short cut using reallyUnsafePtrEquality#: ~2% * lintM: Use a unboxed maybe for the `a` result: ~12% * lint_app: make go_app tail recursive to avoid allocating the go function as heap closure: ~7% * expandSynTyCon_maybe: Use a specialized data type For a less type heavy module like nofib/spectral/simple compiled with -O -dcore-lint allocations went down by ~24% and compile time by ~9%. ------------------------- Metric Decrease: T1969 -------------------------
* Clean up some. In particular:M Farkas-Dyck2022-09-174-39/+8
| | | | | | | | | | • Delete some dead code, largely under `GHC.Utils`. • Clean up a few definitions in `GHC.Utils.(Misc, Monad)`. • Clean up `GHC.Types.SrcLoc`. • Derive stock `Functor, Foldable, Traversable` for more types. • Derive more instances for newtypes. Bump haddock submodule.
* Fix typosEric Lindblad2022-09-144-6/+6
| | | | | | | This fixes various typos and spelling mistakes in the compiler. Fixes #21891
* ghc-boot: Clean up UTF-8 codecsBen Gamari2022-07-222-8/+8
| | | | | | | | In preparation for moving the UTF-8 codecs into `base`: * Move them to GHC.Utils.Encoding.UTF8 * Make names more consistent * Add some Haddocks
* CorePrep: Don't speculatively evaluate recursive calls (#20836)Sebastian Graf2022-06-201-1/+7
| | | | | | | | | | | In #20836 we have optimised a terminating program into an endless loop, because we speculated the self-recursive call of a recursive DFun. Now we track the set of enclosing recursive binders in CorePrep to prevent speculation of such self-recursive calls. See the updates to Note [Speculative evaluation] for details. Fixes #20836.
* Use UnionListsOrd instead of UnionLists in most places.Andreas Klebinger2022-05-241-2/+15
| | | | This should get rid of most, if not all "Overlong lists" errors and fix #20016
* Fix all invalid haddock comments in the compilerZubin Duggal2022-03-291-6/+6
| | | | Fixes #20935 and #20924
* hi haddock: Lex and store haddock docs in interface filesZubin Duggal2022-03-232-0/+45
| | | | | | | | | | | | | | | | | | Names appearing in Haddock docstrings are lexed and renamed like any other names appearing in the AST. We currently rename names irrespective of the namespace, so both type and constructor names corresponding to an identifier will appear in the docstring. Haddock will select a given name as the link destination based on its own heuristics. This patch also restricts the limitation of `-haddock` being incompatible with `Opt_KeepRawTokenStream`. The export and documenation structure is now computed in GHC and serialised in .hi files. This can be used by haddock to directly generate doc pages without reparsing or renaming the source. At the moment the operation of haddock is not modified, that's left to a future patch. Updates the haddock submodule with the minimum changes needed.
* Make inert_cycle_breakers into a stack.Richard Eisenberg2022-03-021-1/+4
| | | | Close #20231.
* Introduce ConcreteTv metavariablessheaf2022-03-021-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch introduces a new kind of metavariable, by adding the constructor `ConcreteTv` to `MetaInfo`. A metavariable with `ConcreteTv` `MetaInfo`, henceforth a concrete metavariable, can only be unified with a type that is concrete (that is, a type that answers `True` to `GHC.Core.Type.isConcrete`). This solves the problem of dangling metavariables in `Concrete#` constraints: instead of emitting `Concrete# ty`, which contains a secret existential metavariable, we simply emit a primitive equality constraint `ty ~# concrete_tv` where `concrete_tv` is a fresh concrete metavariable. This means we can avoid all the complexity of canonicalising `Concrete#` constraints, as we can just re-use the existing machinery for `~#`. To finish things up, this patch then removes the `Concrete#` special predicate, and instead introduces the special predicate `IsRefl#` which enforces that a coercion is reflexive. Such a constraint is needed because the canonicaliser is quite happy to rewrite an equality constraint such as `ty ~# concrete_tv`, but such a rewriting is not handled by the rest of the compiler currently, as we need to make use of the resulting coercion, as outlined in the FixedRuntimeRep plan. The big upside of this approach (on top of simplifying the code) is that we can now selectively implement PHASE 2 of FixedRuntimeRep, by changing individual calls of `hasFixedRuntimeRep_MustBeRefl` to `hasFixedRuntimeRep` and making use of the obtained coercion.
* Derive some stock instances for OverridingBoolsheaf2022-02-251-3/+10
| | | | | | | | | This patch adds some derived instances to `GHC.Data.Bool.OverridingBool`. It also changes the order of the constructors, so that the derived `Ord` instance matches the behaviour for `Maybe Bool`. Fixes #20326
* Suggestions due to hlintMatthew Pickering2022-02-241-2/+1
| | | | | It turns out this job hasn't been running for quite a while (perhaps ever) so there are quite a few failures when running the linter locally.
* compiler: Introduce and use RoughMap for instance environmentsBen Gamari2022-02-041-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Here we introduce a new data structure, RoughMap, inspired by the previous `RoughTc` matching mechanism for checking instance matches. This allows [Fam]InstEnv to be implemented as a trie indexed by these RoughTc signatures, reducing the complexity of instance lookup and FamInstEnv merging (done during the family instance conflict test) from O(n) to O(log n). The critical performance improvement currently realised by this patch is in instance matching. In particular the RoughMap mechanism allows us to discount many potential instances which will never match for constraints involving type variables (see Note [Matching a RoughMap]). In realistic code bases matchInstEnv was accounting for 50% of typechecker time due to redundant work checking instances when simplifying instance contexts when deriving instances. With this patch the cost is significantly reduced. The larger constants in InstEnv creation do mean that a few small tests regress in allocations slightly. However, the runtime of T19703 is reduced by a factor of 4. Moreover, the compilation time of the Cabal library is slightly improved. A couple of test cases are included which demonstrate significant improvements in compile time with this patch. This unfortunately does not fix the testcase provided in #19703 but does fix #20933 ------------------------- Metric Decrease: T12425 Metric Increase: T13719 T9872a T9872d hard_hole_fits ------------------------- Co-authored-by: Matthew Pickering <matthewtpickering@gmail.com>
* Fix a few Note inconsistenciesBen Gamari2022-02-011-1/+1
|
* Rework the handling of SkolemInfoMatthew Pickering2022-01-291-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The main purpose of this patch is to attach a SkolemInfo directly to each SkolemTv. This fixes the large number of bugs which have accumulated over the years where we failed to report errors due to having "no skolem info" for particular type variables. Now the origin of each type varible is stored on the type variable we can always report accurately where it cames from. Fixes #20969 #20732 #20680 #19482 #20232 #19752 #10946 #19760 #20063 #13499 #14040 The main changes of this patch are: * SkolemTv now contains a SkolemInfo field which tells us how the SkolemTv was created. Used when reporting errors. * Enforce invariants relating the SkolemInfoAnon and level of an implication (ic_info, ic_tclvl) to the SkolemInfo and level of the type variables in ic_skols. * All ic_skols are TcTyVars -- Check is currently disabled * All ic_skols are SkolemTv * The tv_lvl of the ic_skols agrees with the ic_tclvl * The ic_info agrees with the SkolInfo of the implication. These invariants are checked by a debug compiler by checkImplicationInvariants. * Completely refactor kcCheckDeclHeader_sig which kept doing my head in. Plus, it wasn't right because it wasn't skolemising the binders as it decomposed the kind signature. The new story is described in Note [kcCheckDeclHeader_sig]. The code is considerably shorter than before (roughly 240 lines turns into 150 lines). It still has the same awkward complexity around computing arity as before, but that is a language design issue. See Note [Arity inference in kcCheckDeclHeader_sig] * I added new type synonyms MonoTcTyCon and PolyTcTyCon, and used them to be clear which TcTyCons have "finished" kinds etc, and which are monomorphic. See Note [TcTyCon, MonoTcTyCon, and PolyTcTyCon] * I renamed etaExpandAlgTyCon to splitTyConKind, becuase that's a better name, and it is very useful in kcCheckDeclHeader_sig, where eta-expansion isn't an issue. * Kill off the nasty `ClassScopedTvEnv` entirely. Co-authored-by: Simon Peyton Jones <simon.peytonjones@gmail.com>
* warnPprTrace: pass separately the reasonKrzysztof Gogolewski2022-01-111-3/+3
| | | | This makes it more similar to pprTrace, pprPanic etc.
* Perf: use SmallArray for primops' Ids cache (#20857)Sylvain Henry2022-01-061-0/+92
| | | | | | SmallArray doesn't perform bounds check (faster). Make primop tags start at 0 to avoid index arithmetic.
* Correct retypechecking in --make modeMatthew Pickering2021-11-251-1/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Note [Hydrating Modules] ~~~~~~~~~~~~~~~~~~~~~~~~ What is hydrating a module? * There are two versions of a module, the ModIface is the on-disk version and the ModDetails is a fleshed-out in-memory version. * We can **hydrate** a ModIface in order to obtain a ModDetails. Hydration happens in three different places * When an interface file is initially loaded from disk, it has to be hydrated. * When a module is finished compiling, we hydrate the ModIface in order to generate the version of ModDetails which exists in memory (see Note) * When dealing with boot files and module loops (see Note [Rehydrating Modules]) Note [Rehydrating Modules] ~~~~~~~~~~~~~~~~~~~~~~~~~~~ If a module has a boot file then it is critical to rehydrate the modules on the path between the two. Suppose we have ("R" for "recursive"): ``` R.hs-boot: module R where data T g :: T -> T A.hs: module A( f, T, g ) where import {-# SOURCE #-} R data S = MkS T f :: T -> S = ...g... R.hs: module R where data T = T1 | T2 S g = ...f... ``` After compiling A.hs we'll have a TypeEnv in which the Id for `f` has a type type uses the AbstractTyCon T; and a TyCon for `S` that also mentions that same AbstractTyCon. (Abstract because it came from R.hs-boot; we know nothing about it.) When compiling R.hs, we build a TyCon for `T`. But that TyCon mentions `S`, and it currently has an AbstractTyCon for `T` inside it. But we want to build a fully cyclic structure, in which `S` refers to `T` and `T` refers to `S`. Solution: **rehydration**. *Before compiling `R.hs`*, rehydrate all the ModIfaces below it that depend on R.hs-boot. To rehydrate a ModIface, call `typecheckIface` to convert it to a ModDetails. It's just a de-serialisation step, no type inference, just lookups. Now `S` will be bound to a thunk that, when forced, will "see" the final binding for `T`; see [Tying the knot](https://gitlab.haskell.org/ghc/ghc/-/wikis/commentary/compiler/tying-the-knot). But note that this must be done *before* compiling R.hs. When compiling R.hs, the knot-tying stuff above will ensure that `f`'s unfolding mentions the `LocalId` for `g`. But when we finish R, we carefully ensure that all those `LocalIds` are turned into completed `GlobalIds`, replete with unfoldings etc. Alas, that will not apply to the occurrences of `g` in `f`'s unfolding. And if we leave matters like that, they will stay that way, and *all* subsequent modules that import A will see a crippled unfolding for `f`. Solution: rehydrate both R and A's ModIface together, right after completing R.hs. We only need rehydrate modules that are * Below R.hs * Above R.hs-boot There might be many unrelated modules (in the home package) that don't need to be rehydrated. This dark corner is the subject of #14092. Suppose we add to our example ``` X.hs module X where import A data XT = MkX T fx = ...g... ``` If in `--make` we compile R.hs-boot, then A.hs, then X.hs, we'll get a `ModDetails` for `X` that has an AbstractTyCon for `T` in the the argument type of `MkX`. So: * Either we should delay compiling X until after R has beeen compiled. * Or we should rehydrate X after compiling R -- because it transitively depends on R.hs-boot. Ticket #20200 has exposed some issues to do with the knot-tying logic in GHC.Make, in `--make` mode. this particular issue starts [here](https://gitlab.haskell.org/ghc/ghc/-/issues/20200#note_385758). The wiki page [Tying the knot](https://gitlab.haskell.org/ghc/ghc/-/wikis/commentary/compiler/tying-the-knot) is helpful. Also closely related are * #14092 * #14103 Fixes tickets #20200 #20561
* Turn GHC.Data.Graph.Base.Graph into a newtypeSimon Jakobi2021-11-121-1/+1
|
* driver: Cache the transitive dependency calculation in ModuleGraphMatthew Pickering2021-11-111-2/+3
| | | | | | | | Two reasons for this change: 1. Avoid computing the transitive dependencies when compiling each module, this can save a lot of repeated work. 2. More robust to forthcoming changes to support multiple home units.
* Avoid GHC_STAGE and other include bitsJohn Ericson2021-11-051-7/+5
| | | | | | | | | We should strive to make our includes in terms of the RTS as much as possible. One place there that is not possible, the llvm version, we make a new tiny header Stage numbers are somewhat arbitrary, if we simple need a newer RTS, we should say so.
* Warn if unicode bidirectional formatting characters are found in the source ↵Zubin Duggal2021-10-261-1/+57
| | | | (#20263)
* Make fields of GlobalRdrElt strictMatthew Pickering2021-10-201-2/+8
| | | | | | | | In order to do this I thought it was prudent to change the list type to a bag type to avoid doing a lot of premature work in plusGRE because of ++. Fixes #19201
* Introduce Concrete# for representation polymorphism checkssheaf2021-10-171-1/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | PHASE 1: we never rewrite Concrete# evidence. This patch migrates all the representation polymorphism checks to the typechecker, using a new constraint form Concrete# :: forall k. k -> TupleRep '[] Whenever a type `ty` must be representation-polymorphic (e.g. it is the type of an argument to a function), we emit a new `Concrete# ty` Wanted constraint. If this constraint goes unsolved, we report a representation-polymorphism error to the user. The 'FRROrigin' datatype keeps track of the context of the representation-polymorphism check, for more informative error messages. This paves the way for further improvements, such as allowing type families in RuntimeReps and improving the soundness of typed Template Haskell. This is left as future work (PHASE 2). fixes #17907 #20277 #20330 #20423 #20426 updates haddock submodule ------------------------- Metric Decrease: T5642 -------------------------
* Don't use FastString for UTF-8 encoding onlySylvain Henry2021-10-021-2/+10
|
* Code Gen: Use more efficient block merging algorithmMatthew Pickering2021-09-171-0/+91
| | | | | | | | | | | | | | | | | | The previous algorithm scaled poorly when there was a large number of blocks and edges. The algorithm links together block chains which have edges between them in the CFG. The new algorithm uses a union find data structure in order to efficiently merge together blocks and calculate which block chain each block id belonds to. I copied the UnionFind data structure which already existed in Cabal into the GHC library rathert than reimplement it myself. This change results in a very significant reduction in allocations when compiling the mmark package. Ticket: #19471
* Driver rework pt3: the upsweepMatthew Pickering2021-08-181-2/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch specifies and simplifies the module cycle compilation in upsweep. How things work are described in the Note [Upsweep] Note [Upsweep] ~~~~~~~~~~~~~~ Upsweep takes a 'ModuleGraph' as input, computes a build plan and then executes the plan in order to compile the project. The first step is computing the build plan from a 'ModuleGraph'. The output of this step is a `[BuildPlan]`, which is a topologically sorted plan for how to build all the modules. ``` data BuildPlan = SingleModule ModuleGraphNode -- A simple, single module all alone but *might* have an hs-boot file which isn't part of a cycle | ResolvedCycle [ModuleGraphNode] -- A resolved cycle, linearised by hs-boot files | UnresolvedCycle [ModuleGraphNode] -- An actual cycle, which wasn't resolved by hs-boot files ``` The plan is computed in two steps: Step 1: Topologically sort the module graph without hs-boot files. This returns a [SCC ModuleGraphNode] which contains cycles. Step 2: For each cycle, topologically sort the modules in the cycle *with* the relevant hs-boot files. This should result in an acyclic build plan if the hs-boot files are sufficient to resolve the cycle. The `[BuildPlan]` is then interpreted by the `interpretBuildPlan` function. * `SingleModule nodes` are compiled normally by either the upsweep_inst or upsweep_mod functions. * `ResolvedCycles` need to compiled "together" so that the information which ends up in the interface files at the end is accurate (and doesn't contain temporary information from the hs-boot files.) - During the initial compilation, a `KnotVars` is created which stores an IORef TypeEnv for each module of the loop. These IORefs are gradually updated as the loop completes and provide the required laziness to typecheck the module loop. - At the end of typechecking, all the interface files are typechecked again in the retypecheck loop. This time, the knot-tying is done by the normal laziness based tying, so the environment is run without the KnotVars. * UnresolvedCycles are indicative of a proper cycle, unresolved by hs-boot files and are reported as an error to the user. The main trickiness of `interpretBuildPlan` is deciding which version of a dependency is visible from each module. For modules which are not in a cycle, there is just one version of a module, so that is always used. For modules in a cycle, there are two versions of 'HomeModInfo'. 1. Internal to loop: The version created whilst compiling the loop by upsweep_mod. 2. External to loop: The knot-tied version created by typecheckLoop. Whilst compiling a module inside the loop, we need to use the (1). For a module which is outside of the loop which depends on something from in the loop, the (2) version is used. As the plan is interpreted, which version of a HomeModInfo is visible is updated by updating a map held in a state monad. So after a loop has finished being compiled, the visible module is the one created by typecheckLoop and the internal version is not used again. This plan also ensures the most important invariant to do with module loops: > If you depend on anything within a module loop, before you can use the dependency, the whole loop has to finish compiling. The end result of `interpretBuildPlan` is a `[MakeAction]`, which are pairs of `IO a` actions and a `MVar (Maybe a)`, somewhere to put the result of running the action. This list is topologically sorted, so can be run in order to compute the whole graph. As well as this `interpretBuildPlan` also outputs an `IO [Maybe (Maybe HomeModInfo)]` which can be queried at the end to get the result of all modules at the end, with their proper visibility. For example, if any module in a loop fails then all modules in that loop will report as failed because the visible node at the end will be the result of retypechecking those modules together. Along the way we also fix a number of other bugs in the driver: * Unify upsweep and parUpsweep. * Fix #19937 (static points, ghci and -j) * Adds lots of module loop tests due to Divam. Also related to #20030 Co-authored-by: Divam Narula <dfordivam@gmail.com> ------------------------- Metric Decrease: T10370 -------------------------
* Put tracing functions into their own moduleSylvain Henry2021-06-222-2/+53
| | | | | | | | Now that Outputable is independent of DynFlags, we can put tracing functions using SDocs into their own module that doesn't transitively depend on any GHC.Driver.* module. A few modules needed to be moved to avoid loops in DEBUG mode.
* Perf: fix appendFSSylvain Henry2021-06-191-2/+2
| | | | To append 2 FastString we don't need to convert them into ByteString: use ShortByteString's Semigroup instance instead.
* DerivingVia for Hsc instances. GND for NonDetFastString and LexicalFastString.Baldur Blöndal2021-06-162-10/+7
|
* Driver Rework PatchMatthew Pickering2021-06-031-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | This patch comprises of four different but closely related ideas. The net result is fixing a large number of open issues with the driver whilst making it simpler to understand. 1. Use the hash of the source file to determine whether the source file has changed or not. This makes the recompilation checking more robust to modern build systems which are liable to copy files around changing their modification times. 2. Remove the concept of a "stable module", a stable module was one where the object file was older than the source file, and all transitive dependencies were also stable. Now we don't rely on the modification time of the source file, the notion of stability is moot. 3. Fix TH/plugin recompilation after the removal of stable modules. The TH recompilation check used to rely on stable modules. Now there is a uniform and simple way, we directly track the linkables which were loaded into the interpreter whilst compiling a module. This is an over-approximation but more robust wrt package dependencies changing. 4. Fix recompilation checking for dynamic object files. Now we actually check if the dynamic object file exists when compiling with -dynamic-too Fixes #19774 #19771 #19758 #17434 #11556 #9121 #8211 #16495 #7277 #16093
* Introduce Strict.Maybe, Strict.Pair (#19156)Vladislav Zavialov2021-05-231-0/+67
| | | | | | | | | | | | | This patch fixes a space leak related to the use of Maybe in RealSrcSpan by introducing a strict variant of Maybe. In addition to that, it also introduces a strict pair and uses the newly introduced strict data types in a few other places (e.g. the lexer/parser state) to reduce allocations. Includes a regression test.
* Remove useless {-# LANGUAGE CPP #-} pragmasSylvain Henry2021-05-127-9/+7
|
* Fully remove HsVersions.hSylvain Henry2021-05-124-9/+1
| | | | | | | | | | Replace uses of WARN macro with calls to: warnPprTrace :: Bool -> SDoc -> a -> a Remove the now unused HsVersions.h Bump haddock submodule
* Replace CPP assertions with Haskell functionsSylvain Henry2021-05-122-3/+2
| | | | | | | | | | | | | | | There is no reason to use CPP. __LINE__ and __FILE__ macros are now better replaced with GHC's CallStack. As a bonus, assert error messages now contain more information (function name, column). Here is the mapping table (HasCallStack omitted): * ASSERT: assert :: Bool -> a -> a * MASSERT: massert :: Bool -> m () * ASSERTM: assertM :: m Bool -> m () * ASSERT2: assertPpr :: Bool -> SDoc -> a -> a * MASSERT2: massertPpr :: Bool -> SDoc -> m () * ASSERTM2: assertPprM :: m Bool -> SDoc -> m ()
* Ensure assert from Control.Exception isn't usedSylvain Henry2021-05-121-2/+2
|
* Move GlobalVar macros into GHC.Utils.GlobalVarsSylvain Henry2021-05-121-1/+2
| | | | | That's the only place where they are used and they shouldn't be used elsewhere.