summaryrefslogtreecommitdiff
path: root/compiler
Commit message (Collapse)AuthorAgeFilesLines
* CmmToLlvm: Don't aliasify builtin LLVM variableswip/llvm-initializersBen Gamari2022-08-121-4/+16
| | | | | | | | | Our aliasification logic would previously turn builtin LLVM variables into aliases, which apparently confuses LLVM. This manifested in initializers failing to be emitted, resulting in many profiling failures with the LLVM backend. Fixes #22019.
* ncg/aarch64: Don't use x18 register on AArch64/Darwinnormalcoder2022-08-101-0/+8
| | | | | | | | | Apple's ABI documentation [1] says: "The platforms reserve register x18. Don’t use this register." While this wasn't problematic in previous Darwin releases, macOS 13 appears to start zeroing this register periodically. See #21964. [1] https://developer.apple.com/documentation/xcode/writing-arm64-code-for-apple-platforms
* Add support for external static plugins (#20964)Sylvain Henry2022-08-105-10/+219
| | | | | | | | | | | | | | | | | | | | This patch adds a new command-line flag: -fplugin-library=<file-path>;<unit-id>;<module>;<args> used like this: -fplugin-library=path/to/plugin.so;package-123;Plugin.Module;["Argument","List"] It allows a plugin to be loaded directly from a shared library. With this approach, GHC doesn't compile anything for the plugin and doesn't load any .hi file for the plugin and its dependencies. As such GHC doesn't need to support two environments (one for plugins, one for target code), which was the more ambitious approach tracked in #14335. Fix #20964 Co-authored-by: Josh Meredith <joshmeredith2008@gmail.com>
* Fix size_up_alloc to account for UnliftedDatatypessheaf2022-08-091-4/+3
| | | | | | | | The size_up_alloc function mistakenly considered any type that isn't lifted to not allocate anything, which is wrong. What we want instead is to check the type isn't boxed. This accounts for (BoxedRep Unlifted). Fixes #21939
* Cleanups around pretty-printingKrzysztof Gogolewski2022-08-0912-62/+27
| | | | | | | | | | * Remove hack when printing OccNames. No longer needed since e3dcc0d5 * Remove unused `pprCmms` and `instance Outputable Instr` * Simplify `pprCLabel` (no need to pass platform) * Remove evil `Show`/`Eq` instances for `SDoc`. They were needed by ImmLit, but that can take just a String instead. * Remove instance `Outputable CLabel` - proper output of labels needs a platform, and is done by the `OutputableP` instance
* dataToTag#: Skip runtime tag check if argument is infered taggedAndreas Klebinger2022-08-082-11/+37
| | | | This addresses one part of #21710.
* NCG(x86): Compile add+shift as lea if possible.wip/andreask/add_mul_leaAndreas Klebinger2022-08-081-0/+36
|
* Add a primop to query the label of a threadBen Gamari2022-08-062-0/+11
|
* rts: Move thread labels into TSOBen Gamari2022-08-061-1/+3
| | | | | | | This eliminates the thread label HashTable and instead tracks this information in the TSO, allowing us to use proper StgArrBytes arrays for backing the label and greatly simplifying management of object lifetimes when we expose them to the user with the coming `threadLabel#` primop.
* Add primop to list threadsBen Gamari2022-08-062-0/+21
| | | | | | | A user came to #ghc yesterday wondering how best to check whether they were leaking threads. We ended up using the eventlog but it seems to me like it would be generally useful if Haskell programs could query their own threads.
* compiler: Eliminate two uses of foldr in favor of foldl'Ben Gamari2022-08-062-2/+2
| | | | | | These two uses constructed maps, which is a case where foldl' is generally more efficient since we avoid constructing an intermediate O(n)-depth stack.
* Change `-fprof-late` to insert cost centres after unfolding creation.Andreas Klebinger2022-08-0610-38/+164
| | | | | | | | | | | | | | | | | | | | | | | | | | The former behaviour of adding cost centres after optimization but before unfoldings are created is not available via the flag `prof-late-inline` instead. I also reduced the overhead of -fprof-late* by pushing the cost centres into lambdas. This means the cost centres will only account for execution of functions and not their partial application. Further I made LATE_CC cost centres it's own CC flavour so they now won't clash with user defined ones if a user uses the same string for a custom scc. LateCC: Don't put cost centres inside constructor workers. With -fprof-late they are rarely useful as the worker is usually inlined. Even if the worker is not inlined or we use -fprof-late-linline they are generally not helpful but bloat compile and run time significantly. So we just don't add sccs inside constructor workers. ------------------------- Metric Decrease: T13701 -------------------------
* StgToCmm: Fix isSimpleScrut when profiling is enabled.Andreas Klebinger2022-08-063-27/+32
| | | | | | | | | | | When profiling is enabled we must enter functions that might represent thunks in order for their sccs to show up in the profile. We might allocate even if the function is already evaluated in this case. So we can't consider any potential function thunk to be a simple scrut when profiling. Not doing so caused profiled binaries to segfault.
* Make dropTail comment a haddock commentAndreas Klebinger2022-08-061-1/+1
|
* codeGen/X86: Don't clobber switch variable in switch generationBen Gamari2022-08-051-2/+3
| | | | | | | | | Previously ce8745952f99174ad9d3bdc7697fd086b47cdfb5 assumed that it was safe to clobber the switch variable when generating code for a jump table since we were at the end of a block. However, this assumption is wrong; the register could be live in the jump target. Fixes #21968.
* findExternalRules: Don't needlessly traverse the list of rules.Andreas Klebinger2022-08-041-4/+3
|
* cmm: Remove unused ReadOnlyData16Cheng Shao2022-08-045-16/+0
| | | | We don't actually emit rodata16 sections anywhere.
* Store interfaces in ModIfaceCache more directlyMatthew Pickering2022-08-042-73/+49
| | | | | | | | I realised hydration was completely irrelavant for this cache because the ModDetails are pruned from the result. So now it simplifies things a lot to just store the ModIface and Linkable, which we can put into the cache straight away rather than wait for the final version of a HomeModInfo to appear.
* Fix leaks in --make mode when there are module loopsMatthew Pickering2022-08-046-96/+307
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch fixes quite a tricky leak where we would end up retaining stale ModDetails due to rehydrating modules against non-finalised interfaces. == Loops with multiple boot files It is possible for a module graph to have a loop (SCC, when ignoring boot files) which requires multiple boot files to break. In this case we must perform the necessary hydration steps before and after compiling modules which have boot files which are described above for corectness but also perform an additional hydration step at the end of the SCC to remove space leaks. Consider the following example: ┌───────┐ ┌───────┐ │ │ │ │ │ A │ │ B │ │ │ │ │ └─────┬─┘ └───┬───┘ │ │ ┌────▼─────────▼──┐ │ │ │ C │ └────┬─────────┬──┘ │ │ ┌────▼──┐ ┌───▼───┐ │ │ │ │ │ A-boot│ │ B-boot│ │ │ │ │ └───────┘ └───────┘ A, B and C live together in a SCC. Say we compile the modules in order A-boot, B-boot, C, A, B then when we compile A we will perform the hydration steps (because A has a boot file). Therefore C will be hydrated relative to A, and the ModDetails for A will reference C/A. Then when B is compiled C will be rehydrated again, and so B will reference C/A,B, its interface will be hydrated relative to both A and B. Now there is a space leak because say C is a very big module, there are now two different copies of ModDetails kept alive by modules A and B. The way to avoid this space leak is to rehydrate an entire SCC together at the end of compilation so that all the ModDetails point to interfaces for .hs files. In this example, when we hydrate A, B and C together then both A and B will refer to C/A,B. See #21900 for some more discussion. ------------------------------------------------------- In addition to this simple case, there is also the potential for a leak during parallel upsweep which is also fixed by this patch. Transcibed is Note [ModuleNameSet, efficiency and space leaks] Note [ModuleNameSet, efficiency and space leaks] ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ During unsweep the results of compiling modules are placed into a MVar, to find the environment the module needs to compile itself in the MVar is consulted and the HomeUnitGraph is set accordingly. The reason we do this is that precisely tracking module dependencies and recreating the HUG from scratch each time is very expensive. In serial mode (-j1), this all works out fine because a module can only be compiled after its dependencies have finished compiling and not interleaved with compiling module loops. Therefore when we create the finalised or no loop interfaces, the HUG only contains finalised interfaces. In parallel mode, we have to be more careful because the HUG variable can contain non-finalised interfaces which have been started by another thread. In order to avoid a space leak where a finalised interface is compiled against a HPT which contains a non-finalised interface we have to restrict the HUG to only the visible modules. The visible modules is recording in the ModuleNameSet, this is propagated upwards whilst compiling and explains which transitive modules are visible from a certain point. This set is then used to restrict the HUG before the module is compiled to only the visible modules and thus avoiding this tricky space leak. Efficiency of the ModuleNameSet is of utmost importance because a union occurs for each edge in the module graph. Therefore the set is represented directly as an IntSet which provides suitable performance, even using a UniqSet (which is backed by an IntMap) is too slow. The crucial test of performance here is the time taken to a do a no-op build in --make mode. See test "jspace" for an example which used to trigger this problem. Fixes #21900
* Force name selectors to ensure no reference to Ids enter the NameCacheMatthew Pickering2022-08-041-2/+2
| | | | | | I observed some unforced thunks in the NameCache which were retaining a whole Id, which ends up retaining a Type.. which ends up retaining old copies of HscEnv containing stale HomeModInfo.
* Force safeInferred to avoid retaining extra copy of DynFlagsMatthew Pickering2022-08-041-1/+2
| | | | | | This will only have a (very) modest impact on memory but we don't want to retain old copies of DynFlags hanging around so best to force this value.
* Add a note about about W/W for unlifting strict argumentsAndreas Klebinger2022-08-042-1/+48
| | | | This fixes #21236.
* Fix TH + defer-type-errors interaction (#21920)Krzysztof Gogolewski2022-08-041-6/+0
| | | | | | Previously, we had to disable defer-type-errors in splices because of #7276. But this fix is no longer necessary, the test T7276 no longer segfaults and is now correctly deferred.
* Remove TCvSubst and use Subst for both term and type-level substYiyun Liu2022-08-0447-718/+642
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch removes the TCvSubst data type and instead uses Subst as the environment for both term and type level substitution. This change is partially motivated by the existential type proposal, which will introduce types that contain expressions and therefore forces us to carry around an "IdSubstEnv" even when substituting for types. It also reduces the amount of code because "Subst" and "TCvSubst" share a lot of common operations. There isn't any noticeable impact on performance (geo. mean for ghc/alloc is around 0.0% but we have -94 loc and one less data type to worry abount). Currently, the "TCvSubst" data type for substitution on types is identical to the "Subst" data type except the former doesn't store "IdSubstEnv". Using "Subst" for type-level substitution means there will be a redundant field stored in the data type. However, in cases where the substitution starts from the expression, using "Subst" for type-level substitution saves us from having to project "Subst" into a "TCvSubst". This probably explains why the allocation is mostly even despite the redundant field. The patch deletes "TCvSubst" and moves "Subst" and its relevant functions from "GHC.Core.Subst" into "GHC.Core.TyCo.Subst". Substitution on expressions is still defined in "GHC.Core.Subst" so we don't have to expose the definition of "Expr" in the hs-boot file that "GHC.Core.TyCo.Subst" must import to refer to "IdSubstEnv" (whose codomain is "CoreExpr"). Most functions named fooTCvSubst are renamed into fooSubst with a few exceptions (e.g. "isEmptyTCvSubst" is a distinct function from "isEmptySubst"; the former ignores the emptiness of "IdSubstEnv"). These exceptions mainly exist for performance reasons and will go away when "Expr" and "Type" are mutually recursively defined (we won't be able to take those shortcuts if we can't make the assumption that expressions don't appear in types).
* Add -dsuppress-coercion-types to make coercions even smaller.Andreas Klebinger2022-08-024-3/+14
| | | | | Instead of `` `cast` <Co:11> :: (Some -> Really -> Large Type)`` simply print `` `cast` <Co:11> :: ... ``
* driver: Don't create LinkNodes when -no-link is enabledMatthew Pickering2022-07-281-1/+1
| | | | Fixes #21866
* Regression test for #21848Simon Peyton Jones2022-07-261-1/+1
|
* Avoid as pipeline when compiling cCheng Shao2022-07-253-14/+22
|
* Add location to cc phaseCheng Shao2022-07-253-6/+6
|
* Get the in-scope set right in FamInstEnv.injectiveBranchesSimon Peyton Jones2022-07-2513-130/+159
| | | | | | | | | | | | | There was an assert error, as Gergo pointed out in #21896. I fixed this by adding an InScopeSet argument to tcUnifyTyWithTFs. And also to GHC.Core.Unify.niFixTCvSubst. I also took the opportunity to get a couple more InScopeSets right, and to change some substTyUnchecked into substTy. This MR touches a lot of other files, but only because I also took the opportunity to introduce mkInScopeSetList, and use it.
* Teach SpecConstr about typeDeterminesValueSimon Peyton Jones2022-07-256-140/+196
| | | | | | | | | | | | | | | | | | | | | | This patch addresses #21831, point 2. See Note [generaliseDictPats] in SpecConstr I took the opportunity to refactor the construction of specialisation rules a bit, so that the rule name says what type we are specialising at. Surprisingly, there's a 20% decrease in compile time for test perf/compiler/T18223. I took a look at it, and the code size seems the same throughout. I did a quick ticky profile which seemed to show a bit less substitution going on. Hmm. Maybe it's the "don't do eta-expansion in stable unfoldings" patch, which is part of the same MR as this patch. Anyway, since it's a move in the right direction, I didn't think it was worth looking into further. Metric Decrease: T18223
* Switch off eta-expansion in rules and unfoldingsSimon Peyton Jones2022-07-251-8/+14
| | | | | | I think this change will make little difference except to reduce clutter. But that's it -- if it causes problems we can switch it on again.
* Fix isEvaldUnfolding and isValueUnfoldingSimon Peyton Jones2022-07-251-0/+2
| | | | This fixes (1) in #21831. Easy, obviously correct.
* Fix a small buglet in tryEtaReduceSimon Peyton Jones2022-07-252-7/+9
| | | | | | | Gergo points out (#21801) that GHC.Core.Opt.Arity.tryEtaReduce was making an ill-formed cast. It didn't matter, because the subsequent guard discarded it; but still worth fixing. Spurious warnings are distracting.
* More improvements to worker/wrapperSimon Peyton Jones2022-07-253-145/+166
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch fixes #21888, and simplifies finaliseArgBoxities by eliminating the (recently introduced) data type FinalDecision. A delicate interaction meant that this patch commit d1c25a48154236861a413e058ea38d1b8320273f Date: Tue Jul 12 16:33:46 2022 +0100 Refactor wantToUnboxArg a bit make worker/wrapper go into an infinite loop. This patch fixes it by narrowing the handling of case (B) of Note [Boxity for bottoming functions], to deal only the arguemnts that are type variables. Only then do we drop the trimBoxity call, which is what caused the bug. I also * Added documentation of case (B), which was previously completely un-mentioned. And a regression test, T21888a, to test it. * Made unboxDeeplyDmd stop at lazy demands. It's rare anyway for a bottoming function to have a lazy argument (mainly when the data type is recursive and then we don't want to unbox deeply). Plus there is Note [No lazy, Unboxed demands in demand signature] * Refactored the Case equation for dmdAnal a bit, to do less redundant pattern matching.
* Fix the interaction of operator sections and deep subsumptionSimon Peyton Jones2022-07-253-8/+50
| | | | Fixes DeepSubsumption08
* Add tests that -XHaskell98 and -XHaskell2010 enable DeepSubsumptionMatthew Pickering2022-07-251-1/+3
|
* Implement DeepSubsumptionSimon Peyton Jones2022-07-2512-175/+606
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This MR adds the language extension -XDeepSubsumption, implementing GHC proposal #511. This change mitigates the impact of GHC proposal The changes are highly localised, by design. See Note [Deep subsumption] in GHC.Tc.Utils.Unify. The main changes are: * Add -XDeepSubsumption, which is on by default in Haskell98 and Haskell2010, but off in Haskell2021. -XDeepSubsumption largely restores the behaviour before the "simple subsumption" change. -XDeepSubsumpition has a similar flavour as -XNoMonoLocalBinds: it makes type inference more complicated and less predictable, but it may be convenient in practice. * The main changes are in: * GHC.Tc.Utils.Unify.tcSubType, which does deep susumption and eta-expanansion * GHC.Tc.Utils.Unify.tcSkolemiseET, which does deep skolemisation * In GHC.Tc.Gen.App.tcApp we call tcSubTypeNC to match the result type. Without deep subsumption, unifyExpectedType would be sufficent. See Note [Deep subsumption] in GHC.Tc.Utils.Unify. * There are no changes to Quick Look at all. * The type of `withDict` becomes ambiguous; so add -XAllowAmbiguousTypes to GHC.Magic.Dict * I fixed a small but egregious bug in GHC.Core.FVs.varTypeTyCoFVs, where we'd forgotten to take the free vars of the multiplicity of an Id. * I also had to fix tcSplitNestedSigmaTys When I did the shallow-subsumption patch commit 2b792facab46f7cdd09d12e79499f4e0dcd4293f Date: Sun Feb 2 18:23:11 2020 +0000 Simple subsumption I changed tcSplitNestedSigmaTys to not look through function arrows any more. But that was actually an un-forced change. This function is used only in * Improving error messages in GHC.Tc.Gen.Head.addFunResCtxt * Validity checking for default methods: GHC.Tc.TyCl.checkValidClass * A couple of calls in the GHCi debugger: GHC.Runtime.Heap.Inspect All to do with validity checking and error messages. Acutally its fine to look under function arrows here, and quite useful a test DeepSubsumption05 (a test motivated by a build failure in the `lens` package) shows. The fix is easy. I added Note [tcSplitNestedSigmaTys].
* ghc-boot: Clean up UTF-8 codecsBen Gamari2022-07-226-13/+13
| | | | | | | | In preparation for moving the UTF-8 codecs into `base`: * Move them to GHC.Utils.Encoding.UTF8 * Make names more consistent * Add some Haddocks
* Make the specialiser deal better with specialised methodsSimon Peyton Jones2022-07-222-187/+230
| | | | | | | | | | | This patch fixes #21848, by being more careful to update unfoldings in the type-class specialiser. See the new Note [Update unfolding after specialisation] Now that we are being so much more careful about unfoldings, it turned out that I could dispense with se_interesting, and all its tricky corners. Hooray. This fixes #21368.
* Refactored Simplify passDominik Peteler2022-07-2226-5664/+5906
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Removed references to driver from GHC.Core.LateCC, GHC.Core.Simplify namespace and GHC.Core.Opt.Stats. Also removed services from configuration records. * Renamed GHC.Core.Opt.Simplify to GHC.Core.Opt.Simplify.Iteration. * Inlined `simplifyPgm` and renamed `simplifyPgmIO` to `simplifyPgm` and moved the Simplify driver to GHC.Core.Opt.Simplify. * Moved `SimplMode` and `FloatEnable` to GHC.Core.Opt.Simplify.Env. * Added a configuration record `TopEnvConfig` for the `SimplTopEnv` environment in GHC.Core.Opt.Simplify.Monad. * Added `SimplifyOpts` and `SimplifyExprOpts`. Provide initialization functions for those in a new module GHC.Driver.Config.Core.Opt.Simplify. Also added initialization functions for `SimplMode` to that module. * Moved `CoreToDo` and friends to a new module GHC.Core.Pipeline.Types and the counting types and functions (`SimplCount` and `Tick`) to new module GHC.Core.Opt.Stats. * Added getter functions for the fields of `SimplMode`. The pedantic bottoms option and the platform are retrieved from the ArityOpts and RuleOpts and the getter functions allow us to retrieve values from `SpecEnv` without the knowledge where the data is stored exactly. * Moved the coercion optimization options from the top environment to `SimplMode`. This way the values left in the top environment are those dealing with monadic functionality, namely logging, IO related stuff and counting. Added a note "The environments of the Simplify pass". * Removed `CoreToDo` from GHC.Core.Lint and GHC.CoreToStg.Prep and got rid of `CoreDoSimplify`. Pass `SimplifyOpts` in the `CoreToDo` type instead. * Prep work before removing `InteractiveContext` from `HscEnv`.
* Make withDict opaque to the specialisersheaf2022-07-214-46/+95
| | | | | | | | | | | | | | | | | As pointed out in #21575, it is not sufficient to set withDict to inline after the typeclass specialiser, because we might inline withDict in one module and then import it in another, and we run into the same problem. This means we could still end up with incorrect runtime results because the typeclass specialiser would assume that distinct typeclass evidence terms at the same type are equal, when this is not necessarily the case when using withDict. Instead, this patch introduces a new magicId, 'nospec', which is only inlined in CorePrep. We make use of it in the definition of withDict to ensure that the typeclass specialiser does not common up distinct typeclass evidence terms. Fixes #21575
* driver: Fix implementation of -SMatthew Pickering2022-07-191-14/+17
| | | | | | | We were failing to stop before running the assembler so the object file was also created. Fixes #21869
* Refactor wantToUnboxArg a bitSimon Peyton Jones2022-07-184-174/+217
| | | | | | | | | | | | | | | | | | | * Rename GHC.Core.Opt.WorkWrap.Utils.wantToUnboxArg to canUnboxArg and similarly wantToUnboxResult to canUnboxResult. * Add GHC.Core.Opt.DmdAnal.wantToUnboxArg as a wrapper for the (new) GHC.Core.Opt.WorkWrap.Utils.canUnboxArg, avoiding some yukky duplication. I decided it was clearer to give it a new data type for its return type, because I nedeed the FD_RecBox case which was not otherwise readiliy expressible. * Add dcpc_args to WorkWrap.Utils.DataConPatContext for the payload * Get rid of the Unlift constructor of UnboxingDecision, eliminate two panics, and two arguments to canUnboxArg (new name). Much nicer now.
* Make SetLevels honour floatConstsSimon Peyton Jones2022-07-181-4/+4
| | | | | | | | This fix, in the definition of profitableFloat, is just for consistency. `floatConsts` should do what it says! I don't think it'll affect anything much, though.
* Inline mapAccumLMSimon Peyton Jones2022-07-181-0/+5
| | | | | | | | | This function is called in inner loops in the compiler, and it's overloaded and higher order. Best just to inline it. This popped up when I was looking at something else. I think perhaps GHC is delicately balanced on the cusp of inlining this automatically.
* Make transferPolyIdInfo work for CPRSimon Peyton Jones2022-07-182-5/+14
| | | | I don't know why this hasn't bitten us before, but it was plain wrong.
* White space only in FamInstEnvSimon Peyton Jones2022-07-181-30/+34
|
* Rule matching: Don't compute the FVs if we don't look at them.Andreas Klebinger2022-07-182-3/+17
|
* Refactor SpecConstr to use treat bindings uniformlySimon Peyton Jones2022-07-181-264/+260
| | | | | | | | | | | | | | | | | | | | | | This patch, provoked by #21457, simplifies SpecConstr by treating top-level and nested bindings uniformly (see the new scBind). * Eliminates the mysterious scTopBindEnv * Refactors scBind to handle top-level and nested definitions uniformly. * But, for now at least, continues the status quo of not doing SpecConstr for top-level non-recursive bindings. (In contrast we do specialise nested non-recursive bindings, although the original paper did not; see Note [Local let bindings].) I tried the effect of specialising top-level non-recursive bindings (which is now dead easy to switch on, unlike before) but found some regressions, so I backed off. See !8135. It's a pure refactoring. I think it'll do a better job in a few cases, but there is no regression test.