path: root/compiler/GHC/Cmm
Commit message | Author | Age | Files | Lines
...
* Save the type of breakpoints in the Breakpoint tick in STG | Luite Stegeman | 2021-03-20 | 1 | -1/+1
  GHCi needs to know the types of all breakpoints, but it's not possible to get the exprType of any expression in STG. This is preparation for the upcoming change to make GHCi bytecode from STG instead of Core.
* Fix some warnings when bootstrapping with GHC 9.0 | Ryan Scott | 2021-03-09 | 1 | -1/+1
  This fixes two classes of warnings that appear when bootstrapping with GHC 9.0:
  * `ghc-boot.cabal` was using `cabal-version: >=1.22`, which `cabal-install-3.4` now warns about, instead recommending the use of `cabal-version: 1.22`.
  * Several pattern matches were producing `Pattern match(es) are non-exhaustive` because of incorrect CPP. The pattern-match coverage checker _did_ become smarter in GHC 9.1, however, so I ended up needing to keep the CPP, adjusting them to use `#if __GLASGOW_HASKELL__ < 901` instead.
* Add option to give each usage of a data constructor its own info table | Matthew Pickering | 2021-03-03 | 1 | -15/+55
  The `-fdistinct-constructor-tables` flag will generate a fresh info table for the usage of any data constructor. This is useful for debugging: by inspecting the info table, you can now determine which usage of a constructor caused an allocation. Previously the info table always mapped to the definition site of the data constructor, which is useless for this purpose.
  In conjunction with `-hi` and `-finfo-table-map` this gives a more fine-grained understanding of where constructor allocations arise from in a program.
* Add -finfo-table-map which maps info tables to source positions | Matthew Pickering | 2021-03-03 | 2 | -7/+64
  This new flag embeds a lookup table from the address of an info table to information about that info table.
  The main interface for consulting the map is the `lookupIPE` C function
  > InfoProvEnt * lookupIPE(StgInfoTable *info)
  The `InfoProvEnt` has the following structure:
  > typedef struct InfoProv_{
  >     char * table_name;
  >     char * closure_desc;
  >     char * ty_desc;
  >     char * label;
  >     char * module;
  >     char * srcloc;
  > } InfoProv;
  >
  > typedef struct InfoProvEnt_ {
  >     StgInfoTable * info;
  >     InfoProv prov;
  >     struct InfoProvEnt_ *link;
  > } InfoProvEnt;
  The source positions are approximated in a similar way to the source positions for DWARF debugging information. They are only approximate but in our experience provide a good enough hint about where the problem might be. It is therefore recommended to use this flag in conjunction with `-g<n>` for more accurate locations.
  The lookup table is also emitted into the eventlog when it is available as it is intended to be used with the `-hi` profiling mode.
  Using this flag will significantly increase the size of the resulting object file but only by a factor of 2-3x in our experience.
* Reimplement Stream in "yoneda" style for efficiency | Matthew Pickering | 2021-02-26 | 1 | -9/+10
  'Stream' is implemented in the "yoneda" style for efficiency. By representing a stream in this manner 'fmap' and '>>=' operations are accumulated in the function parameters before being applied once when the stream is destroyed. In the old implementation each usage of 'mapM' and '>>=' would traverse the entire stream in order to apply the substitution at the leaves.
  It is well-known for free monads that this representation can improve performance, and the test results demonstrate this for GHC as well.
  The operation mapAccumL is not used in the compiler and can't be implemented efficiently because it requires destroying and rebuilding the stream.
  I removed one use of mapAccumL_ which has similar problems but the other use was difficult to remove. In the future it may be worth exploring whether the 'Stream' encoding could be modified further to capture the mapAccumL pattern, and likewise defer the passing of accumulation parameter until the stream is finally consumed.
  The >>= operation for 'Stream' was a hot-spot in the ticky profile for the "ManyConstructors" test which called the 'cg' function many times in "StgToCmm.hs"
  Metric Decrease: ManyConstructors
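The idea in the commit above can be sketched in a few lines. This is a simplified, pure model and not GHC's actual `Stream` type: the names `StreamS`, `Stream`, `yield` and `collect` are assumptions chosen for illustration, and the real type also carries an underlying monad. The point it shows is that `>>=` only composes continuations, and the concrete structure is built once when the stream is consumed.

```haskell
{-# LANGUAGE RankNTypes #-}

-- Concrete stream: a sequence of yielded elements ending in a result.
data StreamS a r = Yield a (StreamS a r) | Done r

-- Continuation-passing wrapper (hypothetical, simplified).
-- A bind composes continuations instead of traversing a built structure.
newtype Stream a b = Stream
  { runStream :: forall r. (b -> StreamS a r) -> StreamS a r }

instance Functor (Stream a) where
  fmap f (Stream g) = Stream (\k -> g (k . f))

instance Applicative (Stream a) where
  pure x = Stream (\k -> k x)
  Stream f <*> Stream g = Stream (\k -> f (\h -> g (k . h)))

instance Monad (Stream a) where
  Stream g >>= f = Stream (\k -> g (\x -> runStream (f x) k))

-- Emit one element.
yield :: a -> Stream a ()
yield a = Stream (\k -> Yield a (k ()))

-- Consume ("destroy") the stream once, collecting elements and the result.
collect :: Stream a b -> ([a], b)
collect s = go (runStream s Done)
  where
    go (Done r) = ([], r)
    go (Yield a rest) = let (as, r) = go rest in (a : as, r)

example :: ([Int], String)
example = collect (mapM_ yield [1, 2, 3] >> pure "done")
```

Here `example` threads three binds through `mapM_`, yet the `StreamS` value is only materialised inside `collect`.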
* Make CmmType field of LocalReg strict | Matthew Pickering | 2021-02-22 | 1 | -1/+1
  This was observed to build up thunks which were forced by using a `-hi` profile and T3294 as a test.
* Make Width field in CmmType strict | Matthew Pickering | 2021-02-22 | 1 | -1/+1
  This value is eventually forced so don't build up thunks. Observed with T3294 and -hi profile.
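A minimal sketch of what a strict field changes, using stand-in types rather than GHC's real `CmmType` and `LocalReg` definitions (`LazyType`, `StrictType` and `throwsWhenForced` are hypothetical names). With a bang on the field, the field is evaluated when the constructor is, so no thunk can be stored in it.

```haskell
import Control.Exception (SomeException, evaluate, try)

data Width = W8 | W16 | W32 | W64
  deriving (Eq, Show)

-- Lazy field: the constructor can hold an unevaluated thunk.
data LazyType = LazyType Width

-- Strict field: the Width is forced whenever the constructor is forced.
data StrictType = StrictType !Width

-- Report whether evaluating a value to weak head normal form throws.
throwsWhenForced :: a -> IO Bool
throwsWhenForced x = do
  r <- try (evaluate (x `seq` ()))
  pure (case r of
          Left e -> const True (e :: SomeException)
          Right () -> False)
```

Calling `throwsWhenForced (LazyType undefined)` yields `False` because the bottom stays hidden in a thunk, while `throwsWhenForced (StrictType undefined)` yields `True`.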
* Force gcp in assignArgumentsPos | Matthew Pickering | 2021-02-22 | 1 | -2/+2
  I observed this accumulating in the T3294 test only to be eventually forced (by a -hi profile). As it is only one word big, forcing it saves quite a bit of allocation.
* Refactor Logger | Sylvain Henry | 2021-02-13 | 2 | -32/+31
  Before this patch, the only way to override GHC's default logging behavior was to set `log_action`, `dump_action` and `trace_action` fields in DynFlags. This patch introduces a new Logger abstraction and stores it in HscEnv instead.
  This is part of #17957 (avoid storing state in DynFlags). DynFlags are duplicated and updated per-module (because of OPTIONS_GHC pragma), so we shouldn't store global state in them.
  This patch also fixes a race in parallel "--make" mode which updated the `generatedDumps` IORef concurrently.
  Bump haddock submodule
  The increase in MultiLayerModules is tracked in #19293.
  Metric Increase: MultiLayerModules
* Fix typos | Brian Wignall | 2021-02-06 | 2 | -2/+2
* Add explicit import lists to Data.List imports | Oleg Grenrus | 2021-01-29 | 1 | -1/+1
  Related to a future change in Data.List, https://downloads.haskell.org/ghc/8.10.3/docs/html/users_guide/using-warnings.html?highlight=wcompat#ghc-flag--Wcompat-unqualified-imports
  Companion pull&merge requests:
  - https://github.com/judah/haskeline/pull/153
  - https://github.com/haskell/containers/pull/762
  - https://gitlab.haskell.org/ghc/packages/hpc/-/merge_requests/9
  After these the actual change in Data.List should be easy to do.
* C-- shift amount is always native size, not shiftee size | John Ericson | 2021-01-22 | 1 | -2/+2
  This isn't a bug yet, because we only shift native-sized types, but I hope to change that.
* Rename parser Error and Warning types | Alfredo Di Napoli | 2020-12-18 | 3 | -9/+9
  This commit renames the parser's Error and Warning types (and their constructors) to have a 'Ps' prefix, so that they play nicely when more errors and warnings for other phases of the pipeline are added. This makes it more explicit which particular type of error or warning we are dealing with, and is more informative for users reading the generated Haddock.
* Move Unit related fields from DynFlags to HscEnv | Sylvain Henry | 2020-12-14 | 3 | -15/+14
  The unit database cache, the home unit and the unit state were stored in DynFlags while they ought to be stored in the compiler session state (HscEnv). This patch fixes this.
  It introduces a new UnitEnv type that should be used in the future to handle separate unit environments (especially host vs target units).
  Related to #17957
  Bump haddock submodule
* GHC.Cmm.Opt: Be stricter in results. | Andreas Klebinger | 2020-12-08 | 1 | -51/+51
  Optimization either returns Nothing if nothing is to be done or `Just <cmmExpr>` otherwise. There is no point in being lazy in `cmmExpr`. We usually inspect this element so the thunk gets forced not long after.
  We might eliminate it as dead code once in a blue moon but that's not a case worth optimizing for.
  Overall the impact of this is rather low, as Cmm.Opt doesn't allocate much (compared to the rest of GHC) to begin with.
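The pattern described above, sketched on a toy expression type. `Expr` and `foldAdd` are hypothetical stand-ins for the real Cmm types and optimisation functions; the sketch only shows how `$!` together with a strict field makes the `Just` payload evaluated at the point the result is produced.

```haskell
-- Toy expression type (an assumption, not GHC's CmmExpr).
-- The strict field means building a Lit also evaluates its Int.
data Expr = Lit !Int | Add Expr Expr
  deriving (Eq, Show)

-- Returns Nothing when there is nothing to do, otherwise Just the
-- rewritten expression, evaluated before Just is returned.
foldAdd :: Expr -> Maybe Expr
foldAdd (Add (Lit a) (Lit b)) = Just $! Lit (a + b)
foldAdd _ = Nothing
```

For instance, `foldAdd (Add (Lit 1) (Lit 2))` gives `Just (Lit 3)` with the sum already computed, and `foldAdd (Lit 1)` gives `Nothing`.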
* Cmm.Sink: Optimize retaining of assignments, live sets. | Andreas Klebinger | 2020-12-08 | 3 | -52/+169
  Sinking requires us to track live local regs after each cmm statement. We used to do this via "Set LocalReg". However we can replace this with a solution based on IntSet which is overall more efficient without losing much. The thing we lose is width of the variables, which isn't used by the sinking pass anyway.
  I also reworked how we keep assignments to regs mentioned in skipped assignments. I put the details into Note [Keeping assignemnts mentioned in skipped RHSs]. The gist of it is instead of keeping track of it via the use count which is a `IntMap Int` we now use the live regs set (IntSet) which is quite a bit faster.
  I think it also matches the semantics a lot better. The skipped (not discarded) assignment does in fact keep the regs on its rhs alive so keeping track of this in the live set seems like the clearer solution as well.
  Improves allocations for T3294 by yet another 1%.
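A rough sketch of the data-structure swap, assuming a register is identified by an integer key plus a width. The `LocalReg` type here is a simplified stand-in, and `regKey`, `insertLive`, `isLive` and `noneLive` are hypothetical helpers, not the functions in GHC.Cmm.Sink. Keying the set by the integer alone is what drops the width.

```haskell
import qualified Data.IntSet as IntSet

data Width = W32 | W64
  deriving (Eq, Show)

-- Simplified register: an integer key and a width (assumption).
data LocalReg = LocalReg Int Width

-- The live set stores only keys, so widths are not tracked.
type LiveSet = IntSet.IntSet

regKey :: LocalReg -> Int
regKey (LocalReg key _) = key

noneLive :: LiveSet
noneLive = IntSet.empty

insertLive :: LocalReg -> LiveSet -> LiveSet
insertLive r = IntSet.insert (regKey r)

isLive :: LocalReg -> LiveSet -> Bool
isLive r = IntSet.member (regKey r)
```

After `insertLive (LocalReg 1 W32) noneLive`, a query for `LocalReg 1 W64` also reports live, which is acceptable only because the pass never looks at the width.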
* Cmm: Make a few types and utility function slightly stricter. | Andreas Klebinger | 2020-12-08 | 2 | -9/+11
  About 0.6% reduction in allocations for the code I was looking at.
  Not a huge difference but no need to throw away performance.
* CmmSink: Force inlining of foldRegsDefd | Andreas Klebinger | 2020-12-08 | 1 | -6/+45
  Helps avoid allocating the folding function. Improves perf for T3294 by about 1%.
* CodeGen: Make folds User/DefinerOfRegs INLINEABLE. | Andreas Klebinger | 2020-12-08 | 2 | -0/+7
  Reduces allocation for the test case I was looking at by about 1.2%.
  Mostly from avoiding allocation of some folding functions which turn into let-no-escape bindings which just reuse their environment instead.
  We also force inlining in a few key places in CmmSink which helps a bit more.
* Remove flattening variables | Richard Eisenberg | 2020-12-01 | 1 | -0/+1
  This patch redesigns the flattener to simplify type family applications directly instead of using flattening meta-variables and skolems. The key new innovation is the CanEqLHS type and the new CEqCan constraint (Ct). A CanEqLHS is either a type variable or exactly-saturated type family application; either can now be rewritten using a CEqCan constraint in the inert set.
  Because the flattener no longer reduces all type family applications to variables, there was some performance degradation if a lengthy type family application is now flattened over and over (not making progress). To compensate, this patch contains some extra optimizations in the flattener, leading to a number of performance improvements.
  Close #18875. Close #18910.
  There are many extra parts of the compiler that had to be affected in writing this patch:
  * The family-application cache (formerly the flat-cache) sometimes stores coercions built from Given inerts. When these inerts get kicked out, we must kick out from the cache as well. (This was, I believe, true previously, but somehow never caused trouble.) Kicking out from the cache requires adding a filterTM function to TrieMap.
  * This patch obviates the need to distinguish "blocking" coercion holes from non-blocking ones (which, previously, arose from CFunEqCans). There is thus some simplification around coercion holes.
  * Extra commentary throughout parts of the code I read through, to preserve the knowledge I gained while working.
  * A change in the pure unifier around unifying skolems with other types. Unifying a skolem now leads to SurelyApart, not MaybeApart, as documented in Note [Binding when looking up instances] in GHC.Core.InstEnv.
  * Some more use of MCoercion where appropriate.
  * Previously, class-instance lookup automatically noticed that e.g. C Int was a "unifier" to a target [W] C (F Bool), because the F Bool was flattened to a variable. Now, a little more care must be taken around checking for unifying instances.
  * Previously, tcSplitTyConApp_maybe would split (Eq a => a). This is silly, because (=>) is not a tycon in Haskell. Fixed now, but there are some knock-on changes in e.g. TrieMap code and in the canonicaliser.
  * New function anyFreeVarsOf{Type,Co} to check whether a free variable satisfies a certain predicate.
  * Type synonyms now remember whether or not they are "forgetful"; a forgetful synonym drops at least one argument. This is useful when flattening; see flattenView.
  * The pattern-match completeness checker invokes the solver. This invocation might need to look through newtypes when checking representational equality. Thus, the desugarer needs to keep track of the in-scope variables to know what newtype constructors are in scope. I bet this bug was around before but never noticed.
  * Extra-constraints wildcards are no longer simplified before printing. See Note [Do not simplify ConstraintHoles] in GHC.Tc.Solver.
  * Whether or not there are Given equalities has become slightly subtler. See the new HasGivenEqs datatype.
  * Note [Type variable cycles in Givens] in GHC.Tc.Solver.Canonical explains a significant new wrinkle in the new approach.
  * See Note [What might match later?] in GHC.Tc.Solver.Interact, which explains the fix to #18910.
  * The inert_count field of InertCans wasn't actually used, so I removed it.
  Though I (Richard) did the implementation, Simon PJ was very involved in design and review.
  This updates the Haddock submodule to avoid #18932 by adding a type signature.
  -------------------------
  Metric Decrease: T12227 T5030 T9872a T9872b T9872c
  Metric Increase: T9872d
  -------------------------
* Move core flattening algorithm to Core.Unify | Richard Eisenberg | 2020-12-01 | 1 | -1/+1
  This sets the stage for a later change, where this algorithm will be needed from GHC.Core.InstEnv.
  This commit also splits GHC.Core.Map into GHC.Core.Map.Type and GHC.Core.Map.Expr, in order to avoid module import cycles with GHC.Core.
* Small optimization to CmmSink. | Andreas Klebinger | 2020-11-28 | 1 | -4/+11
  Inside `regsUsedIn` we can avoid some thunks by specializing the recursion. In particular we avoid the thunk for `(f e z)` in the MachOp/Load branches, where we know this will evaluate to z.
  Reduces allocations for T3294 by ~1%.
* [Sized Cmm] properly retain sizes. | Moritz Angermann | 2020-11-26 | 1 | -0/+11
  This replaces all Word<N> = W<N># Word# and Int<N> = I<N># Int# with Word<N> = W<N># Word<N># and Int<N> = I<N># Int<N>#, thus providing us with properly sized primitives in the code generator instead of pretending they are all full machine words.
  This came up when implementing darwinpcs for arm64. The darwinpcs requires us to pack function arguments in excess of registers on the stack. While most procedure call standards (pcs) assume arguments are just passed in 8 byte slots, and thus the caller does not need to know the exact signature to make the call, darwinpcs requires us to adhere to the prototype, and thus have the correct sizes. If we specify CInt in the FFI call, it should correspond to the C int, and not just be Word sized, when it's only half the size.
  This does change the expected output of T16402 but the new result is no less correct as it eliminates the narrowing (instead of the `and` as was previously done).
  Bumps the array, bytestring, text, and binary submodules.
  Co-Authored-By: Ben Gamari <ben@well-typed.com>
  Metric Increase: T13701 T14697
* nativeGen/dwarf: Fix procedure end addresses | Ben Gamari | 2020-11-15 | 1 | -0/+5
  Previously the `.debug_aranges` and `.debug_info` (DIE) DWARF information would claim that procedures (represented with a `DW_TAG_subprogram` DIE) would only span the range covered by their entry block. This omitted all of the continuation blocks (represented by `DW_TAG_lexical_block` DIEs), confusing `perf`. Fix this by introducing an end-of-procedure label and using this as the `DW_AT_high_pc` of procedure `DW_TAG_subprogram` DIEs.
  Fixes #17605.
* codeGen: Produce local symbols for module-internal functions | Ben Gamari | 2020-11-11 | 1 | -0/+34
  It turns out that some important native debugging/profiling tools (e.g. perf) rely only on symbol tables for function name resolution (as opposed to using DWARF DIEs). However, previously GHC would emit temporary symbols (e.g. `.La42b`) to identify module-internal entities. Such symbols are dropped during linking and therefore not visible to runtime tools (in addition to having rather unhelpful unique names). For instance, `perf report` would often end up attributing all cost to the libc `frame_dummy` symbol since Haskell code was not covered by any proper symbol (see #17605).
  We now rather follow the model of C compilers and emit descriptively-named local symbols for module internal things. Since this will increase object file size this behavior can be disabled with the `-fno-expose-internal-symbols` flag.
  With this `perf record` can finally be used against Haskell executables. Even more, with `-g3` `perf annotate` provides inline source code.
* Move this_module into NCGConfig | Ben Gamari | 2020-11-11 | 2 | -6/+6
  In various places in the NCG we need the Module currently being compiled. Let's move this into the environment instead of chewing through another register.
* Add the proper HLint rules and remove redundant keywords from compiler | Hécate | 2020-11-01 | 4 | -263/+262
* Split GHC.Driver.Types | Sylvain Henry | 2020-10-29 | 1 | -1/+1
  I was working on making DynFlags stateless (#17957), especially by storing loaded plugins into HscEnv instead of DynFlags. It turned out to be complicated because HscEnv is in GHC.Driver.Types but LoadedPlugin isn't: it is in GHC.Driver.Plugins which depends on GHC.Driver.Types. I didn't feel like introducing yet another hs-boot file to break the loop.
  Additionally I remember that while we introduced the module hierarchy (#13009) we talked about splitting GHC.Driver.Types because it contained various unrelated types and functions, but we never executed. I didn't feel like making GHC.Driver.Types bigger with more unrelated Plugins related types, so finally I bit the bullet and split GHC.Driver.Types.
  As a consequence this patch moves a lot of things. I've tried to put them into appropriate modules but nothing is set in stone.
  Several other things moved to avoid loops.
  * Removed Binary instances from GHC.Utils.Binary for random compiler things
  * Moved Typeable Binary instances into GHC.Utils.Binary.Typeable: they import a lot of things that users of GHC.Utils.Binary don't want to depend on.
  * put everything related to Units/Modules under GHC.Unit: GHC.Unit.Finder, GHC.Unit.Module.{ModGuts,ModIface,Deps,etc.}
  * Created several modules under GHC.Types: GHC.Types.Fixity, SourceText, etc.
  * Split GHC.Utils.Error (into GHC.Types.Error)
  * Finally removed GHC.Driver.Types
  Note that this patch doesn't put loaded plugins into HscEnv. It's left for another patch.
  Bump haddock submodule
* cmm: Add Note reference to ForeignHint | Ben Gamari | 2020-10-23 | 1 | -0/+2
* Remove pdocPrec | Sylvain Henry | 2020-10-19 | 1 | -12/+17
  pdocPrec was only used in GHC.Cmm.DebugBlock.pprUnwindExpr, so remove it. OutputableP becomes a one-function class which might be better for performance.
* Implement -Woperator-whitespace (#18834) | Vladislav Zavialov | 2020-10-19 | 1 | -2/+2
  This patch implements two related warnings:
  -Woperator-whitespace-ext-conflict
      warns on uses of infix operators that would be parsed differently were a particular GHC extension enabled
  -Woperator-whitespace
      warns on prefix, suffix, and tight infix uses of infix operators
  Updates submodules: haddock, containers.
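To make the three spellings named above concrete, here is a small illustration. The function names are hypothetical; the definitions all compute the same sum and differ only in the whitespace around `+`, which is the property the commit says `-Woperator-whitespace` reports on.

```haskell
-- Whitespace on both sides: the conventional "loose infix" spelling.
looseInfix :: Int -> Int -> Int
looseInfix x y = x + y

-- No whitespace on either side: a tight infix occurrence.
tightInfix :: Int -> Int -> Int
tightInfix x y = x+y

-- Whitespace only before the operator: a prefix occurrence.
prefixUse :: Int -> Int -> Int
prefixUse x y = x +y

-- Whitespace only after the operator: a suffix occurrence.
suffixUse :: Int -> Int -> Int
suffixUse x y = x+ y
```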
* Parser: don't require the HomeUnitId | Sylvain Henry | 2020-10-13 | 3 | -82/+109
  The HomeUnitId is only used by the Cmm parser and this one has access to the DynFlags, so it can grab the UnitId of the HomeUnit from them.
  Bump haddock submodule
* Lint the compiler for extraneous LANGUAGE pragmas | Hécate | 2020-10-10 | 5 | -15/+7
* Use UnitId in the backend instead of Unit | Sylvain Henry | 2020-10-09 | 1 | -6/+6
  In Cmm we can only have real units identified with a UnitId. Other units (on-the-fly instantiated units and holes) are only used in type-checking backpack sessions that don't produce Cmm.
* Don't import GHC.Unit to reduce the number of dependencies | Sylvain Henry | 2020-10-01 | 1 | -1/+1
* Use ADTs for parser errors/warnings | Sylvain Henry | 2020-10-01 | 3 | -18/+23
  Haskell and Cmm parsers/lexers now report errors and warnings using ADTs defined in GHC.Parser.Errors. They can be printed using functions in GHC.Parser.Errors.Ppr.
  Some of the errors provide hints with a separate ADT (e.g. to suggest to turn on some extension). For now, however, hints are not consistent across all messages. For example some errors contain the hints in the main message. I didn't want to change any message with this patch. I expect these changes to be discussed and implemented later.
  Surprisingly, this patch enhances performance. On CI (x86_64/deb9/hadrian, ghc/alloc):
      parsing001         -11.5%
      T13719             -2.7%
      MultiLayerModules  -3.5%
      Naperian           -3.1%
  Bump haddock submodule
  Metric Decrease: MultiLayerModules Naperian T13719 parsing001
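The split between an error ADT, a separate hint ADT, and a printing function can be sketched as follows. All type and constructor names here (`ParseErr`, `Hint`, `pprParseErr`, and so on) are invented for illustration and do not reproduce the contents of GHC.Parser.Errors or GHC.Parser.Errors.Ppr.

```haskell
-- Hypothetical hint type, kept separate from the error itself.
data Hint = SuggestExtension String
  deriving (Eq, Show)

-- Hypothetical error type: structured data instead of a rendered message.
data ParseErr
  = UnexpectedToken String
  | MissingExtension String Hint
  deriving (Eq, Show)

-- Rendering lives apart from the data, so callers can also inspect errors.
pprParseErr :: ParseErr -> String
pprParseErr (UnexpectedToken tok) = "parse error on input " ++ show tok
pprParseErr (MissingExtension what hint) =
  what ++ " is not enabled\n" ++ pprHint hint

pprHint :: Hint -> String
pprHint (SuggestExtension ext) = "Perhaps you intended to use " ++ ext
```

A tool consuming these errors can pattern match on `MissingExtension` directly instead of parsing message text.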
* Make the parser module less dependent on DynFlags | Sylvain Henry | 2020-09-29 | 1 | -1/+3
  Bump haddock submodule
* Refactor CLabel pretty-printing | Sylvain Henry | 2020-09-23 | 1 | -192/+167
  * Don't depend on the selected backend to know if we print Asm or C labels: we already have PprStyle to determine this. Moreover even when a native backend is used (NCG, LLVM) we may want to generate C headers containing pretty-printed labels, so it wasn't a good predicate anyway.
  * Make pretty-printing code clearer and avoid partiality
* Generalize OutputableP | Sylvain Henry | 2020-09-17 | 8 | -62/+97
  Add a type parameter for the environment required by OutputableP. It avoids tying Platform with OutputableP.
* Introduce OutputableP | Sylvain Henry | 2020-09-17 | 12 | -222/+232
  Some types need a Platform value to be pretty-printed: CLabel, Cmm types, instructions, etc. Before this patch they had an Outputable instance and the Platform value was obtained via sdocWithDynFlags. It meant that the *renderer* of the SDoc was responsible for passing the appropriate Platform value (e.g. via the DynFlags given to showSDoc). It put the burden of passing the Platform value on the renderer while the generator of the SDoc knows the Platform it is generating the SDoc for and there is no point passing a different Platform at rendering time.
  With this patch, we introduce a new OutputableP class:
      class OutputableP a where
          pdoc :: Platform -> a -> SDoc
  With this class we still have some polymorphism as we have with `ppr` (i.e. we can use `pdoc` on a variety of types instead of having a dedicated `pprXXX` function for each XXX type).
  One step closer to removing `sdocWithDynFlags` (#10143) and supporting several platforms (#14335).
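A self-contained sketch of the class quoted above. Only the class name and the `pdoc` signature come from the commit message; the `Platform` record, the `String` stand-in for `SDoc`, the `CLabel` newtype and the leading-underscore rule are simplifying assumptions so that the example runs on its own.

```haskell
import Data.List (intercalate)

-- Stand-in platform description (assumption): only records whether
-- symbol names get a leading underscore.
newtype Platform = Platform { leadingUnderscore :: Bool }

-- Stand-in for GHC's SDoc.
type SDoc = String

class OutputableP a where
  pdoc :: Platform -> a -> SDoc

-- Simplified label type (assumption).
newtype CLabel = CLabel String

-- The producer of the document supplies the platform explicitly.
instance OutputableP CLabel where
  pdoc platform (CLabel name)
    | leadingUnderscore platform = '_' : name
    | otherwise = name

-- Polymorphism similar to `ppr`: lists of printable things are printable.
instance OutputableP a => OutputableP [a] where
  pdoc platform xs = intercalate ", " (map (pdoc platform) xs)
```

With this, `pdoc (Platform True) [CLabel "foo", CLabel "bar"]` renders as `_foo, _bar` without the renderer having to know anything about platforms.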
* DynFlags: don't pass DynFlags to cmmImplementSwitchPlans | Sylvain Henry | 2020-09-04 | 2 | -6/+6
* DynFlags: use Platform in foldRegs* | Sylvain Henry | 2020-09-04 | 8 | -151/+137
* Don't rely on CLabel's Outputable instance in CmmToC | Sylvain Henry | 2020-09-04 | 1 | -11/+12
  This is in preparation of the removal of sdocWithDynFlags (#10143), hence of the refactoring of CLabel's Outputable instance.
* Remove "Ord FastString" instance | Sylvain Henry | 2020-09-01 | 1 | -10/+8
  FastStrings can be compared in 2 ways: by Unique or lexically. We don't want to bless one particular way with an "Ord" instance because it leads to bugs (#18562) or to suboptimal code (e.g. using lexical comparison while a Unique comparison would suffice).
  UTF-8 encoding has the advantage that sorting strings by their encoded bytes also sorts them by their Unicode code points, without having to decode the actual code points. BUT GHC uses Modified UTF-8 which diverges from UTF-8 by encoding \0 as 0xC080 instead of 0x00 (to avoid null bytes in the middle of a String so that the string can still be null-terminated). This patch adds a new `utf8CompareShortByteString` function that performs sorting by bytes but that also takes Modified UTF-8 into account. It is much more performant than decoding the strings into [Char] to perform comparisons (which we did in the previous patch).
  Bump haddock submodule
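The ordering problem described above can be shown on plain byte lists. This is not the `utf8CompareShortByteString` implementation from the patch; `normalise` and `compareModifiedUtf8` are hypothetical helpers that rewrite the two-byte encoding of \0 back to a single zero byte and then compare bytewise, which is enough to demonstrate why a naive byte comparison gives the wrong answer.

```haskell
import Data.Word (Word8)

-- Map the Modified UTF-8 encoding of NUL (0xC0 0x80) back to 0x00.
-- In valid input, 0xC0 can only appear as part of that encoding.
normalise :: [Word8] -> [Word8]
normalise (0xC0 : 0x80 : rest) = 0x00 : normalise rest
normalise (b : rest) = b : normalise rest
normalise [] = []

-- Compare two Modified UTF-8 byte sequences in code point order.
compareModifiedUtf8 :: [Word8] -> [Word8] -> Ordering
compareModifiedUtf8 a b = compare (normalise a) (normalise b)
```

A naive `compare [0xC0, 0x80] [0x01]` returns `GT`, placing the string "\0" after "\1", whereas `compareModifiedUtf8 [0xC0, 0x80] [0x01]` returns `LT`.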
* Import qualified Prelude in Cmm/Parser.y | Vladislav Zavialov | 2020-08-21 | 1 | -0/+1
  In preparation for the next version of 'happy', c95920 added a qualified import to GHC/Parser.y but for some reason neglected GHC/Cmm/Parser.y
  This patch adds the missing qualified import to GHC/Cmm/Parser.y and also adds a clarifying comment to explain why this import is needed.
* Put CFG weights into their own module (#17957) | Sylvain Henry | 2020-08-21 | 1 | -2/+2
  It avoids having to query DynFlags to get them
* PmCheck: Better long-distance info for where bindings (#18533) | Sebastian Graf | 2020-08-13 | 1 | -0/+3
  Where bindings can see evidence from the pattern match of the `GRHSs` they belong to, but not from anything in any of the guards (which belong to one of possibly many RHSs).
  Before this patch, we did *not* consider said evidence, causing #18533, where the lack of considering type information from a case pattern match leads to failure to resolve the vanilla COMPLETE set of a data type.
  Making available that information required a medium amount of refactoring so that `checkMatches` can return a `[(Deltas, NonEmpty Deltas)]`; one `(Deltas, NonEmpty Deltas)` for each `GRHSs` of the match group. The first component of the pair is the covered set of the pattern, the second component is one covered set per RHS.
  Fixes #18533.
  Regression test case: T18533
* DynFlags: disentangle Outputable | Sylvain Henry | 2020-08-12 | 16 | -4/+19
  - put panic related functions into GHC.Utils.Panic
  - put trace related functions using DynFlags in GHC.Driver.Ppr
  One step closer to making Outputable fully independent of DynFlags.
  Bump haddock submodule
* nativeGen: One approach to fix #18527 | Ben Gamari | 2020-08-07 | 1 | -0/+3
  Previously the code generator could produce corrupt C call sequences due to register overlap between MachOp lowerings and the platform's calling convention. We fix this using a hack described in Note [Evaluate C-call arguments before placing in destination registers].
* CmmLint: Check foreign call argument register invariant | Ben Gamari | 2020-08-07 | 1 | -5/+35
  As mentioned in Note [Register parameter passing] the arguments of foreign calls cannot refer to caller-saved registers.