summaryrefslogtreecommitdiff
path: root/compiler/GHC/StgToCmm/DataCon.hs
Commit message (Collapse)AuthorAgeFilesLines
* Cmm: track the type of global registerssheaf2023-01-311-3/+4
| | | | | | | | | | | | This patch tracks the type of Cmm global registers. This is needed in order to lint uses of polymorphic registers, such as SIMD vector registers that can be used both for floating-point and integer values. This changes allows us to refactor VanillaReg to not store VGcPtr, as that information is instead stored in the type of the usage of the register. Fixes #22297
* Improve -dtag-inference-checks checks.Andreas Klebinger2022-03-111-1/+2
| | | | | FUN closures don't get tagged when evaluated. So no point in checking their tags.
* Tag inference work.Andreas Klebinger2022-02-121-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This does three major things: * Enforce the invariant that all strict fields must contain tagged pointers. * Try to predict the tag on bindings in order to omit tag checks. * Allows functions to pass arguments unlifted (call-by-value). The former is "simply" achieved by wrapping any constructor allocations with a case which will evaluate the respective strict bindings. The prediction is done by a new data flow analysis based on the STG representation of a program. This also helps us to avoid generating redudant cases for the above invariant. StrictWorkers are created by W/W directly and SpecConstr indirectly. See the Note [Strict Worker Ids] Other minor changes: * Add StgUtil module containing a few functions needed by, but not specific to the tag analysis. ------------------------- Metric Decrease: T12545 T18698b T18140 T18923 LargeRecord Metric Increase: LargeRecord ManyAlternatives ManyConstructors T10421 T12425 T12707 T13035 T13056 T13253 T13253-spj T13379 T15164 T18282 T18304 T18698a T1969 T20049 T3294 T4801 T5321FD T5321Fun T783 T9233 T9675 T9961 T19695 WWRec -------------------------
* Fix some notesMatthew Pickering2022-02-081-1/+1
|
* StgToCmm: decouple DynFlags, add StgToCmmConfigdoyougnu2022-01-311-32/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | StgToCmm: add Config, remove CgInfoDownwards StgToCmm: runC api change to take StgToCmmConfig StgToCmm: CgInfoDownad -> StgToCmmConfig StgToCmm.Monad: update getters/setters/withers StgToCmm: remove CallOpts in StgToCmm.Closure StgToCmm: remove dynflag references StgToCmm: PtrOpts removed StgToCmm: add TMap to config, Prof - dynflags StgToCmm: add omit yields to config StgToCmm.ExtCode: remove redundant import StgToCmm.Heap: remove references to dynflags StgToCmm: codeGen api change, DynFlags -> Config StgToCmm: remove dynflags in Env and StgToCmm StgToCmm.DataCon: remove dynflags references StgToCmm: remove dynflag references in DataCon StgToCmm: add backend avx flags to config StgToCmm.Prim: remove dynflag references StgToCmm.Expr: remove dynflag references StgToCmm.Bind: remove references to dynflags StgToCmm: move DoAlignSanitisation to Cmm.Type StgToCmm: remove PtrOpts in Cmm.Parser.y DynFlags: update ipInitCode api StgToCmm: Config Module is single source of truth StgToCmm: Lazy config breaks IORef deadlock testsuite: bump countdeps threshold StgToCmm.Config: strictify fields except UpdFrame Strictifying UpdFrameOffset causes the RTS build with stage1 to deadlock. Additionally, before the deadlock performance of the RTS is noticeably slower. StgToCmm.Config: add field descriptions StgToCmm: revert strictify on Module in config testsuite: update CountDeps tests StgToCmm: update comment, fix exports Specifically update comment about loopification passed into dynflags then stored into stgToCmmConfig. And remove getDynFlags from Monad.hs exports Types.Name: add pprFullName function StgToCmm.Ticky: use pprFullname, fixup ExtCode imports Cmm.Info: revert cmmGetClosureType removal StgToCmm.Bind: use pprFullName, Config update comments StgToCmm: update closureDescription api StgToCmm: SAT altHeapCheck StgToCmm: default render for Info table, ticky Use default rendering contexts for info table and ticky ticky, which should be independent of command line input. testsuite: bump count deps pprFullName: flag for ticky vs normal style output convertInfoProvMap: remove unused parameter StgToCmm.Config: add backend flags to config StgToCmm.Config: remove Backend from Config StgToCmm.Prim: refactor Backend call sites StgToCmm.Prim: remove redundant imports StgToCmm.Config: refactor vec compatibility check StgToCmm.Config: add allowQuotRem2 flag StgToCmm.Ticky: print internal names with parens StgToCmm.Bind: dispatch ppr based on externality StgToCmm: Add pprTickyname, Fix ticky naming Accidently removed the ctx for ticky SDoc output. The only relevant flag is sdocPprDebug which was accidental set to False due to using defaultSDocContext without altering the flag. StgToCmm: remove stateful fields in config fixup: config: remove redundant imports StgToCmm: move Sequel type to its own module StgToCmm: proliferate getCallMethod updated api StgToCmm.Monad: add FCodeState to Monad Api StgToCmm: add second reader monad to FCode fixup: Prim.hs: missed a merge conflict fixup: Match countDeps tests to HEAD StgToCmm.Monad: withState -> withCgState To disambiguate it from mtl withState. This withState shouldn't be returning the new state as a value. However, fixing this means tackling the knot tying in CgState and so is very difficult since it changes when the thunk of the knot is forced which either leads to deadlock or to compiler panic.
* Remove useless {-# LANGUAGE CPP #-} pragmasSylvain Henry2021-05-121-1/+1
|
* Fully remove HsVersions.hSylvain Henry2021-05-121-2/+0
| | | | | | | | | | Replace uses of WARN macro with calls to: warnPprTrace :: Bool -> SDoc -> a -> a Remove the now unused HsVersions.h Bump haddock submodule
* Replace CPP assertions with Haskell functionsSylvain Henry2021-05-121-4/+5
| | | | | | | | | | | | | | | There is no reason to use CPP. __LINE__ and __FILE__ macros are now better replaced with GHC's CallStack. As a bonus, assert error messages now contain more information (function name, column). Here is the mapping table (HasCallStack omitted): * ASSERT: assert :: Bool -> a -> a * MASSERT: massert :: Bool -> m () * ASSERTM: assertM :: m Bool -> m () * ASSERT2: assertPpr :: Bool -> SDoc -> a -> a * MASSERT2: massertPpr :: Bool -> SDoc -> m () * ASSERTM2: assertPprM :: m Bool -> SDoc -> m ()
* Add compiler linting to CIHécate2021-03-251-1/+0
| | | | | This commit adds the `lint:compiler` Hadrian target to the CI runner. It does also fixes hints in the compiler/ and libraries/base/ codebases.
* Add option to give each usage of a data constructor its own info tableMatthew Pickering2021-03-031-10/+24
| | | | | | | | | | | | | The `-fdistinct-constructor-tables` flag will generate a fresh info table for the usage of any data constructor. This is useful for debugging as now by inspecting the info table, you can determine which usage of a constructor caused that allocation rather than the old situation where the info table always mapped to the definition site of the data constructor which is useless. In conjunction with `-hi` and `-finfo-table-map` this gives a more fine grained understanding of where constructor allocations arise from in a program.
* [Sized Cmm] properly retain sizes.Moritz Angermann2020-11-261-1/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | This replaces all Word<N> = W<N># Word# and Int<N> = I<N># Int# with Word<N> = W<N># Word<N># and Int<N> = I<N># Int<N>#, thus providing us with properly sized primitives in the codegenerator instead of pretending they are all full machine words. This came up when implementing darwinpcs for arm64. The darwinpcs reqires us to pack function argugments in excess of registers on the stack. While most procedure call standards (pcs) assume arguments are just passed in 8 byte slots; and thus the caller does not know the exact signature to make the call, darwinpcs requires us to adhere to the prototype, and thus have the correct sizes. If we specify CInt in the FFI call, it should correspond to the C int, and not just be Word sized, when it's only half the size. This does change the expected output of T16402 but the new result is no less correct as it eliminates the narrowing (instead of the `and` as was previously done). Bumps the array, bytestring, text, and binary submodules. Co-Authored-By: Ben Gamari <ben@well-typed.com> Metric Increase: T13701 T14697
* Introduce isBoxedTupleDataCon and use it to fix #18644Ryan Scott2020-09-041-1/+1
| | | | | | | | | | | | | | | | | | | | The code that converts promoted tuple data constructors to `IfaceType`s in `GHC.CoreToIface` was using `isTupleDataCon`, which conflates boxed and unboxed tuple data constructors. To avoid this, this patch introduces `isBoxedTupleDataCon`, which is like `isTupleDataCon` but only works for _boxed_ tuple data constructors. While I was in town, I was horribly confused by the fact that there were separate functions named `isUnboxedTupleCon` and `isUnboxedTupleTyCon` (similarly, `isUnboxedSumCon` and `isUnboxedSumTyCon`). It turns out that the former only works for data constructors, despite its very general name! I opted to rename `isUnboxedTupleCon` to `isUnboxedTupleDataCon` (similarly, I renamed `isUnboxedSumCon` to `isUnboxedSumDataCon`) to avoid this potential confusion, as well as to be more consistent with the naming convention I used for `isBoxedTupleDataCon`. Fixes #18644.
* DynFlags: disentangle OutputableSylvain Henry2020-08-121-1/+1
| | | | | | | | | - put panic related functions into GHC.Utils.Panic - put trace related functions using DynFlags in GHC.Driver.Ppr One step closer making Outputable fully independent of DynFlags. Bump haddock submodule
* Remove platform constant wrappersSylvain Henry2020-07-251-22/+31
| | | | | | | | | | Platform constant wrappers took a DynFlags parameter, hence implicitly used the target platform constants. We removed them to allow support for several platforms at once (#14335) and to avoid having to pass the full DynFlags to every function (#17957). Metric Decrease: T4801
* Fix duplicated words and typos in comments and user guideJan Hrček2020-06-281-1/+1
|
* Update compilerSylvain Henry2020-06-171-1/+1
| | | | | | | | | | | | | | | | Thanks to ghc-bignum, the compiler can be simplified: * Types and constructors of Integer and Natural can be wired-in. It means that we don't have to query them from interfaces. It also means that numeric literals don't have to carry their type with them. * The same code is used whatever ghc-bignum backend is enabled. In particular, conversion of bignum literals into final Core expressions is now much more straightforward. Bignum closure inspection too. * GHC itself doesn't depend on any integer-* package anymore * The `integerLibrary` setting is gone.
* Clarify leaf module names for new module hierarchyTakenobu Tani2020-06-101-1/+1
| | | | | | | | | | | | | | | | | | | | | This updates comments only. This patch replaces leaf module names according to new module hierarchy [1][2] as followings: * Expand leaf names to easily find the module path: for instance, `Id.hs` to `GHC.Types.Id`. * Modify leaf names according to new module hierarchy: for instance, `Convert.hs` to `GHC.ThToHs`. * Fix typo: for instance, `GHC.Core.TyCo.Rep.hs` to `GHC.Core.TyCo.Rep` See also !3375 [1]: https://gitlab.haskell.org/ghc/ghc/-/wikis/Make-GHC-codebase-more-modular [2]: https://gitlab.haskell.org/ghc/ghc/issues/13009
* Unit: split and rename modulesSylvain Henry2020-04-301-1/+1
| | | | | | | Introduce GHC.Unit.* hierarchy for everything concerning units, packages and modules. Update Haddock submodule
* Modules: Utils and Data (#13009)Sylvain Henry2020-04-261-5/+5
| | | | | | | Update Haddock submodule Metric Increase: haddock.compiler
* Modules (#13009)Sylvain Henry2020-04-181-1/+1
| | | | | | | | | | | | | | * SysTools * Parser * GHC.Builtin * GHC.Iface.Recomp * Settings Update Haddock submodule Metric Decrease: Naperian parsing001
* Revert accidental change in 9462452Ömer Sinan Ağacan2020-04-031-2/+0
| | | | [ci skip]
* Improve and refactor StgToCmm codegen for DataCons.Andreas Klebinger2020-04-031-77/+170
| | | | | | | | | | | | | | | | We now differentiate three cases of constructor bindings: 1)Bindings which we can "replace" with a reference to an existing closure. Reference the replacement closure when accessing the binding. 2)Bindings which we can "replace" as above. But we still generate a closure which will be referenced by modules importing this binding. 3)For any other binding generate a closure. Then reference it. Before this patch 1) did only apply to local bindings and we didn't do 2) at all.
* Modules: Types (#13009)Sylvain Henry2020-03-291-4/+4
| | | | | | | Update Haddock submodule Metric Increase: haddock.compiler
* Refactoring: use Platform instead of DynFlags when possibleSylvain Henry2020-03-191-3/+4
| | | | | | | | Metric Decrease: ManyConstructors T12707 T13035 T1969
* Modules: Core (#13009)Sylvain Henry2020-03-161-1/+1
| | | | Update submodule: haddock
* Modules: Core (#13009)Sylvain Henry2020-02-261-1/+1
| | | | Update haddock submodule
* Modules: Driver (#13009)Sylvain Henry2020-02-211-1/+1
| | | | submodule updates: nofib, haddock
* Module hierarchy: ByteCode and Runtime (cf #13009)Sylvain Henry2020-02-121-1/+1
| | | | Update haddock submodule
* Do CafInfo/SRT analysis in CmmÖmer Sinan Ağacan2020-01-311-10/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch removes all CafInfo predictions and various hacks to preserve predicted CafInfos from the compiler and assigns final CafInfos to interface Ids after code generation. SRT analysis is extended to support static data, and Cmm generator is modified to allow generating static_link fields after SRT analysis. This also fixes `-fcatch-bottoms`, which introduces error calls in case expressions in CorePrep, which runs *after* CoreTidy (which is where we decide on CafInfos) and turns previously non-CAFFY things into CAFFY. Fixes #17648 Fixes #9718 Evaluation ========== NoFib ----- Boot with: `make boot mode=fast` Run: `make mode=fast EXTRA_RUNTEST_OPTS="-cachegrind" NoFibRuns=1` -------------------------------------------------------------------------------- Program Size Allocs Instrs Reads Writes -------------------------------------------------------------------------------- CS -0.0% 0.0% -0.0% -0.0% -0.0% CSD -0.0% 0.0% -0.0% -0.0% -0.0% FS -0.0% 0.0% -0.0% -0.0% -0.0% S -0.0% 0.0% -0.0% -0.0% -0.0% VS -0.0% 0.0% -0.0% -0.0% -0.0% VSD -0.0% 0.0% -0.0% -0.0% -0.5% VSM -0.0% 0.0% -0.0% -0.0% -0.0% anna -0.1% 0.0% -0.0% -0.0% -0.0% ansi -0.0% 0.0% -0.0% -0.0% -0.0% atom -0.0% 0.0% -0.0% -0.0% -0.0% awards -0.0% 0.0% -0.0% -0.0% -0.0% banner -0.0% 0.0% -0.0% -0.0% -0.0% bernouilli -0.0% 0.0% -0.0% -0.0% -0.0% binary-trees -0.0% 0.0% -0.0% -0.0% -0.0% boyer -0.0% 0.0% -0.0% -0.0% -0.0% boyer2 -0.0% 0.0% -0.0% -0.0% -0.0% bspt -0.0% 0.0% -0.0% -0.0% -0.0% cacheprof -0.0% 0.0% -0.0% -0.0% -0.0% calendar -0.0% 0.0% -0.0% -0.0% -0.0% cichelli -0.0% 0.0% -0.0% -0.0% -0.0% circsim -0.0% 0.0% -0.0% -0.0% -0.0% clausify -0.0% 0.0% -0.0% -0.0% -0.0% comp_lab_zift -0.0% 0.0% -0.0% -0.0% -0.0% compress -0.0% 0.0% -0.0% -0.0% -0.0% compress2 -0.0% 0.0% -0.0% -0.0% -0.0% constraints -0.0% 0.0% -0.0% -0.0% -0.0% cryptarithm1 -0.0% 0.0% -0.0% -0.0% -0.0% cryptarithm2 -0.0% 0.0% -0.0% -0.0% -0.0% cse -0.0% 0.0% -0.0% -0.0% -0.0% digits-of-e1 -0.0% 0.0% -0.0% -0.0% -0.0% digits-of-e2 -0.0% 0.0% -0.0% -0.0% -0.0% dom-lt -0.0% 0.0% -0.0% -0.0% -0.0% eliza -0.0% 0.0% -0.0% -0.0% -0.0% event -0.0% 0.0% -0.0% -0.0% -0.0% exact-reals -0.0% 0.0% -0.0% -0.0% -0.0% exp3_8 -0.0% 0.0% -0.0% -0.0% -0.0% expert -0.0% 0.0% -0.0% -0.0% -0.0% fannkuch-redux -0.0% 0.0% -0.0% -0.0% -0.0% fasta -0.0% 0.0% -0.0% -0.0% -0.0% fem -0.0% 0.0% -0.0% -0.0% -0.0% fft -0.0% 0.0% -0.0% -0.0% -0.0% fft2 -0.0% 0.0% -0.0% -0.0% -0.0% fibheaps -0.0% 0.0% -0.0% -0.0% -0.0% fish -0.0% 0.0% -0.0% -0.0% -0.0% fluid -0.1% 0.0% -0.0% -0.0% -0.0% fulsom -0.0% 0.0% -0.0% -0.0% -0.0% gamteb -0.0% 0.0% -0.0% -0.0% -0.0% gcd -0.0% 0.0% -0.0% -0.0% -0.0% gen_regexps -0.0% 0.0% -0.0% -0.0% -0.0% genfft -0.0% 0.0% -0.0% -0.0% -0.0% gg -0.0% 0.0% -0.0% -0.0% -0.0% grep -0.0% 0.0% -0.0% -0.0% -0.0% hidden -0.0% 0.0% -0.0% -0.0% -0.0% hpg -0.1% 0.0% -0.0% -0.0% -0.0% ida -0.0% 0.0% -0.0% -0.0% -0.0% infer -0.0% 0.0% -0.0% -0.0% -0.0% integer -0.0% 0.0% -0.0% -0.0% -0.0% integrate -0.0% 0.0% -0.0% -0.0% -0.0% k-nucleotide -0.0% 0.0% -0.0% -0.0% -0.0% kahan -0.0% 0.0% -0.0% -0.0% -0.0% knights -0.0% 0.0% -0.0% -0.0% -0.0% lambda -0.0% 0.0% -0.0% -0.0% -0.0% last-piece -0.0% 0.0% -0.0% -0.0% -0.0% lcss -0.0% 0.0% -0.0% -0.0% -0.0% life -0.0% 0.0% -0.0% -0.0% -0.0% lift -0.0% 0.0% -0.0% -0.0% -0.0% linear -0.1% 0.0% -0.0% -0.0% -0.0% listcompr -0.0% 0.0% -0.0% -0.0% -0.0% listcopy -0.0% 0.0% -0.0% -0.0% -0.0% maillist -0.0% 0.0% -0.0% -0.0% -0.0% mandel -0.0% 0.0% -0.0% -0.0% -0.0% mandel2 -0.0% 0.0% -0.0% -0.0% -0.0% mate -0.0% 0.0% -0.0% -0.0% -0.0% minimax -0.0% 0.0% -0.0% -0.0% -0.0% mkhprog -0.0% 0.0% -0.0% -0.0% -0.0% multiplier -0.0% 0.0% -0.0% -0.0% -0.0% n-body -0.0% 0.0% -0.0% -0.0% -0.0% nucleic2 -0.0% 0.0% -0.0% -0.0% -0.0% para -0.0% 0.0% -0.0% -0.0% -0.0% paraffins -0.0% 0.0% -0.0% -0.0% -0.0% parser -0.1% 0.0% -0.0% -0.0% -0.0% parstof -0.1% 0.0% -0.0% -0.0% -0.0% pic -0.0% 0.0% -0.0% -0.0% -0.0% pidigits -0.0% 0.0% -0.0% -0.0% -0.0% power -0.0% 0.0% -0.0% -0.0% -0.0% pretty -0.0% 0.0% -0.3% -0.4% -0.4% primes -0.0% 0.0% -0.0% -0.0% -0.0% primetest -0.0% 0.0% -0.0% -0.0% -0.0% prolog -0.0% 0.0% -0.0% -0.0% -0.0% puzzle -0.0% 0.0% -0.0% -0.0% -0.0% queens -0.0% 0.0% -0.0% -0.0% -0.0% reptile -0.0% 0.0% -0.0% -0.0% -0.0% reverse-complem -0.0% 0.0% -0.0% -0.0% -0.0% rewrite -0.0% 0.0% -0.0% -0.0% -0.0% rfib -0.0% 0.0% -0.0% -0.0% -0.0% rsa -0.0% 0.0% -0.0% -0.0% -0.0% scc -0.0% 0.0% -0.3% -0.5% -0.4% sched -0.0% 0.0% -0.0% -0.0% -0.0% scs -0.0% 0.0% -0.0% -0.0% -0.0% simple -0.1% 0.0% -0.0% -0.0% -0.0% solid -0.0% 0.0% -0.0% -0.0% -0.0% sorting -0.0% 0.0% -0.0% -0.0% -0.0% spectral-norm -0.0% 0.0% -0.0% -0.0% -0.0% sphere -0.0% 0.0% -0.0% -0.0% -0.0% symalg -0.0% 0.0% -0.0% -0.0% -0.0% tak -0.0% 0.0% -0.0% -0.0% -0.0% transform -0.0% 0.0% -0.0% -0.0% -0.0% treejoin -0.0% 0.0% -0.0% -0.0% -0.0% typecheck -0.0% 0.0% -0.0% -0.0% -0.0% veritas -0.0% 0.0% -0.0% -0.0% -0.0% wang -0.0% 0.0% -0.0% -0.0% -0.0% wave4main -0.0% 0.0% -0.0% -0.0% -0.0% wheel-sieve1 -0.0% 0.0% -0.0% -0.0% -0.0% wheel-sieve2 -0.0% 0.0% -0.0% -0.0% -0.0% x2n1 -0.0% 0.0% -0.0% -0.0% -0.0% -------------------------------------------------------------------------------- Min -0.1% 0.0% -0.3% -0.5% -0.5% Max -0.0% 0.0% -0.0% -0.0% -0.0% Geometric Mean -0.0% -0.0% -0.0% -0.0% -0.0% -------------------------------------------------------------------------------- Program Size Allocs Instrs Reads Writes -------------------------------------------------------------------------------- circsim -0.1% 0.0% -0.0% -0.0% -0.0% constraints -0.0% 0.0% -0.0% -0.0% -0.0% fibheaps -0.0% 0.0% -0.0% -0.0% -0.0% gc_bench -0.0% 0.0% -0.0% -0.0% -0.0% hash -0.0% 0.0% -0.0% -0.0% -0.0% lcss -0.0% 0.0% -0.0% -0.0% -0.0% power -0.0% 0.0% -0.0% -0.0% -0.0% spellcheck -0.0% 0.0% -0.0% -0.0% -0.0% -------------------------------------------------------------------------------- Min -0.1% 0.0% -0.0% -0.0% -0.0% Max -0.0% 0.0% -0.0% -0.0% -0.0% Geometric Mean -0.0% +0.0% -0.0% -0.0% -0.0% Manual inspection of programs in testsuite/tests/programs --------------------------------------------------------- I built these programs with a bunch of dump flags and `-O` and compared STG, Cmm, and Asm dumps and file sizes. (Below the numbers in parenthesis show number of modules in the program) These programs have identical compiler (same .hi and .o sizes, STG, and Cmm and Asm dumps): - Queens (1), andre_monad (1), cholewo-eval (2), cvh_unboxing (3), andy_cherry (7), fun_insts (1), hs-boot (4), fast2haskell (2), jl_defaults (1), jq_readsPrec (1), jules_xref (1), jtod_circint (4), jules_xref2 (1), lennart_range (1), lex (1), life_space_leak (1), bargon-mangler-bug (7), record_upd (1), rittri (1), sanders_array (1), strict_anns (1), thurston-module-arith (2), okeefe_neural (1), joao-circular (6), 10queens (1) Programs with different compiler outputs: - jl_defaults (1): For some reason GHC HEAD marks a lot of top-level `[Int]` closures as CAFFY for no reason. With this patch we no longer make them CAFFY and generate less SRT entries. For some reason Main.o is slightly larger with this patch (1.3%) and the executable sizes are the same. (I'd expect both to be smaller) - launchbury (1): Same as jl_defaults: top-level `[Int]` closures marked as CAFFY for no reason. Similarly `Main.o` is 1.4% larger but the executable sizes are the same. - galois_raytrace (13): Differences are in the Parse module. There are a lot, but some of the changes are caused by the fact that for some reason (I think a bug) GHC HEAD marks the dictionary for `Functor Identity` as CAFFY. Parse.o is 0.4% larger, the executable size is the same. - north_array: We now generate less SRT entries because some of array primops used in this program like `NewArrayOp` get eliminated during Stg-to-Cmm and turn some CAFFY things into non-CAFFY. Main.o gets 24% larger (9224 bytes from 9000 bytes), executable sizes are the same. - seward-space-leak: Difference in this program is better shown by this smaller example: module Lib where data CDS = Case [CDS] [(Int, CDS)] | Call CDS CDS instance Eq CDS where Case sels1 rets1 == Case sels2 rets2 = sels1 == sels2 && rets1 == rets2 Call a1 b1 == Call a2 b2 = a1 == a2 && b1 == b2 _ == _ = False In this program GHC HEAD builds a new SRT for the recursive group of `(==)`, `(/=)` and the dictionary closure. Then `/=` points to `==` in its SRT field, and `==` uses the SRT object as its SRT. With this patch we use the closure for `/=` as the SRT and add `==` there. Then `/=` gets an empty SRT field and `==` points to `/=` in its SRT field. This change looks fine to me. Main.o gets 0.07% larger, executable sizes are identical. head.hackage ------------ head.hackage's CI script builds 428 packages from Hackage using this patch with no failures. Compiler performance -------------------- The compiler perf tests report that the compiler allocates slightly more (worst case observed so far is 4%). However most programs in the test suite are small, single file programs. To benchmark compiler performance on something more realistic I build Cabal (the library, 236 modules) with different optimisation levels. For the "max residency" row I run GHC with `+RTS -s -A100k -i0 -h` for more accurate numbers. Other rows are generated with just `-s`. (This is because `-i0` causes running GC much more frequently and as a result "bytes copied" gets inflated by more than 25x in some cases) * -O0 | | GHC HEAD | This MR | Diff | | --------------- | -------------- | -------------- | ------ | | Bytes allocated | 54,413,350,872 | 54,701,099,464 | +0.52% | | Bytes copied | 4,926,037,184 | 4,990,638,760 | +1.31% | | Max residency | 421,225,624 | 424,324,264 | +0.73% | * -O1 | | GHC HEAD | This MR | Diff | | --------------- | --------------- | --------------- | ------ | | Bytes allocated | 245,849,209,992 | 246,562,088,672 | +0.28% | | Bytes copied | 26,943,452,560 | 27,089,972,296 | +0.54% | | Max residency | 982,643,440 | 991,663,432 | +0.91% | * -O2 | | GHC HEAD | This MR | Diff | | --------------- | --------------- | --------------- | ------ | | Bytes allocated | 291,044,511,408 | 291,863,910,912 | +0.28% | | Bytes copied | 37,044,237,616 | 36,121,690,472 | -2.49% | | Max residency | 1,071,600,328 | 1,086,396,256 | +1.38% | Extra compiler allocations -------------------------- Runtime allocations of programs are as reported above (NoFib section). The compiler now allocates more than before. Main source of allocation in this patch compared to base commit is the new SRT algorithm (GHC.Cmm.Info.Build). Below is some of the extra work we do with this patch, numbers generated by profiled stage 2 compiler when building a pathological case (the test 'ManyConstructors') with '-O2': - We now sort the final STG for a module, which means traversing the entire program, generating free variable set for each top-level binding, doing SCC analysis, and re-ordering the program. In ManyConstructors this step allocates 97,889,952 bytes. - We now do SRT analysis on static data, which in a program like ManyConstructors causes analysing 10,000 bindings that we would previously just skip. This step allocates 70,898,352 bytes. - We now maintain an SRT map for the entire module as we compile Cmm groups: data ModuleSRTInfo = ModuleSRTInfo { ... , moduleSRTMap :: SRTMap } (SRTMap is just a strict Map from the 'containers' library) This map gets an entry for most bindings in a module (exceptions are THUNKs and CAFFY static functions). For ManyConstructors this map gets 50015 entries. - Once we're done with code generation we generate a NameSet from SRTMap for the non-CAFFY names in the current module. This set gets the same number of entries as the SRTMap. - Finally we update CafInfos in ModDetails for the non-CAFFY Ids, using the NameSet generated in the previous step. This usually does the least amount of allocation among the work listed here. Only place with this patch where we do less work in the CAF analysis in the tidying pass (CoreTidy). However that doesn't save us much, as the pass still needs to traverse the whole program and update IdInfos for other reasons. Only thing we don't here do is the `hasCafRefs` pass over the RHS of bindings, which is a stateless pass that returns a boolean value, so it doesn't allocate much. (Metric changes blow are all increased allocations) Metric changes -------------- Metric Increase: ManyAlternatives ManyConstructors T13035 T14683 T1969 T9961
* Module hierarchy: Cmm (cf #13009)Sylvain Henry2020-01-251-5/+5
|
* Module hierarchy (#13009): StgSylvain Henry2019-12-311-2/+2
|
* Rename GHC.StgToCmm.Con -> GHC.StgToCmm.DataConBen Gamari2019-09-151-0/+285
Incredibly, Windows disallows the manipulation of any file matching Con(\..*)?. The `GHC.StgToCmm.Con` was introduced in the renamings in 447864a9, breaking the Windows build. Work around this by renaming it to `GHC.StgToCmm.DataCon` Fixes #17187.