summaryrefslogtreecommitdiff
path: root/compiler/GHC/StgToCmm/Closure.hs
Commit message (Collapse)AuthorAgeFilesLines
* Ticky profiling improvements.Matthew Pickering2022-03-021-3/+4
| | | | | | | | | | | | | | | | | | | | This adds a number of changes to ticky-ticky profiling. When an executable is profiled with IPE profiling it's now possible to associate id-related ticky counters to their source location. This works by emitting the info table address as part of the counter which can be looked up in the IPE table. Add a `-ticky-ap-thunk` flag. This flag prevents the use of some standard thunks which are precompiled into the RTS. This means reduced cache locality and increased code size. But it allows better attribution of execution cost to specific source locations instead of simple attributing it to the standard thunk. ticky-ticky now uses the `arg` field to emit additional information about counters in json format. When ticky-ticky is used in combination with the eventlog eventlog2html can be used to generate a html table from the eventlog similar to the old text output for ticky-ticky.
* CLabel cleanup:Andreas Klebinger2022-02-281-11/+2
| | | | | | | | Remove these smart constructors for these reasons: * mkLocalClosureTableLabel : Does the same as the non-local variant. * mkLocalClosureLabel : Does the same as the non-local variant. * mkLocalInfoTableLabel : Decide if we make a local label based on the name and just use mkInfoTableLabel everywhere.
* Tag inference work.Andreas Klebinger2022-02-121-9/+48
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This does three major things: * Enforce the invariant that all strict fields must contain tagged pointers. * Try to predict the tag on bindings in order to omit tag checks. * Allows functions to pass arguments unlifted (call-by-value). The former is "simply" achieved by wrapping any constructor allocations with a case which will evaluate the respective strict bindings. The prediction is done by a new data flow analysis based on the STG representation of a program. This also helps us to avoid generating redudant cases for the above invariant. StrictWorkers are created by W/W directly and SpecConstr indirectly. See the Note [Strict Worker Ids] Other minor changes: * Add StgUtil module containing a few functions needed by, but not specific to the tag analysis. ------------------------- Metric Decrease: T12545 T18698b T18140 T18923 LargeRecord Metric Increase: LargeRecord ManyAlternatives ManyConstructors T10421 T12425 T12707 T13035 T13056 T13253 T13253-spj T13379 T15164 T18282 T18304 T18698a T1969 T20049 T3294 T4801 T5321FD T5321Fun T783 T9233 T9675 T9961 T19695 WWRec -------------------------
* StgToCmm: decouple DynFlags, add StgToCmmConfigdoyougnu2022-01-311-24/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | StgToCmm: add Config, remove CgInfoDownwards StgToCmm: runC api change to take StgToCmmConfig StgToCmm: CgInfoDownad -> StgToCmmConfig StgToCmm.Monad: update getters/setters/withers StgToCmm: remove CallOpts in StgToCmm.Closure StgToCmm: remove dynflag references StgToCmm: PtrOpts removed StgToCmm: add TMap to config, Prof - dynflags StgToCmm: add omit yields to config StgToCmm.ExtCode: remove redundant import StgToCmm.Heap: remove references to dynflags StgToCmm: codeGen api change, DynFlags -> Config StgToCmm: remove dynflags in Env and StgToCmm StgToCmm.DataCon: remove dynflags references StgToCmm: remove dynflag references in DataCon StgToCmm: add backend avx flags to config StgToCmm.Prim: remove dynflag references StgToCmm.Expr: remove dynflag references StgToCmm.Bind: remove references to dynflags StgToCmm: move DoAlignSanitisation to Cmm.Type StgToCmm: remove PtrOpts in Cmm.Parser.y DynFlags: update ipInitCode api StgToCmm: Config Module is single source of truth StgToCmm: Lazy config breaks IORef deadlock testsuite: bump countdeps threshold StgToCmm.Config: strictify fields except UpdFrame Strictifying UpdFrameOffset causes the RTS build with stage1 to deadlock. Additionally, before the deadlock performance of the RTS is noticeably slower. StgToCmm.Config: add field descriptions StgToCmm: revert strictify on Module in config testsuite: update CountDeps tests StgToCmm: update comment, fix exports Specifically update comment about loopification passed into dynflags then stored into stgToCmmConfig. And remove getDynFlags from Monad.hs exports Types.Name: add pprFullName function StgToCmm.Ticky: use pprFullname, fixup ExtCode imports Cmm.Info: revert cmmGetClosureType removal StgToCmm.Bind: use pprFullName, Config update comments StgToCmm: update closureDescription api StgToCmm: SAT altHeapCheck StgToCmm: default render for Info table, ticky Use default rendering contexts for info table and ticky ticky, which should be independent of command line input. testsuite: bump count deps pprFullName: flag for ticky vs normal style output convertInfoProvMap: remove unused parameter StgToCmm.Config: add backend flags to config StgToCmm.Config: remove Backend from Config StgToCmm.Prim: refactor Backend call sites StgToCmm.Prim: remove redundant imports StgToCmm.Config: refactor vec compatibility check StgToCmm.Config: add allowQuotRem2 flag StgToCmm.Ticky: print internal names with parens StgToCmm.Bind: dispatch ppr based on externality StgToCmm: Add pprTickyname, Fix ticky naming Accidently removed the ctx for ticky SDoc output. The only relevant flag is sdocPprDebug which was accidental set to False due to using defaultSDocContext without altering the flag. StgToCmm: remove stateful fields in config fixup: config: remove redundant imports StgToCmm: move Sequel type to its own module StgToCmm: proliferate getCallMethod updated api StgToCmm.Monad: add FCodeState to Monad Api StgToCmm: add second reader monad to FCode fixup: Prim.hs: missed a merge conflict fixup: Match countDeps tests to HEAD StgToCmm.Monad: withState -> withCgState To disambiguate it from mtl withState. This withState shouldn't be returning the new state as a value. However, fixing this means tackling the knot tying in CgState and so is very difficult since it changes when the thunk of the knot is forced which either leads to deadlock or to compiler panic.
* Increase type sharingBen Gamari2021-11-171-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | Fixes #20541 by making mkTyConApp do more sharing of types. In particular, replace * BoxedRep Lifted ==> LiftedRep * BoxedRep Unlifted ==> UnliftedRep * TupleRep '[] ==> ZeroBitRep * TYPE ZeroBitRep ==> ZeroBitType In each case, the thing on the right is a type synonym for the thing on the left, declared in ghc-prim:GHC.Types. See Note [Using synonyms to compress types] in GHC.Core.Type. The synonyms for ZeroBitRep and ZeroBitType are new, but absolutely in the same spirit as the other ones. (These synonyms are mainly for internal use, though the programmer can use them too.) I also renamed GHC.Core.Ty.Rep.isVoidTy to isZeroBitTy, to be compatible with the "zero-bit" nomenclature above. See discussion on !6806. There is a tricky wrinkle: see GHC.Core.Types Note [Care using synonyms to compress types] Compiler allocation decreases by up to 0.8%.
* Support unlifted datatypes in GHCiLuite Stegeman2021-07-021-0/+4
| | | | fixes #19628
* Remove useless {-# LANGUAGE CPP #-} pragmasSylvain Henry2021-05-121-1/+1
|
* Fully remove HsVersions.hSylvain Henry2021-05-121-2/+0
| | | | | | | | | | Replace uses of WARN macro with calls to: warnPprTrace :: Bool -> SDoc -> a -> a Remove the now unused HsVersions.h Bump haddock submodule
* Replace CPP assertions with Haskell functionsSylvain Henry2021-05-121-9/+10
| | | | | | | | | | | | | | | There is no reason to use CPP. __LINE__ and __FILE__ macros are now better replaced with GHC's CallStack. As a bonus, assert error messages now contain more information (function name, column). Here is the mapping table (HasCallStack omitted): * ASSERT: assert :: Bool -> a -> a * MASSERT: massert :: Bool -> m () * ASSERTM: assertM :: m Bool -> m () * ASSERT2: assertPpr :: Bool -> SDoc -> a -> a * MASSERT2: massertPpr :: Bool -> SDoc -> m () * ASSERTM2: assertPprM :: m Bool -> SDoc -> m ()
* Add option to give each usage of a data constructor its own info tableMatthew Pickering2021-03-031-4/+3
| | | | | | | | | | | | | The `-fdistinct-constructor-tables` flag will generate a fresh info table for the usage of any data constructor. This is useful for debugging as now by inspecting the info table, you can determine which usage of a constructor caused that allocation rather than the old situation where the info table always mapped to the definition site of the data constructor which is useless. In conjunction with `-hi` and `-finfo-table-map` this gives a more fine grained understanding of where constructor allocations arise from in a program.
* Add -finfo-table-map which maps info tables to source positionsMatthew Pickering2021-03-031-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This new flag embeds a lookup table from the address of an info table to information about that info table. The main interface for consulting the map is the `lookupIPE` C function > InfoProvEnt * lookupIPE(StgInfoTable *info) The `InfoProvEnt` has the following structure: > typedef struct InfoProv_{ > char * table_name; > char * closure_desc; > char * ty_desc; > char * label; > char * module; > char * srcloc; > } InfoProv; > > typedef struct InfoProvEnt_ { > StgInfoTable * info; > InfoProv prov; > struct InfoProvEnt_ *link; > } InfoProvEnt; The source positions are approximated in a similar way to the source positions for DWARF debugging information. They are only approximate but in our experience provide a good enough hint about where the problem might be. It is therefore recommended to use this flag in conjunction with `-g<n>` for more accurate locations. The lookup table is also emitted into the eventlog when it is available as it is intended to be used with the `-hi` profiling mode. Using this flag will significantly increase the size of the resulting object file but only by a factor of 2-3x in our experience.
* The Char kind (#11342)Daniel Rogozin2021-02-061-0/+1
| | | | | | | | | | | | | | | | | | | | | | Co-authored-by: Rinat Stryungis <rinat.stryungis@serokell.io> Implement GHC Proposal #387 * Parse char literals 'x' at the type level * New built-in type families CmpChar, ConsSymbol, UnconsSymbol * New KnownChar class (cf. KnownSymbol and KnownNat) * New SomeChar type (cf. SomeSymbol and SomeNat) * CharTyLit support in template-haskell Updated submodules: binary, haddock. Metric Decrease: T5205 haddock.base Metric Increase: Naperian T13035
* Generalize OutputablePSylvain Henry2020-09-171-1/+4
| | | | | Add a type parameter for the environment required by OutputableP. It avoids tying Platform with OutputableP.
* Introduce OutputablePSylvain Henry2020-09-171-4/+10
| | | | | | | | | | | | | | | | | | | | | | | | | Some types need a Platform value to be pretty-printed: CLabel, Cmm types, instructions, etc. Before this patch they had an Outputable instance and the Platform value was obtained via sdocWithDynFlags. It meant that the *renderer* of the SDoc was responsible of passing the appropriate Platform value (e.g. via the DynFlags given to showSDoc). It put the burden of passing the Platform value on the renderer while the generator of the SDoc knows the Platform it is generating the SDoc for and there is no point passing a different Platform at rendering time. With this patch, we introduce a new OutputableP class: class OutputableP a where pdoc :: Platform -> a -> SDoc With this class we still have some polymorphism as we have with `ppr` (i.e. we can use `pdoc` on a variety of types instead of having a dedicated `pprXXX` function for each XXX type). One step closer removing `sdocWithDynFlags` (#10143) and supporting several platforms (#14335).
* DynFlags: disentangle OutputableSylvain Henry2020-08-121-0/+1
| | | | | | | | | - put panic related functions into GHC.Utils.Panic - put trace related functions using DynFlags in GHC.Driver.Ppr One step closer making Outputable fully independent of DynFlags. Bump haddock submodule
* Refactor CLabel pretty-printingSylvain Henry2020-07-311-6/+6
| | | | | | | | Pretty-printing CLabel relies on sdocWithDynFlags that we want to remove (#10143, #17957). It uses it to query the backend and the platform. This patch exposes Clabel ppr functions specialised for each backend so that backend code can directly use them.
* Remove platform constant wrappersSylvain Henry2020-07-251-69/+72
| | | | | | | | | | Platform constant wrappers took a DynFlags parameter, hence implicitly used the target platform constants. We removed them to allow support for several platforms at once (#14335) and to avoid having to pass the full DynFlags to every function (#17957). Metric Decrease: T4801
* Replace Opt_SccProfilingOn flag with sccProfilingEnabled helper functionSylvain Henry2020-07-021-5/+5
| | | | | | | | | | | SCC profiling was enabled in a convoluted way: if WayProf was enabled, Opt_SccProfilingOn general flag was set (in `GHC.Driver.Ways.wayGeneralFlags`), and then this flag was queried in various places. There is no need to go via general flags, so this patch defines a `sccProfilingEnabled :: DynFlags -> Bool` helper function that just checks whether WayProf is enabled.
* Clean up haddock hyperlinks of GHC.* (part2)Takenobu Tani2020-06-251-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | This updates haddock comments only. This patch focuses to update for hyperlinks in GHC API's haddock comments, because broken links especially discourage newcomers. This includes the following hierarchies: - GHC.Iface.* - GHC.Llvm.* - GHC.Rename.* - GHC.Tc.* - GHC.HsToCore.* - GHC.StgToCmm.* - GHC.CmmToAsm.* - GHC.Runtime.* - GHC.Unit.* - GHC.Utils.* - GHC.SysTools.*
* Move tablesNextToCode field into PlatformSylvain Henry2020-06-181-14/+19
| | | | | | | tablesNextToCode is a platform setting and doesn't belong into DynFlags (#17957). Doing this is also a prerequisite to fix #14335 where we deal with two platforms (target and host) that may have different platform settings.
* Move CLabel assertions into smart constructors (#17957)Sylvain Henry2020-06-171-9/+9
| | | | | It avoids using DynFlags in the Outputable instance of Clabel to check assertions at pretty-printing time.
* Cross-module LambdaFormInfo passingÖmer Sinan Ağacan2020-06-101-82/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | - Store LambdaFormInfos of exported Ids in interface files - Use them in importing modules This is for optimization purposes: if we know LambdaFormInfo of imported Ids we can generate more efficient calling code, see `getCallMethod`. Exporting (putting them in interface files or in ModDetails) and importing (reading them from interface files) are both optional. We don't assume known LambdaFormInfos anywhere and do not change how we call Ids with unknown LambdaFormInfos. Runtime, allocation, and residency numbers when building Cabal-the-library (commit 0d4ee7ba3): (Log and .hp files are in the MR: !2842) | | GHC HEAD | This patch | Diff | |-----|----------|------------|----------------| | -O0 | 0:35.89 | 0:34.10 | -1.78s, -4.98% | | -O1 | 2:24.01 | 2:23.62 | -0.39s, -0.27% | | -O2 | 2:52.23 | 2:51.35 | -0.88s, -0.51% | | | GHC HEAD | This patch | Diff | |-----|-----------------|-----------------|----------------------------| | -O0 | 54,843,608,416 | 54,878,769,544 | +35,161,128 bytes, +0.06% | | -O1 | 227,136,076,400 | 227,569,045,168 | +432,968,768 bytes, +0.19% | | -O2 | 266,147,063,296 | 266,749,643,440 | +602,580,144 bytes, +0.22% | NOTE: Residency is measured with extra runtime args: `-i0 -h` which effectively turn all GCs into major GCs, and do GC more often. | | GHC HEAD | This patch | Diff | |-----|----------------------------|------------------------------|----------------------------| | -O0 | 410,284,000 (910 samples) | 411,745,008 (906 samples) | +1,461,008 bytes, +0.35% | | -O1 | 928,580,856 (2109 samples) | 943,506,552 (2103 samples) | +14,925,696 bytes, +1.60% | | -O2 | 993,951,352 (2549 samples) | 1,010,156,328 (2545 samples) | +16,204,9760 bytes, +1.63% | NoFib results: -------------------------------------------------------------------------------- Program Size Allocs Instrs Reads Writes -------------------------------------------------------------------------------- CS 0.0% 0.0% +0.0% +0.0% +0.0% CSD 0.0% 0.0% 0.0% +0.0% +0.0% FS 0.0% 0.0% +0.0% +0.0% +0.0% S 0.0% 0.0% +0.0% +0.0% +0.0% VS 0.0% 0.0% +0.0% +0.0% +0.0% VSD 0.0% 0.0% +0.0% +0.0% +0.1% VSM 0.0% 0.0% +0.0% +0.0% +0.0% anna 0.0% 0.0% -0.3% -0.8% -0.0% ansi 0.0% 0.0% -0.0% -0.0% 0.0% atom 0.0% 0.0% -0.0% -0.0% 0.0% awards 0.0% 0.0% -0.1% -0.3% 0.0% banner 0.0% 0.0% -0.0% -0.0% -0.0% bernouilli 0.0% 0.0% -0.0% -0.0% -0.0% binary-trees 0.0% 0.0% -0.0% -0.0% +0.0% boyer 0.0% 0.0% -0.0% -0.0% 0.0% boyer2 0.0% 0.0% -0.0% -0.0% 0.0% bspt 0.0% 0.0% -0.0% -0.2% 0.0% cacheprof 0.0% 0.0% -0.1% -0.4% +0.0% calendar 0.0% 0.0% -0.0% -0.0% 0.0% cichelli 0.0% 0.0% -0.9% -2.4% 0.0% circsim 0.0% 0.0% -0.0% -0.0% 0.0% clausify 0.0% 0.0% -0.1% -0.3% 0.0% comp_lab_zift 0.0% 0.0% -0.0% -0.0% +0.0% compress 0.0% 0.0% -0.0% -0.0% -0.0% compress2 0.0% 0.0% -0.0% -0.0% 0.0% constraints 0.0% 0.0% -0.1% -0.2% -0.0% cryptarithm1 0.0% 0.0% -0.0% -0.0% 0.0% cryptarithm2 0.0% 0.0% -1.4% -4.1% -0.0% cse 0.0% 0.0% -0.0% -0.0% -0.0% digits-of-e1 0.0% 0.0% -0.0% -0.0% -0.0% digits-of-e2 0.0% 0.0% -0.0% -0.0% -0.0% dom-lt 0.0% 0.0% -0.1% -0.2% 0.0% eliza 0.0% 0.0% -0.5% -1.5% 0.0% event 0.0% 0.0% -0.0% -0.0% -0.0% exact-reals 0.0% 0.0% -0.1% -0.3% +0.0% exp3_8 0.0% 0.0% -0.0% -0.0% -0.0% expert 0.0% 0.0% -0.3% -1.0% -0.0% fannkuch-redux 0.0% 0.0% +0.0% +0.0% +0.0% fasta 0.0% 0.0% -0.0% -0.0% +0.0% fem 0.0% 0.0% -0.0% -0.0% 0.0% fft 0.0% 0.0% -0.0% -0.0% 0.0% fft2 0.0% 0.0% -0.0% -0.0% 0.0% fibheaps 0.0% 0.0% -0.0% -0.0% +0.0% fish 0.0% 0.0% 0.0% -0.0% +0.0% fluid 0.0% 0.0% -0.4% -1.2% +0.0% fulsom 0.0% 0.0% -0.0% -0.0% 0.0% gamteb 0.0% 0.0% -0.1% -0.3% 0.0% gcd 0.0% 0.0% -0.0% -0.0% 0.0% gen_regexps 0.0% 0.0% -0.0% -0.0% -0.0% genfft 0.0% 0.0% -0.0% -0.0% 0.0% gg 0.0% 0.0% -0.0% -0.0% +0.0% grep 0.0% 0.0% -0.0% -0.0% -0.0% hidden 0.0% 0.0% -0.1% -0.4% -0.0% hpg 0.0% 0.0% -0.2% -0.5% +0.0% ida 0.0% 0.0% -0.0% -0.0% +0.0% infer 0.0% 0.0% -0.3% -0.8% -0.0% integer 0.0% 0.0% -0.0% -0.0% +0.0% integrate 0.0% 0.0% -0.0% -0.0% 0.0% k-nucleotide 0.0% 0.0% -0.0% -0.0% +0.0% kahan 0.0% 0.0% -0.0% -0.0% +0.0% knights 0.0% 0.0% -2.2% -5.4% 0.0% lambda 0.0% 0.0% -0.6% -1.8% 0.0% last-piece 0.0% 0.0% -0.0% -0.0% 0.0% lcss 0.0% 0.0% -0.0% -0.1% 0.0% life 0.0% 0.0% -0.0% -0.1% 0.0% lift 0.0% 0.0% -0.2% -0.6% +0.0% linear 0.0% 0.0% -0.0% -0.0% -0.0% listcompr 0.0% 0.0% -0.0% -0.0% 0.0% listcopy 0.0% 0.0% -0.0% -0.0% 0.0% maillist 0.0% 0.0% -0.1% -0.3% +0.0% mandel 0.0% 0.0% -0.0% -0.0% 0.0% mandel2 0.0% 0.0% -0.0% -0.0% -0.0% mate +0.0% 0.0% -0.0% -0.0% -0.0% minimax 0.0% 0.0% -0.2% -1.0% 0.0% mkhprog 0.0% 0.0% -0.1% -0.2% -0.0% multiplier 0.0% 0.0% -0.0% -0.0% -0.0% n-body 0.0% 0.0% -0.0% -0.0% +0.0% nucleic2 0.0% 0.0% -0.1% -0.2% 0.0% para 0.0% 0.0% -0.0% -0.0% -0.0% paraffins 0.0% 0.0% -0.0% -0.0% 0.0% parser 0.0% 0.0% -0.2% -0.7% 0.0% parstof 0.0% 0.0% -0.0% -0.0% +0.0% pic 0.0% 0.0% -0.0% -0.0% 0.0% pidigits 0.0% 0.0% +0.0% +0.0% +0.0% power 0.0% 0.0% -0.2% -0.6% +0.0% pretty 0.0% 0.0% -0.0% -0.0% -0.0% primes 0.0% 0.0% -0.0% -0.0% 0.0% primetest 0.0% 0.0% -0.0% -0.0% -0.0% prolog 0.0% 0.0% -0.3% -1.1% 0.0% puzzle 0.0% 0.0% -0.0% -0.0% 0.0% queens 0.0% 0.0% -0.0% -0.0% +0.0% reptile 0.0% 0.0% -0.0% -0.0% 0.0% reverse-complem 0.0% 0.0% -0.0% -0.0% +0.0% rewrite 0.0% 0.0% -0.7% -2.5% -0.0% rfib 0.0% 0.0% -0.0% -0.0% 0.0% rsa 0.0% 0.0% -0.0% -0.0% 0.0% scc 0.0% 0.0% -0.1% -0.2% -0.0% sched 0.0% 0.0% -0.0% -0.0% -0.0% scs 0.0% 0.0% -1.0% -2.6% +0.0% simple 0.0% 0.0% +0.0% -0.0% +0.0% solid 0.0% 0.0% -0.0% -0.0% 0.0% sorting 0.0% 0.0% -0.6% -1.6% 0.0% spectral-norm 0.0% 0.0% +0.0% 0.0% +0.0% sphere 0.0% 0.0% -0.0% -0.0% -0.0% symalg 0.0% 0.0% -0.0% -0.0% +0.0% tak 0.0% 0.0% -0.0% -0.0% 0.0% transform 0.0% 0.0% -0.0% -0.0% 0.0% treejoin 0.0% 0.0% -0.0% -0.0% 0.0% typecheck 0.0% 0.0% -0.0% -0.0% +0.0% veritas +0.0% 0.0% -0.2% -0.4% +0.0% wang 0.0% 0.0% -0.0% -0.0% 0.0% wave4main 0.0% 0.0% -0.0% -0.0% -0.0% wheel-sieve1 0.0% 0.0% -0.0% -0.0% -0.0% wheel-sieve2 0.0% 0.0% -0.0% -0.0% +0.0% x2n1 0.0% 0.0% -0.0% -0.0% -0.0% -------------------------------------------------------------------------------- Min 0.0% 0.0% -2.2% -5.4% -0.0% Max +0.0% 0.0% +0.0% +0.0% +0.1% Geometric Mean -0.0% -0.0% -0.1% -0.3% +0.0% Metric increases micro benchmarks tracked in #17686: Metric Increase: T12150 T12234 T12425 T13035 T5837 T6048 T9233 Co-authored-by: Andreas Klebinger <klebinger.andreas@gmx.at>
* Clarify leaf module names for new module hierarchyTakenobu Tani2020-06-101-1/+1
| | | | | | | | | | | | | | | | | | | | | This updates comments only. This patch replaces leaf module names according to new module hierarchy [1][2] as followings: * Expand leaf names to easily find the module path: for instance, `Id.hs` to `GHC.Types.Id`. * Modify leaf names according to new module hierarchy: for instance, `Convert.hs` to `GHC.ThToHs`. * Fix typo: for instance, `GHC.Core.TyCo.Rep.hs` to `GHC.Core.TyCo.Rep` See also !3375 [1]: https://gitlab.haskell.org/ghc/ghc/-/wikis/Make-GHC-codebase-more-modular [2]: https://gitlab.haskell.org/ghc/ghc/issues/13009
* Remove OneShotInfo field of LFReEntrant, document OneShotInfoÖmer Sinan Ağacan2020-04-301-25/+18
| | | | | | | | | | | The field is only used in withNewTickyCounterFun and it's easier to directly pass a parameter for one-shot info to withNewTickyCounterFun instead of passing it via LFReEntrant. This also makes !2842 simpler. Other changes: - New Note (by SPJ) [OneShotInfo overview] added. - Arity argument of thunkCode removed as it's always 0.
* Modules: Utils and Data (#13009)Sylvain Henry2020-04-261-3/+3
| | | | | | | Update Haddock submodule Metric Increase: haddock.compiler
* Modules: type-checker (#13009)Sylvain Henry2020-04-071-2/+2
| | | | Update Haddock submodule
* Modules: Types (#13009)Sylvain Henry2020-03-291-5/+5
| | | | | | | Update Haddock submodule Metric Increase: haddock.compiler
* Modules: Core (#13009)Sylvain Henry2020-03-161-4/+4
| | | | Update submodule: haddock
* Modules: Driver (#13009)Sylvain Henry2020-02-211-1/+1
| | | | submodule updates: nofib, haddock
* Module hierarchy: ByteCode and Runtime (cf #13009)Sylvain Henry2020-02-121-1/+1
| | | | Update haddock submodule
* Module hierarchy: Cmm (cf #13009)Sylvain Henry2020-01-251-5/+5
|
* Module hierarchy (#13009): StgSylvain Henry2019-12-311-5/+5
|
* Implement pointer tagging for big families (#14373)Gabor Greif2019-12-061-7/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Formerly we punted on these and evaluated constructors always got a tag of 1. We now cascade switches because we have to check the tag first and when it is MAX_PTR_TAG then get the precise tag from the info table and switch on that. The only technically tricky part is that the default case needs (logical) duplication. To do this we emit an extra label for it and branch to that from the second switch. This avoids duplicated codegen. Here's a simple example of the new code gen: data D = D1 | D2 | D3 | D4 | D5 | D6 | D7 | D8 On a 64-bit system previously all constructors would be tagged 1. With the new code gen D7 and D8 are tagged 7: [Lib.D7_con_entry() { ... {offset c1eu: // global R1 = R1 + 7; call (P64[Sp])(R1) args: 8, res: 0, upd: 8; } }] [Lib.D8_con_entry() { ... {offset c1ez: // global R1 = R1 + 7; call (P64[Sp])(R1) args: 8, res: 0, upd: 8; } }] When switching we now look at the info table only when the tag is 7. For example, if we derive Enum for the type above, the Cmm looks like this: c2Le: _s2Js::P64 = R1; _c2Lq::P64 = _s2Js::P64 & 7; switch [1 .. 7] _c2Lq::P64 { case 1 : goto c2Lk; case 2 : goto c2Ll; case 3 : goto c2Lm; case 4 : goto c2Ln; case 5 : goto c2Lo; case 6 : goto c2Lp; case 7 : goto c2Lj; } // Read info table for tag c2Lj: _c2Lv::I64 = %MO_UU_Conv_W32_W64(I32[I64[_s2Js::P64 & (-8)] - 4]); if (_c2Lv::I64 != 6) goto c2Lu; else goto c2Lt; Generated Cmm sizes do not change too much, but binaries are very slightly larger, due to the fact that the new instructions are longer in encoded form. E.g. previously entry code for D8 above would be 00000000000001c0 <Lib_D8_con_info>: 1c0: 48 ff c3 inc %rbx 1c3: ff 65 00 jmpq *0x0(%rbp) With this patch 00000000000001d0 <Lib_D8_con_info>: 1d0: 48 83 c3 07 add $0x7,%rbx 1d4: ff 65 00 jmpq *0x0(%rbp) This is one byte longer. Secondly, reading info table directly and then switching is shorter _c1co: movq -1(%rbx),%rax movl -4(%rax),%eax // Switch on info table tag jmp *_n1d5(,%rax,8) than doing the same switch, and then for the tag 7 doing another switch: // When tag is 7 _c1ct: andq $-8,%rbx movq (%rbx),%rax movl -4(%rax),%eax // Switch on info table tag ... Some changes of binary sizes in actual programs: - In NoFib the worst case is 0.1% increase in benchmark "parser" (see NoFib results below). All programs get slightly larger. - Stage 2 compiler size does not change. - In "containers" (the library) size of all object files increases 0.0005%. Size of the test program "bitqueue-properties" increases 0.03%. nofib benchmarks kindly provided by Ömer (@osa1): NoFib Results ============= -------------------------------------------------------------------------------- Program Size Allocs Instrs Reads Writes -------------------------------------------------------------------------------- CS +0.0% 0.0% -0.0% -0.0% -0.0% CSD +0.0% 0.0% 0.0% +0.0% +0.0% FS +0.0% 0.0% 0.0% +0.0% 0.0% S +0.0% 0.0% -0.0% 0.0% 0.0% VS +0.0% 0.0% -0.0% +0.0% +0.0% VSD +0.0% 0.0% -0.0% +0.0% -0.0% VSM +0.0% 0.0% 0.0% 0.0% 0.0% anna +0.0% 0.0% +0.1% -0.9% -0.0% ansi +0.0% 0.0% -0.0% +0.0% +0.0% atom +0.0% 0.0% 0.0% 0.0% 0.0% awards +0.0% 0.0% -0.0% +0.0% 0.0% banner +0.0% 0.0% -0.0% +0.0% 0.0% bernouilli +0.0% 0.0% +0.0% +0.0% +0.0% binary-trees +0.0% 0.0% -0.0% -0.0% -0.0% boyer +0.0% 0.0% +0.0% 0.0% -0.0% boyer2 +0.0% 0.0% +0.0% 0.0% -0.0% bspt +0.0% 0.0% +0.0% +0.0% 0.0% cacheprof +0.0% 0.0% +0.1% -0.8% 0.0% calendar +0.0% 0.0% -0.0% +0.0% -0.0% cichelli +0.0% 0.0% +0.0% 0.0% 0.0% circsim +0.0% 0.0% -0.0% -0.1% -0.0% clausify +0.0% 0.0% +0.0% +0.0% 0.0% comp_lab_zift +0.0% 0.0% +0.0% 0.0% -0.0% compress +0.0% 0.0% +0.0% +0.0% 0.0% compress2 +0.0% 0.0% 0.0% 0.0% 0.0% constraints +0.0% 0.0% -0.0% -0.0% -0.0% cryptarithm1 +0.0% 0.0% +0.0% 0.0% 0.0% cryptarithm2 +0.0% 0.0% +0.0% -0.0% 0.0% cse +0.0% 0.0% +0.0% +0.0% 0.0% digits-of-e1 +0.0% 0.0% -0.0% -0.0% -0.0% digits-of-e2 +0.0% 0.0% +0.0% -0.0% -0.0% dom-lt +0.0% 0.0% +0.0% +0.0% 0.0% eliza +0.0% 0.0% -0.0% +0.0% 0.0% event +0.0% 0.0% -0.0% -0.0% -0.0% exact-reals +0.0% 0.0% +0.0% +0.0% +0.0% exp3_8 +0.0% 0.0% -0.0% -0.0% -0.0% expert +0.0% 0.0% +0.0% +0.0% +0.0% fannkuch-redux +0.0% 0.0% +0.0% 0.0% 0.0% fasta +0.0% 0.0% -0.0% -0.0% -0.0% fem +0.0% 0.0% +0.0% +0.0% +0.0% fft +0.0% 0.0% +0.0% -0.0% -0.0% fft2 +0.0% 0.0% +0.0% +0.0% +0.0% fibheaps +0.0% 0.0% +0.0% +0.0% 0.0% fish +0.0% 0.0% +0.0% +0.0% 0.0% fluid +0.0% 0.0% +0.0% +0.0% +0.0% fulsom +0.0% 0.0% +0.0% -0.0% +0.0% gamteb +0.0% 0.0% +0.0% -0.0% -0.0% gcd +0.0% 0.0% +0.0% +0.0% 0.0% gen_regexps +0.0% 0.0% +0.0% -0.0% -0.0% genfft +0.0% 0.0% -0.0% -0.0% -0.0% gg +0.0% 0.0% 0.0% -0.0% 0.0% grep +0.0% 0.0% +0.0% +0.0% +0.0% hidden +0.0% 0.0% +0.0% -0.0% -0.0% hpg +0.0% 0.0% +0.0% -0.1% -0.0% ida +0.0% 0.0% +0.0% -0.0% -0.0% infer +0.0% 0.0% -0.0% -0.0% -0.0% integer +0.0% 0.0% -0.0% -0.0% -0.0% integrate +0.0% 0.0% 0.0% +0.0% 0.0% k-nucleotide +0.0% 0.0% -0.0% -0.0% -0.0% kahan +0.0% 0.0% -0.0% -0.0% -0.0% knights +0.0% 0.0% +0.0% -0.0% -0.0% lambda +0.0% 0.0% +1.2% -6.1% -0.0% last-piece +0.0% 0.0% +0.0% -0.0% -0.0% lcss +0.0% 0.0% +0.0% -0.0% -0.0% life +0.0% 0.0% +0.0% -0.0% -0.0% lift +0.0% 0.0% +0.0% +0.0% 0.0% linear +0.0% 0.0% +0.0% +0.0% +0.0% listcompr +0.0% 0.0% -0.0% -0.0% -0.0% listcopy +0.0% 0.0% -0.0% -0.0% -0.0% maillist +0.0% 0.0% +0.0% -0.0% -0.0% mandel +0.0% 0.0% +0.0% +0.0% +0.0% mandel2 +0.0% 0.0% +0.0% +0.0% -0.0% mate +0.0% 0.0% +0.0% +0.0% +0.0% minimax +0.0% 0.0% -0.0% +0.0% -0.0% mkhprog +0.0% 0.0% +0.0% +0.0% +0.0% multiplier +0.0% 0.0% 0.0% +0.0% -0.0% n-body +0.0% 0.0% +0.0% -0.0% -0.0% nucleic2 +0.0% 0.0% +0.0% +0.0% -0.0% para +0.0% 0.0% +0.0% +0.0% +0.0% paraffins +0.0% 0.0% +0.0% +0.0% +0.0% parser +0.1% 0.0% +0.4% -1.7% -0.0% parstof +0.0% 0.0% -0.0% -0.0% -0.0% pic +0.0% 0.0% +0.0% 0.0% -0.0% pidigits +0.0% 0.0% -0.0% -0.0% -0.0% power +0.0% 0.0% +0.0% -0.0% -0.0% pretty +0.0% 0.0% +0.0% +0.0% +0.0% primes +0.0% 0.0% +0.0% 0.0% 0.0% primetest +0.0% 0.0% +0.0% +0.0% +0.0% prolog +0.0% 0.0% +0.0% +0.0% +0.0% puzzle +0.0% 0.0% +0.0% +0.0% +0.0% queens +0.0% 0.0% 0.0% +0.0% +0.0% reptile +0.0% 0.0% +0.0% +0.0% 0.0% reverse-complem +0.0% 0.0% -0.0% -0.0% -0.0% rewrite +0.0% 0.0% +0.0% 0.0% -0.0% rfib +0.0% 0.0% +0.0% +0.0% +0.0% rsa +0.0% 0.0% +0.0% +0.0% +0.0% scc +0.0% 0.0% +0.0% +0.0% +0.0% sched +0.0% 0.0% +0.0% +0.0% +0.0% scs +0.0% 0.0% +0.0% +0.0% 0.0% simple +0.0% 0.0% +0.0% +0.0% +0.0% solid +0.0% 0.0% +0.0% +0.0% 0.0% sorting +0.0% 0.0% +0.0% -0.0% 0.0% spectral-norm +0.0% 0.0% -0.0% -0.0% -0.0% sphere +0.0% 0.0% +0.0% -1.0% 0.0% symalg +0.0% 0.0% +0.0% +0.0% +0.0% tak +0.0% 0.0% +0.0% +0.0% +0.0% transform +0.0% 0.0% +0.4% -1.3% +0.0% treejoin +0.0% 0.0% +0.0% -0.0% 0.0% typecheck +0.0% 0.0% -0.0% +0.0% 0.0% veritas +0.0% 0.0% +0.0% -0.1% +0.0% wang +0.0% 0.0% +0.0% +0.0% +0.0% wave4main +0.0% 0.0% +0.0% 0.0% -0.0% wheel-sieve1 +0.0% 0.0% +0.0% +0.0% +0.0% wheel-sieve2 +0.0% 0.0% +0.0% +0.0% 0.0% x2n1 +0.0% 0.0% +0.0% +0.0% 0.0% -------------------------------------------------------------------------------- Min +0.0% 0.0% -0.0% -6.1% -0.0% Max +0.1% 0.0% +1.2% +0.0% +0.0% Geometric Mean +0.0% -0.0% +0.0% -0.1% -0.0% NoFib GC Results ================ -------------------------------------------------------------------------------- Program Size Allocs Instrs Reads Writes -------------------------------------------------------------------------------- circsim +0.0% 0.0% -0.0% -0.0% -0.0% constraints +0.0% 0.0% -0.0% 0.0% -0.0% fibheaps +0.0% 0.0% 0.0% -0.0% -0.0% fulsom +0.0% 0.0% 0.0% -0.6% -0.0% gc_bench +0.0% 0.0% 0.0% 0.0% -0.0% hash +0.0% 0.0% -0.0% -0.0% -0.0% lcss +0.0% 0.0% 0.0% -0.0% 0.0% mutstore1 +0.0% 0.0% 0.0% -0.0% -0.0% mutstore2 +0.0% 0.0% +0.0% -0.0% -0.0% power +0.0% 0.0% -0.0% 0.0% -0.0% spellcheck +0.0% 0.0% -0.0% -0.0% -0.0% -------------------------------------------------------------------------------- Min +0.0% 0.0% -0.0% -0.6% -0.0% Max +0.0% 0.0% +0.0% 0.0% 0.0% Geometric Mean +0.0% +0.0% +0.0% -0.1% +0.0% Fixes #14373 These performance regressions appear to be a fluke in CI. See the discussion in !1742 for details. Metric Increase: T6048 T12234 T12425 Naperian T12150 T5837 T13035
* Clean up `#include`s in the compilerJohn Ericson2019-10-051-2/+0
| | | | | | | | - Remove unneeded ones - Use <..> for inter-package. Besides general clean up, helps distinguish between the RTS we link against vs the RTS we compile for.
* Module hierarchy: StgToCmm (#13009)Sylvain Henry2019-09-101-0/+1008
Add StgToCmm module hierarchy. Platform modules that are used in several other places (NCG, LLVM codegen, Cmm transformations) are put into GHC.Platform.