|  | Commit message | Author | Age | Files | Lines | 
|---|
| | 
| | This adds a number of changes to ticky-ticky profiling.
When an executable is profiled with IPE profiling it's now possible to
associate id-related ticky counters to their source location.
This works by emitting the info table address as part of the counter
which can be looked up in the IPE table.
Add a `-ticky-ap-thunk` flag. This flag prevents the use of some standard thunks
which are precompiled into the RTS. This means reduced cache locality
and increased code size. But it allows better attribution of execution
cost to specific source locations instead of simply attributing it to
the standard thunk.
ticky-ticky now uses the `arg` field to emit additional information
about counters in JSON format. When ticky-ticky is used in combination
with the eventlog, eventlog2html can generate an HTML table from the
eventlog, similar to the old text output for ticky-ticky. | 
| | 
| | Remove these smart constructors for these reasons:
* mkLocalClosureTableLabel : Does the same as the non-local variant.
* mkLocalClosureLabel      : Does the same as the non-local variant.
* mkLocalInfoTableLabel    : Decide if we make a local label based on the name
                             and just use mkInfoTableLabel everywhere. | 
| | 
| | This does three major things:
* Enforce the invariant that all strict fields must contain tagged
pointers.
* Try to predict the tag on bindings in order to omit tag checks.
* Allow functions to pass arguments unlifted (call-by-value).
The first is "simply" achieved by wrapping any constructor allocations with
a case which will evaluate the respective strict bindings.
The prediction is done by a new data flow analysis based on the STG
representation of a program. This also helps us to avoid generating
redundant cases for the above invariant.
StrictWorkers are created by W/W directly and SpecConstr indirectly.
See the Note [Strict Worker Ids].
Other minor changes:
* Add StgUtil module containing a few functions needed by, but
  not specific to the tag analysis.
-------------------------
Metric Decrease:
	T12545
	T18698b
	T18140
	T18923
        LargeRecord
Metric Increase:
        LargeRecord
	ManyAlternatives
	ManyConstructors
	T10421
	T12425
	T12707
	T13035
	T13056
	T13253
	T13253-spj
	T13379
	T15164
	T18282
	T18304
	T18698a
	T1969
	T20049
	T3294
	T4801
	T5321FD
	T5321Fun
	T783
	T9233
	T9675
	T9961
	T19695
	WWRec
------------------------- | 
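The strict-field invariant from the commit above can be pictured at the source level. This is only an analogy: the real transformation runs on STG inside GHC, and `Pair`, `mkPair`, `mkPairExplicit` and `sumPair` are hypothetical names made up for this sketch.

```haskell
-- Source-level analogy only; GHC performs the real rewrite on STG.
data Pair = MkPair !Int !Int

-- What the programmer writes: the strict fields are implicit.
mkPair :: Int -> Int -> Pair
mkPair a b = MkPair a b

-- Roughly what the invariant amounts to: each strict argument is
-- evaluated before the constructor is allocated, so the fields of the
-- resulting closure only ever hold evaluated values.
mkPairExplicit :: Int -> Int -> Pair
mkPairExplicit a b = a `seq` b `seq` MkPair a b

-- A consumer can then read the fields without checking whether they
-- still need to be evaluated.
sumPair :: Pair -> Int
sumPair (MkPair x y) = x + y
```

Both constructors behave the same for the caller; the second only spells out the evaluation that the first one implies.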
| | 
| | StgToCmm: add Config, remove CgInfoDownwards
StgToCmm: runC api change to take StgToCmmConfig
StgToCmm: CgInfoDownwards -> StgToCmmConfig
StgToCmm.Monad: update getters/setters/withers
StgToCmm: remove CallOpts in StgToCmm.Closure
StgToCmm: remove dynflag references
StgToCmm: PtrOpts removed
StgToCmm: add TMap to config, Prof - dynflags
StgToCmm: add omit yields to config
StgToCmm.ExtCode: remove redundant import
StgToCmm.Heap: remove references to dynflags
StgToCmm: codeGen api change, DynFlags -> Config
StgToCmm: remove dynflags in Env and StgToCmm
StgToCmm.DataCon: remove dynflags references
StgToCmm: remove dynflag references in DataCon
StgToCmm: add backend avx flags to config
StgToCmm.Prim: remove dynflag references
StgToCmm.Expr: remove dynflag references
StgToCmm.Bind: remove references to dynflags
StgToCmm: move DoAlignSanitisation to Cmm.Type
StgToCmm: remove PtrOpts in Cmm.Parser.y
DynFlags: update ipInitCode api
StgToCmm: Config Module is single source of truth
StgToCmm: Lazy config breaks IORef deadlock
testsuite: bump countdeps threshold
StgToCmm.Config: strictify fields except UpdFrame
Strictifying UpdFrameOffset causes the RTS build with stage1 to
deadlock. Additionally, before the deadlock, the RTS is noticeably
slower.
StgToCmm.Config: add field descriptions
StgToCmm: revert strictify on Module in config
testsuite: update CountDeps tests
StgToCmm: update comment, fix exports
Specifically update comment about loopification passed into dynflags
then stored into stgToCmmConfig. And remove getDynFlags from
Monad.hs exports
Types.Name: add pprFullName function
StgToCmm.Ticky: use pprFullName, fixup ExtCode imports
Cmm.Info: revert cmmGetClosureType removal
StgToCmm.Bind: use pprFullName, Config update comments
StgToCmm: update closureDescription api
StgToCmm: SAT altHeapCheck
StgToCmm: default render for Info table, ticky
Use default rendering contexts for info table and ticky ticky, which should be independent of command line input.
testsuite: bump count deps
pprFullName: flag for ticky vs normal style output
convertInfoProvMap: remove unused parameter
StgToCmm.Config: add backend flags to config
StgToCmm.Config: remove Backend from Config
StgToCmm.Prim: refactor Backend call sites
StgToCmm.Prim: remove redundant imports
StgToCmm.Config: refactor vec compatibility check
StgToCmm.Config: add allowQuotRem2 flag
StgToCmm.Ticky: print internal names with parens
StgToCmm.Bind: dispatch ppr based on externality
StgToCmm: Add pprTickyName, Fix ticky naming
Accidentally removed the ctx for ticky SDoc output. The only relevant flag
is sdocPprDebug, which was accidentally set to False due to using
defaultSDocContext without altering the flag.
StgToCmm: remove stateful fields in config
fixup: config: remove redundant imports
StgToCmm: move Sequel type to its own module
StgToCmm: proliferate getCallMethod updated api
StgToCmm.Monad: add FCodeState to Monad Api
StgToCmm: add second reader monad to FCode
fixup: Prim.hs: missed a merge conflict
fixup: Match countDeps tests to HEAD
StgToCmm.Monad: withState -> withCgState
To disambiguate it from mtl withState. This withState shouldn't be
returning the new state as a value. However, fixing this means tackling
the knot tying in CgState and so is very difficult, since it changes when
the thunk of the knot is forced, which leads either to deadlock or to a
compiler panic. | 
| | 
| | Fixes #20541 by making mkTyConApp do more sharing of types.
In particular, replace
* BoxedRep Lifted    ==>  LiftedRep
* BoxedRep Unlifted  ==>  UnliftedRep
* TupleRep '[]       ==>  ZeroBitRep
* TYPE ZeroBitRep    ==>  ZeroBitType
In each case, the thing on the right is a type synonym
for the thing on the left, declared in ghc-prim:GHC.Types.
See Note [Using synonyms to compress types] in GHC.Core.Type.
The synonyms for ZeroBitRep and ZeroBitType are new, but absolutely
in the same spirit as the other ones.   (These synonyms are mainly
for internal use, though the programmer can use them too.)
I also renamed GHC.Core.Ty.Rep.isVoidTy to isZeroBitTy, to be
compatible with the "zero-bit" nomenclature above.  See discussion
on !6806.
There is a tricky wrinkle: see GHC.Core.Type
  Note [Care using synonyms to compress types]
Compiler allocation decreases by up to 0.8%. | 
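The sharing idea in the commit above can be sketched with a toy smart constructor. This is not GHC's `Type` representation: the `Type` data type, the string-named constructors and the shared constants below are assumptions made purely for illustration.

```haskell
-- Toy model of a type: a constructor name applied to arguments.
data Type = TyConApp String [Type]
  deriving (Eq, Show)

-- Shared values that stand for the synonyms named in the commit.
liftedRepTy, unliftedRepTy, zeroBitRepTy, zeroBitTypeTy :: Type
liftedRepTy   = TyConApp "LiftedRep" []
unliftedRepTy = TyConApp "UnliftedRep" []
zeroBitRepTy  = TyConApp "ZeroBitRep" []
zeroBitTypeTy = TyConApp "ZeroBitType" []

-- Smart constructor: return the shared synonym for the common
-- applications instead of allocating a fresh application node.
mkTyConApp :: String -> [Type] -> Type
mkTyConApp "BoxedRep" [TyConApp "Lifted" []]   = liftedRepTy
mkTyConApp "BoxedRep" [TyConApp "Unlifted" []] = unliftedRepTy
mkTyConApp "TupleRep" [TyConApp "'[]" []]      = zeroBitRepTy
mkTyConApp "TYPE" [arg] | arg == zeroBitRepTy  = zeroBitTypeTy
mkTyConApp tc args                             = TyConApp tc args
```

Because every construction goes through `mkTyConApp`, nested cases such as `TYPE (TupleRep '[])` collapse step by step to the shared `zeroBitTypeTy` value.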
| | 
| | fixes #19628 | 
| | |  | 
| | 
| | Replace uses of WARN macro with calls to:
  warnPprTrace :: Bool -> SDoc -> a -> a
Remove the now unused HsVersions.h
Bump haddock submodule | 
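To show the shape of a function-based replacement for the WARN macro, here is a minimal sketch with the same argument order as `warnPprTrace`. It assumes a plain `String` in place of `SDoc` and uses `Debug.Trace`; `warnTrace` and `clampPercent` are hypothetical names, and GHC's real function may differ in when and how it prints.

```haskell
import Debug.Trace (trace)

-- Stand-in for GHC's SDoc so the sketch needs nothing beyond base.
type Doc = String

-- If the condition holds, emit a warning as a side effect of
-- evaluation and return the value unchanged; otherwise do nothing.
warnTrace :: Bool -> Doc -> a -> a
warnTrace cond msg x
  | cond      = trace ("WARNING: " ++ msg) x
  | otherwise = x

-- Hypothetical call site: warn about, but tolerate, out-of-range input.
clampPercent :: Int -> Int
clampPercent n =
  warnTrace (n < 0 || n > 100) ("percentage out of range: " ++ show n)
    (max 0 (min 100 n))
```

Since the warning is an ordinary function call, no CPP pass and no HsVersions.h include are needed at the call site.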
| | 
| | There is no reason to use CPP. __LINE__ and __FILE__ macros are now
better replaced with GHC's CallStack. As a bonus, assert error messages
now contain more information (function name, column).
Here is the mapping table (HasCallStack omitted):
  * ASSERT:   assert     :: Bool -> a -> a
  * MASSERT:  massert    :: Bool -> m ()
  * ASSERTM:  assertM    :: m Bool -> m ()
  * ASSERT2:  assertPpr  :: Bool -> SDoc -> a -> a
  * MASSERT2: massertPpr :: Bool -> SDoc -> m ()
  * ASSERTM2: assertPprM :: m Bool -> SDoc -> m () | 
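A minimal sketch of how a CallStack-based assertion can report its call site without `__LINE__` and `__FILE__`. It assumes `String` in place of `SDoc`; `assertMsg`, `safeDiv` and `failureMessage` are hypothetical names, not the functions added by the commit.

```haskell
import Control.Exception (ErrorCall (..), evaluate, try)
import GHC.Stack (HasCallStack, callStack, prettyCallStack)

-- Simplified stand-in for an assertPpr-style function.
-- The HasCallStack constraint makes the call site available.
assertMsg :: HasCallStack => Bool -> String -> a -> a
assertMsg cond msg x
  | cond      = x
  | otherwise =
      error ("Assertion failed: " ++ msg ++ "\n" ++ prettyCallStack callStack)

-- Hypothetical caller; its own HasCallStack constraint extends the
-- reported stack by one more frame.
safeDiv :: HasCallStack => Int -> Int -> Int
safeDiv n d = assertMsg (d /= 0) "divisor must be non-zero" (n `div` d)

-- Helper for inspecting the message of a failed assertion.
failureMessage :: a -> IO (Maybe String)
failureMessage x = do
  r <- try (evaluate x)
  pure $ case r of
    Left (ErrorCall m) -> Just m
    Right _            -> Nothing
```

Running `failureMessage (safeDiv 1 0)` yields the assertion text followed by the rendered call stack, which includes file, line and column of the call to `assertMsg`.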
| | 
| | The `-fdistinct-constructor-tables` flag will generate a fresh info
table for each usage of a data constructor. This is useful for
debugging: by inspecting the info table, you can now determine which
usage of a constructor caused an allocation. Previously the info table
always mapped to the definition site of the data constructor, which is
of little use for that purpose.
In conjunction with `-hi` and `-finfo-table-map` this gives a more fine
grained understanding of where constructor allocations arise from in a
program. | 
| | 
| | This new flag embeds a lookup table from the address of an info table
to information about that info table.
The main interface for consulting the map is the `lookupIPE` C function:
> InfoProvEnt * lookupIPE(StgInfoTable *info)
The `InfoProvEnt` has the following structure:
> typedef struct InfoProv_{
>     char * table_name;
>     char * closure_desc;
>     char * ty_desc;
>     char * label;
>     char * module;
>     char * srcloc;
> } InfoProv;
>
> typedef struct InfoProvEnt_ {
>     StgInfoTable * info;
>     InfoProv prov;
>     struct InfoProvEnt_ *link;
> } InfoProvEnt;
The source positions are approximated in a similar way to the source
positions for DWARF debugging information. They are only approximate but
in our experience provide a good enough hint about where the problem
might be. It is therefore recommended to use this flag in conjunction
with `-g<n>` for more accurate locations.
The lookup table is also emitted into the eventlog when it is available
as it is intended to be used with the `-hi` profiling mode.
Using this flag will significantly increase the size of the resulting
object file but only by a factor of 2-3x in our experience. | 
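The lookup described in the commit above can be modelled in a few lines. The real interface is the C function `lookupIPE`; this Haskell model is an assumption-laden sketch in which info table addresses are plain `Word64` values, the table is an association list, and the record fields mirror the C struct (`moduleName` replaces `module`, which is a Haskell keyword).

```haskell
import Data.Word (Word64)

-- Mirrors the fields of the C InfoProv struct.
data InfoProv = InfoProv
  { tableName   :: String
  , closureDesc :: String
  , tyDesc      :: String
  , label       :: String
  , moduleName  :: String
  , srcLoc      :: String
  } deriving (Eq, Show)

-- Toy table: info table address paired with its provenance entry.
type IPETable = [(Word64, InfoProv)]

-- Look up the provenance of an info table by its address.
lookupIPE :: Word64 -> IPETable -> Maybe InfoProv
lookupIPE = lookup

-- Hypothetical table contents, for illustration only.
exampleTable :: IPETable
exampleTable =
  [ (0x1000, InfoProv "Main_foo_info" "FUN" "Int -> Int" "foo" "Main" "Main.hs:10:5")
  , (0x2000, InfoProv "Main_bar_info" "THUNK" "[Int]" "bar" "Main" "Main.hs:14:1")
  ]
```

A heap profile that reports closures by info table address can then be mapped back to source positions with `fmap srcLoc (lookupIPE addr exampleTable)`.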
| | 
| | Co-authored-by: Rinat Stryungis <rinat.stryungis@serokell.io>
Implement GHC Proposal #387
* Parse char literals 'x' at the type level
* New built-in type families CmpChar, ConsSymbol, UnconsSymbol
* New KnownChar class (cf. KnownSymbol and KnownNat)
* New SomeChar type (cf. SomeSymbol and SomeNat)
* CharTyLit support in template-haskell
Updated submodules: binary, haddock.
Metric Decrease:
    T5205
    haddock.base
Metric Increase:
    Naperian
    T13035 | 
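A short usage sketch of the type-level character features listed above. It assumes GHC 9.2 or later (base 4.16), where `GHC.TypeLits` exports `KnownChar`, `charVal` and `ConsSymbol`; the names `letter` and `greeting` are made up for the example.

```haskell
{-# LANGUAGE DataKinds #-}

import Data.Proxy (Proxy (..))
import GHC.TypeLits (ConsSymbol, charVal, symbolVal)

-- Reflect a type-level character literal back to a term-level Char.
letter :: Char
letter = charVal (Proxy :: Proxy 'x')

-- ConsSymbol prepends a type-level Char to a type-level Symbol; the
-- result is an ordinary Symbol that symbolVal can reflect.
greeting :: String
greeting = symbolVal (Proxy :: Proxy (ConsSymbol 'h' "ello"))
```

`charVal` plays the same role for `Char` literals that `symbolVal` and `natVal` play for `Symbol` and `Nat`.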
| | 
| | Add a type parameter for the environment required by OutputableP. It
avoids tying Platform with OutputableP. | 
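One way to picture the extra type parameter is a two-parameter class in which the environment is no longer fixed to `Platform`. This is a sketch under assumptions: `SDoc` is a plain `String`, and `Platform`, `Style` and `Label` are toy types invented for the example.

```haskell
{-# LANGUAGE MultiParamTypeClasses #-}

type SDoc = String

-- Two unrelated environments a printer might need.
newtype Platform = Platform { wordSizeInBits :: Int }
data Style = Plain | Quoted

newtype Label = Label String

-- The environment is a class parameter instead of being hard-wired.
class OutputableP env a where
  pdoc :: env -> a -> SDoc

-- Printing that depends on the platform.
instance OutputableP Platform Label where
  pdoc platform (Label name) =
    name ++ "@" ++ show (wordSizeInBits platform)

-- Printing that depends on something else entirely.
instance OutputableP Style Label where
  pdoc Plain  (Label name) = name
  pdoc Quoted (Label name) = show name
```

The class itself no longer mentions `Platform`, so code that needs a different environment does not have to depend on the platform definition.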
| | 
| | Some types need a Platform value to be pretty-printed: CLabel, Cmm
types, instructions, etc.
Before this patch they had an Outputable instance and the Platform value
was obtained via sdocWithDynFlags. It meant that the *renderer* of the
SDoc was responsible for passing the appropriate Platform value (e.g. via
the DynFlags given to showSDoc).  It put the burden of passing the
Platform value on the renderer while the generator of the SDoc knows the
Platform it is generating the SDoc for and there is no point passing a
different Platform at rendering time.
With this patch, we introduce a new OutputableP class:
   class OutputableP a where
      pdoc :: Platform -> a -> SDoc
With this class we still have some polymorphism as we have with `ppr`
(i.e. we can use `pdoc` on a variety of types instead of having a
dedicated `pprXXX` function for each XXX type).
One step closer to removing `sdocWithDynFlags` (#10143) and supporting
several platforms (#14335). | 
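A self-contained sketch of the `OutputableP` idea: the generator of a document passes the platform it is generating for, rather than the renderer supplying it later. `SDoc` is modelled as `String`, and `Platform`, `OS` and `CLabel` are toy stand-ins, not GHC's types; the underscore prefix on Darwin is used only as a familiar example of platform-dependent output.

```haskell
type SDoc = String

data OS = Linux | Darwin
  deriving (Eq, Show)

newtype Platform = Platform { platformOS :: OS }

newtype CLabel = CLabel String

-- Same shape as the class in the commit message.
class OutputableP a where
  pdoc :: Platform -> a -> SDoc

-- The printed form of a label depends on the target platform.
instance OutputableP CLabel where
  pdoc platform (CLabel name)
    | platformOS platform == Darwin = '_' : name
    | otherwise                     = name

-- Lists reuse the element instance, so no dedicated pprXXX is needed.
instance OutputableP a => OutputableP [a] where
  pdoc platform xs = unwords (map (pdoc platform) xs)
```

The list instance shows the polymorphism the commit message mentions: one `pdoc` works across many types.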
| | 
| | - put panic related functions into GHC.Utils.Panic
- put trace related functions using DynFlags in GHC.Driver.Ppr
One step closer to making Outputable fully independent of DynFlags.
Bump haddock submodule | 
| | 
| | Pretty-printing CLabel relies on sdocWithDynFlags that we want to remove
(#10143, #17957). It uses it to query the backend and the platform.
This patch exposes CLabel ppr functions specialised for each backend so
that backend code can directly use them. | 
| | 
| | Platform constant wrappers took a DynFlags parameter, hence implicitly
used the target platform constants. We removed them to allow support
for several platforms at once (#14335) and to avoid having to pass
the full DynFlags to every function (#17957).
Metric Decrease:
   T4801 | 
| | 
| | SCC profiling was enabled in a convoluted way: if WayProf was enabled,
Opt_SccProfilingOn general flag was set (in
`GHC.Driver.Ways.wayGeneralFlags`), and then this flag was queried in
various places.
There is no need to go via general flags, so this patch defines a
`sccProfilingEnabled :: DynFlags -> Bool` helper function that just
checks whether WayProf is enabled. | 
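The simplification can be sketched with toy types. The `Way` constructors other than `WayProf`, the list-based `ways` field and this cut-down `DynFlags` record are assumptions for illustration and do not reflect GHC's actual definitions.

```haskell
-- Toy versions of the types involved.
data Way = WayProf | WayThreaded | WayDebug
  deriving (Eq, Show)

newtype DynFlags = DynFlags { ways :: [Way] }

-- Ask the ways directly instead of consulting a general flag that was
-- derived from them earlier.
sccProfilingEnabled :: DynFlags -> Bool
sccProfilingEnabled dflags = WayProf `elem` ways dflags
```

With this helper there is a single source of truth: whether SCC profiling is on follows from the ways alone.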
| | 
| | This updates haddock comments only.
This patch focuses on updating hyperlinks in the GHC API's haddock comments,
because broken links especially discourage newcomers.
This includes the following hierarchies:
  - GHC.Iface.*
  - GHC.Llvm.*
  - GHC.Rename.*
  - GHC.Tc.*
  - GHC.HsToCore.*
  - GHC.StgToCmm.*
  - GHC.CmmToAsm.*
  - GHC.Runtime.*
  - GHC.Unit.*
  - GHC.Utils.*
  - GHC.SysTools.* | 
| | 
| | tablesNextToCode is a platform setting and doesn't belong in DynFlags
(#17957). Doing this is also a prerequisite to fix #14335 where we deal
with two platforms (target and host) that may have different platform
settings. | 
| | 
| | It avoids using DynFlags in the Outputable instance of CLabel to check
assertions at pretty-printing time. | 
| | 
| | - Store LambdaFormInfos of exported Ids in interface files
- Use them in importing modules
This is for optimization purposes: if we know LambdaFormInfo of imported
Ids we can generate more efficient calling code, see `getCallMethod`.
Exporting (putting them in interface files or in ModDetails) and
importing (reading them from interface files) are both optional. We
don't assume known LambdaFormInfos anywhere and do not change how we
call Ids with unknown LambdaFormInfos.
Runtime, allocation, and residency numbers when building
Cabal-the-library (commit 0d4ee7ba3):
(Log and .hp files are in the MR: !2842)
Runtime (min:sec):
|     | GHC HEAD | This patch | Diff           |
|-----|----------|------------|----------------|
| -O0 |  0:35.89 |    0:34.10 | -1.78s, -4.98% |
| -O1 |  2:24.01 |    2:23.62 | -0.39s, -0.27% |
| -O2 |  2:52.23 |    2:51.35 | -0.88s, -0.51% |
Allocations (bytes):
|     | GHC HEAD        | This patch      | Diff                       |
|-----|-----------------|-----------------|----------------------------|
| -O0 |  54,843,608,416 |  54,878,769,544 |  +35,161,128 bytes, +0.06% |
| -O1 | 227,136,076,400 | 227,569,045,168 | +432,968,768 bytes, +0.19% |
| -O2 | 266,147,063,296 | 266,749,643,440 | +602,580,144 bytes, +0.22% |
NOTE: Residency is measured with extra runtime args: `-i0 -h` which effectively
turn all GCs into major GCs, and do GC more often.
|     | GHC HEAD                   | This patch                   | Diff                       |
|-----|----------------------------|------------------------------|----------------------------|
| -O0 | 410,284,000 (910 samples)  | 411,745,008 (906 samples)    | +1,461,008 bytes, +0.35%   |
| -O1 | 928,580,856 (2109 samples) | 943,506,552 (2103 samples)   | +14,925,696 bytes, +1.60%  |
| -O2 | 993,951,352 (2549 samples) | 1,010,156,328 (2545 samples) | +16,204,976 bytes, +1.63%  |
NoFib results:
--------------------------------------------------------------------------------
        Program           Size    Allocs    Instrs     Reads    Writes
--------------------------------------------------------------------------------
             CS           0.0%      0.0%     +0.0%     +0.0%     +0.0%
            CSD           0.0%      0.0%      0.0%     +0.0%     +0.0%
             FS           0.0%      0.0%     +0.0%     +0.0%     +0.0%
              S           0.0%      0.0%     +0.0%     +0.0%     +0.0%
             VS           0.0%      0.0%     +0.0%     +0.0%     +0.0%
            VSD           0.0%      0.0%     +0.0%     +0.0%     +0.1%
            VSM           0.0%      0.0%     +0.0%     +0.0%     +0.0%
           anna           0.0%      0.0%     -0.3%     -0.8%     -0.0%
           ansi           0.0%      0.0%     -0.0%     -0.0%      0.0%
           atom           0.0%      0.0%     -0.0%     -0.0%      0.0%
         awards           0.0%      0.0%     -0.1%     -0.3%      0.0%
         banner           0.0%      0.0%     -0.0%     -0.0%     -0.0%
     bernouilli           0.0%      0.0%     -0.0%     -0.0%     -0.0%
   binary-trees           0.0%      0.0%     -0.0%     -0.0%     +0.0%
          boyer           0.0%      0.0%     -0.0%     -0.0%      0.0%
         boyer2           0.0%      0.0%     -0.0%     -0.0%      0.0%
           bspt           0.0%      0.0%     -0.0%     -0.2%      0.0%
      cacheprof           0.0%      0.0%     -0.1%     -0.4%     +0.0%
       calendar           0.0%      0.0%     -0.0%     -0.0%      0.0%
       cichelli           0.0%      0.0%     -0.9%     -2.4%      0.0%
        circsim           0.0%      0.0%     -0.0%     -0.0%      0.0%
       clausify           0.0%      0.0%     -0.1%     -0.3%      0.0%
  comp_lab_zift           0.0%      0.0%     -0.0%     -0.0%     +0.0%
       compress           0.0%      0.0%     -0.0%     -0.0%     -0.0%
      compress2           0.0%      0.0%     -0.0%     -0.0%      0.0%
    constraints           0.0%      0.0%     -0.1%     -0.2%     -0.0%
   cryptarithm1           0.0%      0.0%     -0.0%     -0.0%      0.0%
   cryptarithm2           0.0%      0.0%     -1.4%     -4.1%     -0.0%
            cse           0.0%      0.0%     -0.0%     -0.0%     -0.0%
   digits-of-e1           0.0%      0.0%     -0.0%     -0.0%     -0.0%
   digits-of-e2           0.0%      0.0%     -0.0%     -0.0%     -0.0%
         dom-lt           0.0%      0.0%     -0.1%     -0.2%      0.0%
          eliza           0.0%      0.0%     -0.5%     -1.5%      0.0%
          event           0.0%      0.0%     -0.0%     -0.0%     -0.0%
    exact-reals           0.0%      0.0%     -0.1%     -0.3%     +0.0%
         exp3_8           0.0%      0.0%     -0.0%     -0.0%     -0.0%
         expert           0.0%      0.0%     -0.3%     -1.0%     -0.0%
 fannkuch-redux           0.0%      0.0%     +0.0%     +0.0%     +0.0%
          fasta           0.0%      0.0%     -0.0%     -0.0%     +0.0%
            fem           0.0%      0.0%     -0.0%     -0.0%      0.0%
            fft           0.0%      0.0%     -0.0%     -0.0%      0.0%
           fft2           0.0%      0.0%     -0.0%     -0.0%      0.0%
       fibheaps           0.0%      0.0%     -0.0%     -0.0%     +0.0%
           fish           0.0%      0.0%      0.0%     -0.0%     +0.0%
          fluid           0.0%      0.0%     -0.4%     -1.2%     +0.0%
         fulsom           0.0%      0.0%     -0.0%     -0.0%      0.0%
         gamteb           0.0%      0.0%     -0.1%     -0.3%      0.0%
            gcd           0.0%      0.0%     -0.0%     -0.0%      0.0%
    gen_regexps           0.0%      0.0%     -0.0%     -0.0%     -0.0%
         genfft           0.0%      0.0%     -0.0%     -0.0%      0.0%
             gg           0.0%      0.0%     -0.0%     -0.0%     +0.0%
           grep           0.0%      0.0%     -0.0%     -0.0%     -0.0%
         hidden           0.0%      0.0%     -0.1%     -0.4%     -0.0%
            hpg           0.0%      0.0%     -0.2%     -0.5%     +0.0%
            ida           0.0%      0.0%     -0.0%     -0.0%     +0.0%
          infer           0.0%      0.0%     -0.3%     -0.8%     -0.0%
        integer           0.0%      0.0%     -0.0%     -0.0%     +0.0%
      integrate           0.0%      0.0%     -0.0%     -0.0%      0.0%
   k-nucleotide           0.0%      0.0%     -0.0%     -0.0%     +0.0%
          kahan           0.0%      0.0%     -0.0%     -0.0%     +0.0%
        knights           0.0%      0.0%     -2.2%     -5.4%      0.0%
         lambda           0.0%      0.0%     -0.6%     -1.8%      0.0%
     last-piece           0.0%      0.0%     -0.0%     -0.0%      0.0%
           lcss           0.0%      0.0%     -0.0%     -0.1%      0.0%
           life           0.0%      0.0%     -0.0%     -0.1%      0.0%
           lift           0.0%      0.0%     -0.2%     -0.6%     +0.0%
         linear           0.0%      0.0%     -0.0%     -0.0%     -0.0%
      listcompr           0.0%      0.0%     -0.0%     -0.0%      0.0%
       listcopy           0.0%      0.0%     -0.0%     -0.0%      0.0%
       maillist           0.0%      0.0%     -0.1%     -0.3%     +0.0%
         mandel           0.0%      0.0%     -0.0%     -0.0%      0.0%
        mandel2           0.0%      0.0%     -0.0%     -0.0%     -0.0%
           mate          +0.0%      0.0%     -0.0%     -0.0%     -0.0%
        minimax           0.0%      0.0%     -0.2%     -1.0%      0.0%
        mkhprog           0.0%      0.0%     -0.1%     -0.2%     -0.0%
     multiplier           0.0%      0.0%     -0.0%     -0.0%     -0.0%
         n-body           0.0%      0.0%     -0.0%     -0.0%     +0.0%
       nucleic2           0.0%      0.0%     -0.1%     -0.2%      0.0%
           para           0.0%      0.0%     -0.0%     -0.0%     -0.0%
      paraffins           0.0%      0.0%     -0.0%     -0.0%      0.0%
         parser           0.0%      0.0%     -0.2%     -0.7%      0.0%
        parstof           0.0%      0.0%     -0.0%     -0.0%     +0.0%
            pic           0.0%      0.0%     -0.0%     -0.0%      0.0%
       pidigits           0.0%      0.0%     +0.0%     +0.0%     +0.0%
          power           0.0%      0.0%     -0.2%     -0.6%     +0.0%
         pretty           0.0%      0.0%     -0.0%     -0.0%     -0.0%
         primes           0.0%      0.0%     -0.0%     -0.0%      0.0%
      primetest           0.0%      0.0%     -0.0%     -0.0%     -0.0%
         prolog           0.0%      0.0%     -0.3%     -1.1%      0.0%
         puzzle           0.0%      0.0%     -0.0%     -0.0%      0.0%
         queens           0.0%      0.0%     -0.0%     -0.0%     +0.0%
        reptile           0.0%      0.0%     -0.0%     -0.0%      0.0%
reverse-complem           0.0%      0.0%     -0.0%     -0.0%     +0.0%
        rewrite           0.0%      0.0%     -0.7%     -2.5%     -0.0%
           rfib           0.0%      0.0%     -0.0%     -0.0%      0.0%
            rsa           0.0%      0.0%     -0.0%     -0.0%      0.0%
            scc           0.0%      0.0%     -0.1%     -0.2%     -0.0%
          sched           0.0%      0.0%     -0.0%     -0.0%     -0.0%
            scs           0.0%      0.0%     -1.0%     -2.6%     +0.0%
         simple           0.0%      0.0%     +0.0%     -0.0%     +0.0%
          solid           0.0%      0.0%     -0.0%     -0.0%      0.0%
        sorting           0.0%      0.0%     -0.6%     -1.6%      0.0%
  spectral-norm           0.0%      0.0%     +0.0%      0.0%     +0.0%
         sphere           0.0%      0.0%     -0.0%     -0.0%     -0.0%
         symalg           0.0%      0.0%     -0.0%     -0.0%     +0.0%
            tak           0.0%      0.0%     -0.0%     -0.0%      0.0%
      transform           0.0%      0.0%     -0.0%     -0.0%      0.0%
       treejoin           0.0%      0.0%     -0.0%     -0.0%      0.0%
      typecheck           0.0%      0.0%     -0.0%     -0.0%     +0.0%
        veritas          +0.0%      0.0%     -0.2%     -0.4%     +0.0%
           wang           0.0%      0.0%     -0.0%     -0.0%      0.0%
      wave4main           0.0%      0.0%     -0.0%     -0.0%     -0.0%
   wheel-sieve1           0.0%      0.0%     -0.0%     -0.0%     -0.0%
   wheel-sieve2           0.0%      0.0%     -0.0%     -0.0%     +0.0%
           x2n1           0.0%      0.0%     -0.0%     -0.0%     -0.0%
--------------------------------------------------------------------------------
            Min           0.0%      0.0%     -2.2%     -5.4%     -0.0%
            Max          +0.0%      0.0%     +0.0%     +0.0%     +0.1%
 Geometric Mean          -0.0%     -0.0%     -0.1%     -0.3%     +0.0%
Metric increases in micro benchmarks are tracked in #17686:
Metric Increase:
    T12150
    T12234
    T12425
    T13035
    T5837
    T6048
    T9233
Co-authored-by: Andreas Klebinger <klebinger.andreas@gmx.at> | 
| | 
| | This updates comments only.
This patch replaces leaf module names according to new module
hierarchy [1][2] as follows:
* Expand leaf names to easily find the module path:
  for instance, `Id.hs` to `GHC.Types.Id`.
* Modify leaf names according to new module hierarchy:
  for instance, `Convert.hs` to `GHC.ThToHs`.
* Fix typo:
  for instance, `GHC.Core.TyCo.Rep.hs` to `GHC.Core.TyCo.Rep`
See also !3375
[1]: https://gitlab.haskell.org/ghc/ghc/-/wikis/Make-GHC-codebase-more-modular
[2]: https://gitlab.haskell.org/ghc/ghc/issues/13009 | 
| | 
| | The field is only used in withNewTickyCounterFun and it's easier to
directly pass a parameter for one-shot info to withNewTickyCounterFun
instead of passing it via LFReEntrant. This also makes !2842 simpler.
Other changes:
- New Note (by SPJ) [OneShotInfo overview] added.
- Arity argument of thunkCode removed as it's always 0. | 
| | 
| | Update Haddock submodule
Metric Increase:
   haddock.compiler | 
| | 
| | Update Haddock submodule | 
| | 
| | Update Haddock submodule
Metric Increase:
   haddock.compiler | 
| | 
| | Update submodule: haddock | 
| | 
| | submodule updates: nofib, haddock | 
| | 
| | Update haddock submodule | 
| | |  | 
| | |  | 
| | 
| | Formerly we punted on these and evaluated constructors always got a tag
of 1.
We now cascade switches because we have to check the tag first and when
it is MAX_PTR_TAG then get the precise tag from the info table and
switch on that. The only technically tricky part is that the default
case needs (logical) duplication. To do this we emit an extra label for
it and branch to that from the second switch. This avoids duplicated
codegen.
Here's a simple example of the new code gen:
    data D = D1 | D2 | D3 | D4 | D5 | D6 | D7 | D8
On a 64-bit system previously all constructors would be tagged 1. With
the new code gen D7 and D8 are tagged 7:
    [Lib.D7_con_entry() {
         ...
         {offset
           c1eu: // global
               R1 = R1 + 7;
               call (P64[Sp])(R1) args: 8, res: 0, upd: 8;
         }
     }]
    [Lib.D8_con_entry() {
         ...
         {offset
           c1ez: // global
               R1 = R1 + 7;
               call (P64[Sp])(R1) args: 8, res: 0, upd: 8;
         }
     }]
When switching we now look at the info table only when the tag is 7. For
example, if we derive Enum for the type above, the Cmm looks like this:
    c2Le:
        _s2Js::P64 = R1;
        _c2Lq::P64 = _s2Js::P64 & 7;
        switch [1 .. 7] _c2Lq::P64 {
            case 1 : goto c2Lk;
            case 2 : goto c2Ll;
            case 3 : goto c2Lm;
            case 4 : goto c2Ln;
            case 5 : goto c2Lo;
            case 6 : goto c2Lp;
            case 7 : goto c2Lj;
        }
    // Read info table for tag
    c2Lj:
        _c2Lv::I64 = %MO_UU_Conv_W32_W64(I32[I64[_s2Js::P64 & (-8)] - 4]);
        if (_c2Lv::I64 != 6) goto c2Lu; else goto c2Lt;
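The cascade shown in the Cmm above can be modelled in a few lines of Haskell. This is a toy model under assumptions, not the code generator: addresses are plain `Int`s, constructor indices are 0-based as in the info table, and `maxPtrTag`, `tagBits`, `pointerTag` and `conIndexFromTag` are hypothetical names.

```haskell
import Data.Bits ((.&.))

-- On a 64-bit system the three low pointer bits hold tags 1..7.
maxPtrTag :: Int
maxPtrTag = 7

-- Extract the tag bits from a tagged address (the `& 7` in the Cmm).
tagBits :: Int -> Int
tagBits addr = addr .&. 7

-- Tag stored in the pointer for the constructor with the given 0-based
-- index: precise for the first six, saturated at 7 for the rest.
pointerTag :: Int -> Int
pointerTag conIndex = min (conIndex + 1) maxPtrTag

-- Recover the constructor index. The info table value is only
-- consulted when the tag is saturated; laziness means it is not even
-- evaluated otherwise.
conIndexFromTag :: Int -> Int -> Int
conIndexFromTag tag infoTableIndex
  | tag < maxPtrTag = tag - 1
  | otherwise       = infoTableIndex
```

For the eight-constructor type `D`, D1 to D6 are identified by the tag alone, while D7 and D8 both carry tag 7 and are told apart by the info table value (6 or 7), matching the `!= 6` test in the Cmm.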
Generated Cmm sizes do not change too much, but binaries are very
slightly larger, due to the fact that the new instructions are longer in
encoded form. E.g. previously entry code for D8 above would be
    00000000000001c0 <Lib_D8_con_info>:
     1c0:	48 ff c3             	inc    %rbx
     1c3:	ff 65 00             	jmpq   *0x0(%rbp)
With this patch
    00000000000001d0 <Lib_D8_con_info>:
     1d0:	48 83 c3 07          	add    $0x7,%rbx
     1d4:	ff 65 00             	jmpq   *0x0(%rbp)
This is one byte longer.
Secondly, reading info table directly and then switching is shorter
    _c1co:
            movq -1(%rbx),%rax
            movl -4(%rax),%eax
            // Switch on info table tag
            jmp *_n1d5(,%rax,8)
than doing the same switch, and then for the tag 7 doing another switch:
    // When tag is 7
    _c1ct:
            andq $-8,%rbx
            movq (%rbx),%rax
            movl -4(%rax),%eax
            // Switch on info table tag
            ...
Some changes of binary sizes in actual programs:
- In NoFib the worst case is 0.1% increase in benchmark "parser" (see
  NoFib results below). All programs get slightly larger.
- Stage 2 compiler size does not change.
- In "containers" (the library) size of all object files increases
  0.0005%. Size of the test program "bitqueue-properties" increases
  0.03%.
nofib benchmarks kindly provided by Ömer (@osa1):
NoFib Results
=============
--------------------------------------------------------------------------------
        Program           Size    Allocs    Instrs     Reads    Writes
--------------------------------------------------------------------------------
             CS          +0.0%      0.0%     -0.0%     -0.0%     -0.0%
            CSD          +0.0%      0.0%      0.0%     +0.0%     +0.0%
             FS          +0.0%      0.0%      0.0%     +0.0%      0.0%
              S          +0.0%      0.0%     -0.0%      0.0%      0.0%
             VS          +0.0%      0.0%     -0.0%     +0.0%     +0.0%
            VSD          +0.0%      0.0%     -0.0%     +0.0%     -0.0%
            VSM          +0.0%      0.0%      0.0%      0.0%      0.0%
           anna          +0.0%      0.0%     +0.1%     -0.9%     -0.0%
           ansi          +0.0%      0.0%     -0.0%     +0.0%     +0.0%
           atom          +0.0%      0.0%      0.0%      0.0%      0.0%
         awards          +0.0%      0.0%     -0.0%     +0.0%      0.0%
         banner          +0.0%      0.0%     -0.0%     +0.0%      0.0%
     bernouilli          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
   binary-trees          +0.0%      0.0%     -0.0%     -0.0%     -0.0%
          boyer          +0.0%      0.0%     +0.0%      0.0%     -0.0%
         boyer2          +0.0%      0.0%     +0.0%      0.0%     -0.0%
           bspt          +0.0%      0.0%     +0.0%     +0.0%      0.0%
      cacheprof          +0.0%      0.0%     +0.1%     -0.8%      0.0%
       calendar          +0.0%      0.0%     -0.0%     +0.0%     -0.0%
       cichelli          +0.0%      0.0%     +0.0%      0.0%      0.0%
        circsim          +0.0%      0.0%     -0.0%     -0.1%     -0.0%
       clausify          +0.0%      0.0%     +0.0%     +0.0%      0.0%
  comp_lab_zift          +0.0%      0.0%     +0.0%      0.0%     -0.0%
       compress          +0.0%      0.0%     +0.0%     +0.0%      0.0%
      compress2          +0.0%      0.0%      0.0%      0.0%      0.0%
    constraints          +0.0%      0.0%     -0.0%     -0.0%     -0.0%
   cryptarithm1          +0.0%      0.0%     +0.0%      0.0%      0.0%
   cryptarithm2          +0.0%      0.0%     +0.0%     -0.0%      0.0%
            cse          +0.0%      0.0%     +0.0%     +0.0%      0.0%
   digits-of-e1          +0.0%      0.0%     -0.0%     -0.0%     -0.0%
   digits-of-e2          +0.0%      0.0%     +0.0%     -0.0%     -0.0%
         dom-lt          +0.0%      0.0%     +0.0%     +0.0%      0.0%
          eliza          +0.0%      0.0%     -0.0%     +0.0%      0.0%
          event          +0.0%      0.0%     -0.0%     -0.0%     -0.0%
    exact-reals          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
         exp3_8          +0.0%      0.0%     -0.0%     -0.0%     -0.0%
         expert          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
 fannkuch-redux          +0.0%      0.0%     +0.0%      0.0%      0.0%
          fasta          +0.0%      0.0%     -0.0%     -0.0%     -0.0%
            fem          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
            fft          +0.0%      0.0%     +0.0%     -0.0%     -0.0%
           fft2          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
       fibheaps          +0.0%      0.0%     +0.0%     +0.0%      0.0%
           fish          +0.0%      0.0%     +0.0%     +0.0%      0.0%
          fluid          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
         fulsom          +0.0%      0.0%     +0.0%     -0.0%     +0.0%
         gamteb          +0.0%      0.0%     +0.0%     -0.0%     -0.0%
            gcd          +0.0%      0.0%     +0.0%     +0.0%      0.0%
    gen_regexps          +0.0%      0.0%     +0.0%     -0.0%     -0.0%
         genfft          +0.0%      0.0%     -0.0%     -0.0%     -0.0%
             gg          +0.0%      0.0%      0.0%     -0.0%      0.0%
           grep          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
         hidden          +0.0%      0.0%     +0.0%     -0.0%     -0.0%
            hpg          +0.0%      0.0%     +0.0%     -0.1%     -0.0%
            ida          +0.0%      0.0%     +0.0%     -0.0%     -0.0%
          infer          +0.0%      0.0%     -0.0%     -0.0%     -0.0%
        integer          +0.0%      0.0%     -0.0%     -0.0%     -0.0%
      integrate          +0.0%      0.0%      0.0%     +0.0%      0.0%
   k-nucleotide          +0.0%      0.0%     -0.0%     -0.0%     -0.0%
          kahan          +0.0%      0.0%     -0.0%     -0.0%     -0.0%
        knights          +0.0%      0.0%     +0.0%     -0.0%     -0.0%
         lambda          +0.0%      0.0%     +1.2%     -6.1%     -0.0%
     last-piece          +0.0%      0.0%     +0.0%     -0.0%     -0.0%
           lcss          +0.0%      0.0%     +0.0%     -0.0%     -0.0%
           life          +0.0%      0.0%     +0.0%     -0.0%     -0.0%
           lift          +0.0%      0.0%     +0.0%     +0.0%      0.0%
         linear          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
      listcompr          +0.0%      0.0%     -0.0%     -0.0%     -0.0%
       listcopy          +0.0%      0.0%     -0.0%     -0.0%     -0.0%
       maillist          +0.0%      0.0%     +0.0%     -0.0%     -0.0%
         mandel          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
        mandel2          +0.0%      0.0%     +0.0%     +0.0%     -0.0%
           mate          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
        minimax          +0.0%      0.0%     -0.0%     +0.0%     -0.0%
        mkhprog          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
     multiplier          +0.0%      0.0%      0.0%     +0.0%     -0.0%
         n-body          +0.0%      0.0%     +0.0%     -0.0%     -0.0%
       nucleic2          +0.0%      0.0%     +0.0%     +0.0%     -0.0%
           para          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
      paraffins          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
         parser          +0.1%      0.0%     +0.4%     -1.7%     -0.0%
        parstof          +0.0%      0.0%     -0.0%     -0.0%     -0.0%
            pic          +0.0%      0.0%     +0.0%      0.0%     -0.0%
       pidigits          +0.0%      0.0%     -0.0%     -0.0%     -0.0%
          power          +0.0%      0.0%     +0.0%     -0.0%     -0.0%
         pretty          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
         primes          +0.0%      0.0%     +0.0%      0.0%      0.0%
      primetest          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
         prolog          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
         puzzle          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
         queens          +0.0%      0.0%      0.0%     +0.0%     +0.0%
        reptile          +0.0%      0.0%     +0.0%     +0.0%      0.0%
reverse-complem          +0.0%      0.0%     -0.0%     -0.0%     -0.0%
        rewrite          +0.0%      0.0%     +0.0%      0.0%     -0.0%
           rfib          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
            rsa          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
            scc          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
          sched          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
            scs          +0.0%      0.0%     +0.0%     +0.0%      0.0%
         simple          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
          solid          +0.0%      0.0%     +0.0%     +0.0%      0.0%
        sorting          +0.0%      0.0%     +0.0%     -0.0%      0.0%
  spectral-norm          +0.0%      0.0%     -0.0%     -0.0%     -0.0%
         sphere          +0.0%      0.0%     +0.0%     -1.0%      0.0%
         symalg          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
            tak          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
      transform          +0.0%      0.0%     +0.4%     -1.3%     +0.0%
       treejoin          +0.0%      0.0%     +0.0%     -0.0%      0.0%
      typecheck          +0.0%      0.0%     -0.0%     +0.0%      0.0%
        veritas          +0.0%      0.0%     +0.0%     -0.1%     +0.0%
           wang          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
      wave4main          +0.0%      0.0%     +0.0%      0.0%     -0.0%
   wheel-sieve1          +0.0%      0.0%     +0.0%     +0.0%     +0.0%
   wheel-sieve2          +0.0%      0.0%     +0.0%     +0.0%      0.0%
           x2n1          +0.0%      0.0%     +0.0%     +0.0%      0.0%
--------------------------------------------------------------------------------
            Min          +0.0%      0.0%     -0.0%     -6.1%     -0.0%
            Max          +0.1%      0.0%     +1.2%     +0.0%     +0.0%
 Geometric Mean          +0.0%     -0.0%     +0.0%     -0.1%     -0.0%
NoFib GC Results
================
--------------------------------------------------------------------------------
        Program           Size    Allocs    Instrs     Reads    Writes
--------------------------------------------------------------------------------
        circsim          +0.0%      0.0%     -0.0%     -0.0%     -0.0%
    constraints          +0.0%      0.0%     -0.0%      0.0%     -0.0%
       fibheaps          +0.0%      0.0%      0.0%     -0.0%     -0.0%
         fulsom          +0.0%      0.0%      0.0%     -0.6%     -0.0%
       gc_bench          +0.0%      0.0%      0.0%      0.0%     -0.0%
           hash          +0.0%      0.0%     -0.0%     -0.0%     -0.0%
           lcss          +0.0%      0.0%      0.0%     -0.0%      0.0%
      mutstore1          +0.0%      0.0%      0.0%     -0.0%     -0.0%
      mutstore2          +0.0%      0.0%     +0.0%     -0.0%     -0.0%
          power          +0.0%      0.0%     -0.0%      0.0%     -0.0%
     spellcheck          +0.0%      0.0%     -0.0%     -0.0%     -0.0%
--------------------------------------------------------------------------------
            Min          +0.0%      0.0%     -0.0%     -0.6%     -0.0%
            Max          +0.0%      0.0%     +0.0%      0.0%      0.0%
 Geometric Mean          +0.0%     +0.0%     +0.0%     -0.1%     +0.0%
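For readers interpreting the "Geometric Mean" rows above, here is a minimal sketch of how such a summary can be derived from per-program percentage changes. It assumes the summary is the geometric mean of the new/old ratios, converted back to a percentage. The function name and the sample inputs are hypothetical, and the exact method used by the tool that produced these tables is not stated in this message.

```python
import math


def geometric_mean_change(percent_changes):
    """Geometric-mean change (in percent) of a list of percentage changes.

    Assumption: each entry is a relative change such as -6.1 (meaning -6.1%),
    and the summary is the geometric mean of the corresponding new/old ratios.
    """
    if not percent_changes:
        raise ValueError("need at least one value")
    # Turn each percentage change into a ratio new/old, e.g. -6.1% -> 0.939.
    ratios = [1.0 + p / 100.0 for p in percent_changes]
    # Average in log space; this equals the n-th root of the product of ratios.
    mean_log = sum(math.log(r) for r in ratios) / len(ratios)
    # Convert the mean ratio back into a percentage change.
    return (math.exp(mean_log) - 1.0) * 100.0


# Hypothetical inputs (not taken from the tables): a doubling and a halving
# cancel out under a geometric mean, unlike under an arithmetic mean.
print(geometric_mean_change([100.0, -50.0]))
```

Because the mean is taken over ratios, one large outlier such as the -6.1% in "lambda" moves the summary much less than the Min and Max rows might suggest.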
Fixes #14373
The performance regressions listed below appear to be a fluke in CI. See the
discussion in !1742 for details.
Metric Increase:
    T6048
    T12234
    T12425
    Naperian
    T12150
    T5837
    T13035 | 
| | - Remove unneeded ones
 - Use <..> for inter-package.
   Besides general clean up, helps distinguish between the RTS we link
   against vs the RTS we compile for. | 
|  | Add StgToCmm module hierarchy. Platform modules that are used in several
other places (NCG, LLVM codegen, Cmm transformations) are put into
GHC.Platform. |