|  | Commit message (Collapse) | Author | Age | Files | Lines | 
|---|
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Summary:
Currently, an AP thunk like `sat = f a b c` will not have its own entry
point and info pointer and will instead reuse a generic apply thunk
like `stg_ap_4_upd`.
That's great from a code size perspective, but if `f` is a known
function, a specialised entry point with a plain call can be much faster
than figuring out the arity and doing dynamic dispatch.
This looks at `f`s arity to figure out if it is a known function and if so, it
will not lower it to a generic apply function.
Benchmark results are encouraging: No changes to allocation, but 0.2% less
counted instructions.
Test Plan: Validates locally
Reviewers: simonmar, osa1, simonpj, bgamari
Reviewed By: simonpj
Subscribers: rwbarton, carter
GHC Trac Issues: #16005
Differential Revision: https://phabricator.haskell.org/D5414 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Summary:
This implements a selective lambda-lifting pass late in the STG
pipeline.
Lambda lifting has the effect of avoiding closure allocation at the cost
of having to make former free vars available at call sites, possibly
enlarging closures surrounding call sites in turn.
We identify beneficial cases by means of an analysis that estimates
closure growth.
There's a Wiki page at
https://ghc.haskell.org/trac/ghc/wiki/LateLamLift.
Reviewers: simonpj, bgamari, simonmar
Reviewed By: simonpj
Subscribers: rwbarton, carter
GHC Trac Issues: #9476
Differential Revision: https://phabricator.haskell.org/D5224 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Summary:
Currently, `CoreToStg` annotates `StgRhsClosure`s with their set of non-global
free variables.  This free variable information is only needed in the final
code generation step (i.e. `StgCmm.codeGen`), which leads to transformations
such as `StgCse` and `StgUnarise` having to maintain this information.
This is tiresome and unnecessary, so this patch introduces a trees-to-grow-like
approach that only introduces the free variable set into the syntax tree in the
code gen pass, along with a free variable analysis on STG terms to generate
that information.
Fixes #15754.
Reviewers: simonpj, osa1, bgamari, simonmar
Reviewed By: osa1
Subscribers: rwbarton, carter
GHC Trac Issues: #15754
Differential Revision: https://phabricator.haskell.org/D5324 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | - The StgBinderInfo type was never used in the code gen, so the type, related
  computation in CoreToStg, and some comments about it are removed. See #15770
  for more details.
- Simplified CoreToStg after removing the StgBinderInfo computation: removed
  StgBinderInfo arguments and mfix stuff.
The StgBinderInfo values were not used in the code gen, but I still run nofib
just to make sure: 0.0% change in allocations and binary sizes.
Test Plan: Validated locally
Reviewers: simonpj, simonmar, bgamari, sgraf
Reviewed By: sgraf
Subscribers: AndreasK, sgraf, rwbarton, carter
Differential Revision: https://phabricator.haskell.org/D5232 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Summary:
The idea here is to save a little code size and some work in the GC,
by collapsing FUN_STATIC closures and their SRTs.
This is (4) in a series; see D4632 for more details.
There's a tradeoff here: more complexity in the compiler in exchange
for a modest code size reduction (probably around 0.5%).
Results:
* GHC binary itself (statically linked) is 1% smaller
* -0.2% binary sizes in nofib (-0.5% module sizes)
Full nofib results comparing D4634 with this: P177 (ignore runtimes,
these aren't stable on my laptop)
Test Plan: validate, nofib
Reviewers: bgamari, niteria, simonpj, erikd
Subscribers: thomie, carter
Differential Revision: https://phabricator.haskell.org/D4637 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | This removes a bunch of unnecessary includes of `HsVersions.h` along
with unnecessary CPP (e.g., due to checking for DEBUG which can be
achieved by looking at `debugIsOn`)
Signed-off-by: Michal Terepeta <michal.terepeta@gmail.com>
Test Plan: ./validate
Reviewers: bgamari, simonmar
Reviewed By: bgamari
Subscribers: rwbarton, thomie, carter
Differential Revision: https://phabricator.haskell.org/D4462 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | This was broken by D3746 and/or D3809, but unfortunately we didn't
notice because CI at the time wasn't building the profiling way.
Test Plan:
```
cd testsuite/test/profiling/should_run
make WAY=ghci-ext-prof
```
Reviewers: bgamari, michalt, hvr, erikd
Subscribers: rwbarton, thomie, carter
GHC Trac Issues: #14705
Differential Revision: https://phabricator.haskell.org/D4437 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Test Plan: validate
Reviewers: bgamari, erikd
Reviewed By: bgamari
Subscribers: rwbarton, thomie, carter
Differential Revision: https://phabricator.haskell.org/D4380 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | This is another step for fixing #13825 and is based on D38 by Simon
Marlow.
The change allows storing multiple constructor fields within the same
word. This currently applies only to `Float`s, e.g.,
```
data Foo = Foo {-# UNPACK #-} !Float {-# UNPACK #-} !Float
```
on 64-bit arch, will now store both fields within the same constructor
word. For `WordX/IntX` we'll need to introduce new primop types.
Main changes:
- We now use sizes in bytes when we compute the offsets for
  constructor fields in `StgCmmLayout` and introduce padding if
  necessary (word-sized fields are still word-aligned)
- `ByteCodeGen` had to be updated to correctly construct the data
  types. This required some new bytecode instructions to allow pushing
  things that are not full words onto the stack (and updating
  `Interpreter.c`). Note that we only use the packed stuff when
  constructing data types (i.e., for `PACK`), in all other cases the
  behavior should not change.
- `RtClosureInspect` was changed to handle the new layout when
  extracting subterms. This seems to be used by things like `:print`.
  I've also added a test for this.
- I deviated slightly from Simon's approach and use `PrimRep` instead
  of `ArgRep` for computing the size of fields.  This seemed more
  natural and in the future we'll probably want to introduce new
  primitive types (e.g., `Int8#`) and `PrimRep` seems like a better
  place to do that (where we already have `Int64Rep` for example).
  `ArgRep` on the other hand seems to be more focused on calling
  functions.
Signed-off-by: Michal Terepeta <michal.terepeta@gmail.com>
Test Plan: ./validate
Reviewers: bgamari, simonmar, austin, hvr, goldfire, erikd
Reviewed By: bgamari
Subscribers: maoe, rwbarton, thomie
GHC Trac Issues: #13825
Differential Revision: https://phabricator.haskell.org/D3809 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | This switches the compiler/ component to get compiled with
-XNoImplicitPrelude and a `import GhcPrelude` is inserted in all
modules.
This is motivated by the upcoming "Prelude" re-export of
`Semigroup((<>))` which would cause lots of name clashes in every
modulewhich imports also `Outputable`
Reviewers: austin, goldfire, bgamari, alanz, simonmar
Reviewed By: bgamari
Subscribers: goldfire, rwbarton, thomie, mpickering, bgamari
Differential Revision: https://phabricator.haskell.org/D3989 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | This seems like a clearer name and the fewer functions that
one needs to remember, the better.
Test Plan: validate
Reviewers: austin, simonmar, michalt
Reviewed By: simonmar, michalt
Subscribers: thomie
Differential Revision: https://phabricator.haskell.org/D2735 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | New unarise (714bebf) eliminates void binders in patterns already, so no
need to eliminate them here. I leave assertions to make sure this is the
case.
Assertion failure -> bug in unarise
Reviewers: bgamari, simonpj, austin, simonmar, hvr
Reviewed By: simonpj
Subscribers: thomie
Differential Revision: https://phabricator.haskell.org/D2416 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | The idea behind adding special "rubbish" arguments was in unboxed sum types
depending on the tag some arguments are not used and we don't want to move some
special values (like 0 for literals and some special pointer for boxed slots)
for those arguments (to stack locations or registers). "StgRubbishArg" was an
indicator to the code generator that the value won't be used. During Stg-to-Cmm
we were then not generating any move or store instructions at all.
This caused problems in the register allocator because some variables were only
initialized in some code paths. As an example, suppose we have this STG: (after
unarise)
    Lib.$WT =
        \r [dt_sit]
            case
                case dt_sit of {
                  Lib.F dt_siv [Occ=Once] ->
                      (#,,#) [1# dt_siv StgRubbishArg::GHC.Prim.Int#];
                  Lib.I dt_siw [Occ=Once] ->
                      (#,,#) [2# StgRubbishArg::GHC.Types.Any dt_siw];
                }
            of
            dt_six
            { (#,,#) us_giC us_giD us_giE -> Lib.T [us_giC us_giD us_giE];
            };
This basically unpacks a sum type to an unboxed sum with 3 fields, and then
moves the unboxed sum to a constructor (`Lib.T`).
This is the Cmm for the inner case expression (case expression in the scrutinee
position of the outer case):
    ciN:
        ...
        -- look at dt_sit's tag
        if (_ciT::P64 != 1) goto ciS; else goto ciR;
    ciS: -- Tag is 2, i.e. Lib.F
        _siw::I64 = I64[_siu::P64 + 6];
        _giE::I64 = _siw::I64;
        _giD::P64 = stg_RUBBISH_ENTRY_info;
        _giC::I64 = 2;
        goto ciU;
    ciR: -- Tag is 1, i.e. Lib.I
        _siv::P64 = P64[_siu::P64 + 7];
        _giD::P64 = _siv::P64;
        _giC::I64 = 1;
        goto ciU;
Here one of the blocks `ciS` and `ciR` is executed and then the execution
continues to `ciR`, but only `ciS` initializes `_giE`, in the other branch
`_giE` is not initialized, because it's "rubbish" in the STG and so we don't
generate an assignment during code generator. The code generator then panics
during the register allocations:
    ghc-stage1: panic! (the 'impossible' happened)
      (GHC version 8.1.20160722 for x86_64-unknown-linux):
            LocalReg's live-in to graph ciY {_giE::I64}
(`_giD` is also "rubbish" in `ciS`, but it's still initialized because it's a
pointer slot, we have to initialize it otherwise garbage collector follows the
pointer to some random place. So we only remove assignment if the "rubbish" arg
has unboxed type.)
This patch removes `StgRubbishArg` and `CmmArg`. We now always initialize
rubbish slots. If the slot is for boxed types we use the existing `absentError`,
otherwise we initialize the slot with literal 0.
Reviewers: simonpj, erikd, austin, simonmar, bgamari
Reviewed By: erikd
Subscribers: thomie
Differential Revision: https://phabricator.haskell.org/D2446 | 
| | |  | 
| | |  | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Summary:
This patch implements primitive unboxed sum types, as described in
https://ghc.haskell.org/trac/ghc/wiki/UnpackedSumTypes.
Main changes are:
- Add new syntax for unboxed sums types, terms and patterns. Hidden
  behind `-XUnboxedSums`.
- Add unlifted unboxed sum type constructors and data constructors,
  extend type and pattern checkers and desugarer.
- Add new RuntimeRep for unboxed sums.
- Extend unarise pass to translate unboxed sums to unboxed tuples right
  before code generation.
- Add `StgRubbishArg` to `StgArg`, and a new type `CmmArg` for better
  code generation when sum values are involved.
- Add user manual section for unboxed sums.
Some other changes:
- Generalize `UbxTupleRep` to `MultiRep` and `UbxTupAlt` to
  `MultiValAlt` to be able to use those with both sums and tuples.
- Don't use `tyConPrimRep` in `isVoidTy`: `tyConPrimRep` is really
  wrong, given an `Any` `TyCon`, there's no way to tell what its kind
  is, but `kindPrimRep` and in turn `tyConPrimRep` returns `PtrRep`.
- Fix some bugs on the way: #12375.
Not included in this patch:
- Update Haddock for new the new unboxed sum syntax.
- `TemplateHaskell` support is left as future work.
For reviewers:
- Front-end code is mostly trivial and adapted from unboxed tuple code
  for type checking, pattern checking, renaming, desugaring etc.
- Main translation routines are in `RepType` and `UnariseStg`.
  Documentation in `UnariseStg` should be enough for understanding
  what's going on.
Credits:
- Johan Tibell wrote the initial front-end and interface file
  extensions.
- Simon Peyton Jones reviewed this patch many times, wrote some code,
  and helped with debugging.
Reviewers: bgamari, alanz, goldfire, RyanGlScott, simonpj, austin,
           simonmar, hvr, erikd
Reviewed By: simonpj
Subscribers: Iceland_jack, ggreif, ezyang, RyanGlScott, goldfire,
             thomie, mpickering
Differential Revision: https://phabricator.haskell.org/D2259 | 
| | 
| 
| 
| 
| | (likely introduced by 99d4e5b4a0bd32813ff8c74e91d2dcf6b3555176, possibly
due to a merge mistake). | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| | The report now distinguishes thunks (in the variants single-entry and
standard thunks), constructors and functions (possibly single-entry).
Forthermore, for standard thunks (AP and selector), do not count an
entry when they are allocated. It is not possible to count their
entries, as their code is shared, but better count nothing than count
the wrong thing. | 
| | 
| 
| 
| 
| | This reverts commit 6c2c853b11fe25c106469da7b105e2be596c17de which was
supposed to be merged as individual commits. | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | this Diff contains small, self-contained changes as I work towards
fixing #10613. It is mostly created to let harbormaster do its job, but
feedback is welcome as well.
Please do not merge this via arc; I’d like to push the individual
patches as layed out here. I might push mostly trivial ones even without
review, as long as the build passes.
Reviewers: austin, bgamari
Reviewed By: bgamari
Subscribers: thomie
Differential Revision: https://phabricator.haskell.org/D2014 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| | Reviewers: austin, bgamari, simonpj
Reviewed By: simonpj
Subscribers: thomie
Differential Revision: https://phabricator.haskell.org/D1933 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | We also need to update `stgBindHasCafRefs` assertion with this change,
as we no longer have the pre-computed SRT, LiveVars etc. We rename it to
`topStgBindHasCafRefs` and implement it like this:
A non-updatable top-level binding may refer to a CAF by referring to a
top-level definition with CAFs. A top-level definition may have CAFs if
it's updatable. At this point (because this is done after TidyPgm)
top-level Ids (whether imported or defined in this module) are
GlobalIds, so the top-levelness test is easy. (see also comments in the
code)
Reviewers: bgamari, simonpj, austin
Reviewed By: simonpj
Subscribers: thomie
Differential Revision: https://phabricator.haskell.org/D1889
GHC Trac Issues: #11550 | 
| | 
| 
| 
| | This reverts commit 4f9967aa3d1f7cfd539d0c173cafac0fe290e26f. | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Also remove the functions and types that became useless after removing
the fields:
- SRT functions
- LiveInfo type and functions
- freeVarsToLiveVars
- unariseLives and unariseSRT
Reviewers: bgamari, simonpj, austin
Reviewed By: simonpj
Subscribers: thomie
Differential Revision: https://phabricator.haskell.org/D1880 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Since GHC 8.1/8.2 only needs to be bootstrap-able by GHC 7.10 and
GHC 8.0 (and GHC 8.2), we can now finally drop all that pre-AMP
compatibility CPP-mess for good!
Reviewers: austin, goldfire, bgamari
Subscribers: goldfire, thomie, erikd
Differential Revision: https://phabricator.haskell.org/D1724 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| | The GranSim code was removed in dd56e9ab and 297b05a9 in 2009, and perhaps
other commits I couldn't find.
Reviewed By: austin
Differential Revision: https://phabricator.haskell.org/D737 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Summary: It looks like during .lhs -> .hs switch the comments were not updated. So doing exactly that.
Reviewers: austin, jstolarek, hvr, goldfire
Reviewed By: austin, jstolarek
Subscribers: thomie, goldfire
Differential Revision: https://phabricator.haskell.org/D621
GHC Trac Issues: #9986 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | This patch solves the scoping problem of CmmTick nodes: If we just put
CmmTicks into blocks we have no idea what exactly they are meant to
cover.  Here we introduce tick scopes, which allow us to create
sub-scopes and merged scopes easily.
Notes:
* Given that the code often passes Cmm around "head-less", we have to
  make sure that its intended scope does not get lost. To keep the amount
  of passing-around to a minimum we define a CmmAGraphScoped type synonym
  here that just bundles the scope with a portion of Cmm to be assembled
  later.
* We introduce new scopes at somewhat random places, aligning with
  getCode calls. This works surprisingly well, but we might have to
  add new scopes into the mix later on if we find things too be too
  coarse-grained.
(From Phabricator D169) | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | This is basically just about continuing maintaining source notes after
the Core stage. Unfortunately, this is more involved as it might seem,
as there are more restrictions on where ticks are allowed to show up.
Notes:
* We replace the StgTick / StgSCC constructors with a unified StgTick
  that can carry any tickish.
* For handling constructor or lambda applications, we generally float
  ticks out.
* Note that thanks to the NonLam placement, we know that source notes
  can never appear on lambdas. This means that as long as we are
  careful to always use mkTick, we will never violate CorePrep
  invariants.
* This is however not automatically true for eta expansion, which
  needs to somewhat awkwardly strip, then re-tick the expression in
  question.
* Where CorePrep floats out lets, we make sure to wrap them in the
  same spirit as FloatOut.
* Detecting selector thunks becomes a bit more involved, as we can run
  into ticks at multiple points.
(From Phabricator D169) | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| | This reverts commit b23ba2a7d612c6b466521399b33fe9aacf5c4f75.
Conflicts:
	compiler/cmm/PprCmmDecl.hs
	compiler/nativeGen/PPC/Ppr.hs
	compiler/nativeGen/SPARC/Ppr.hs
	compiler/nativeGen/X86/Ppr.hs | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Summary:
The primary reason for doing this is assisting debuggability:
if static closures are all in the same section, they are
guaranteed to be adjacent to one another.  This will help
later when we add some code that takes section start/end and
uses this to sanity-check the sections.
Part of remove HEAP_ALLOCED patch set (#8199)
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
Test Plan: validate
Reviewers: simonmar, austin
Subscribers: simonmar, ezyang, carter, thomie
Differential Revision: https://phabricator.haskell.org/D263
GHC Trac Issues: #8199 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Summary:
This includes pretty much all the changes needed to make `Applicative`
a superclass of `Monad` finally. There's mostly reshuffling in the
interests of avoid orphans and boot files, but luckily we can resolve
all of them, pretty much. The only catch was that
Alternative/MonadPlus also had to go into Prelude to avoid this.
As a result, we must update the hsc2hs and haddock submodules.
Signed-off-by: Austin Seipp <austin@well-typed.com>
Test Plan: Build things, they might not explode horribly.
Reviewers: hvr, simonmar
Subscribers: simonmar
Differential Revision: https://phabricator.haskell.org/D13 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | In some cases, the layout of the LANGUAGE/OPTIONS_GHC lines has been
reorganized, while following the convention, to
- place `{-# LANGUAGE #-}` pragmas at the top of the source file, before
  any `{-# OPTIONS_GHC #-}`-lines.
- Moreover, if the list of language extensions fit into a single
  `{-# LANGUAGE ... -#}`-line (shorter than 80 characters), keep it on one
  line. Otherwise split into `{-# LANGUAGE ... -#}`-lines for each
  individual language extension. In both cases, try to keep the
  enumeration alphabetically ordered.
  (The latter layout is preferable as it's more diff-friendly)
While at it, this also replaces obsolete `{-# OPTIONS ... #-}` pragma
occurences by `{-# OPTIONS_GHC ... #-}` pragmas. | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | These array types are smaller than Array# and MutableArray# and are
faster when the array size is small, as they don't have the overhead
of a card table. Having no card table reduces the closure size with 2
words in the typical small array case and leads to less work when
updating or GC:ing the array.
Reduces both the runtime and memory allocation by 8.8% on my insert
benchmark for the HashMap type in the unordered-containers package,
which makes use of lots of small arrays. With tuned GC settings
(i.e. `+RTS -A6M`) the runtime reduction is 15%.
Fixes #8923. | 
| | 
| 
| 
| 
| | I'd like to be able to pack together non-pointer fields that are less
than a word in size, and this is a necessary prerequisite. | 
| | |  | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Fixes #8585
When emmiting label of a self-recursive tail call (ie. when
performing loopification optimization) we emit the loop header
label after a stack check but before the heap check. The reason is
that tail-recursive functions use constant amount of stack space
so we don't need to repeat the check in every loop. But they can
grow the heap so heap check must be repeated in every call.
See Note [Self-recursive tail calls] and [Self-recursive loop header]. | 
| | |  | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| | We now do the allocation of the blackhole indirection closure inside the
RTS procedure 'newCAF' instead of generating the allocation code inline
in the closure body of each CAF.  This slightly decreases code size in
modules with a lot of CAFs.
As a result of this change, for example, the size of DynFlags.o drops by
~60KB and HsExpr.o by ~100KB. | 
| | |  | 
| | |  | 
| | |  | 
| | |  | 
| | 
| 
| 
| | Signed-off-by: Edward Z. Yang <ezyang@mit.edu> | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | This patch implements loopification optimization. It was described
in "Low-level code optimisations in the Glasgow Haskell Compiler" by
Krzysztof Woś, but we use a different approach here. Krzysztof's
approach was to perform optimization as a Cmm-to-Cmm pass. Our
approach is to generate properly optimized tail calls in the code
generator, which saves us the trouble of processing Cmm. This idea
was proposed by Simon Marlow. Implementation details are explained
in Note [Self-recursive tail calls].
Performance of most nofib benchmarks is not affected. There are
some benchmarks that show 5-7% improvement, with an average improvement
of 2.6%. It would require some further investigation to check if this
is related to benchamrking noise or does this optimization really
help make some class of programs faster.
As a minor cleanup, this patch renames forkProc to forkLneBody.
It also moves some data declarations from StgCmmMonad to
StgCmmClosure, because they are needed there and it seems that
StgCmmClosure is on top of the whole StgCmm* hierarchy. | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Previosly logic of these functions was sth like this:
  cgIdApp x = case x of
                A -> cgLneJump x
                _ -> cgTailCall x
  cgTailCall x = case x of
                   B -> ...
                   C -> ...
                   _ -> ...
After merging there is no nesting of cases:
  cgIdApp x = case x of
                A -> -- body of cgLneJump
                B -> ...
                C -> ...
                _ -> ... | 
| | 
| 
| 
| 
| 
| | This commit removes module StgCmmGran which has only no-op functions.
According to comments in the module, it was used by GpH, but GpH
project seems to be dead for a couple of years now. | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | This cleanup includes:
  * removing dead code. This includes forkStatics function,
    which was in fact one big noop, and global bindings in
    CgInfoDownwards,
  * converting functions that used FCode monad only to
    access DynFlags into functions that take DynFlags
    as a parameter and don't work in a monad,
  * addBindC function is now smarter. It extracts Id from
    CgIdInfo passed to it in the same way addBindsC does.
    Previously this was done at every call site, which was
    redundant. | 
| | 
| 
| 
| 
| | A major cleanup of trailing whitespaces and tabs in codeGen/
directory. I also adjusted code formatting in some places. | 
| | |  |