|  | Commit message (Collapse) | Author | Age | Files | Lines | 
|---|
| | 
| 
| 
| 
| | This moves all URL references to Trac tickets to their corresponding
GitLab counterparts. | 
| | 
| 
| 
| | Also remove unused arg from get_Regtable_addr_from_offset | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | The splitter is an evil Perl script that processes assembler code.
Its job can be done better by the linker's --gc-sections flag. GHC
passes this flag to the linker whenever -split-sections is passed on
the command line.
This is based on @DemiMarie's D2768.
Fixes Trac #11315
Fixes Trac #9832
Fixes Trac #8964
Fixes Trac #8685
Fixes Trac #8629 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | The big payload of this patch is:
  Add an AnonArgFlag to the FunTy constructor
  of Type, so that
    (FunTy VisArg   t1 t2) means (t1 -> t2)
    (FunTy InvisArg t1 t2) means (t1 => t2)
The big payoff is that we have a simple, local test to make
when decomposing a type, leading to many fewer calls to
isPredTy. To me the code seems a lot tidier, and probably
more efficient (isPredTy has to take the kind of the type).
See Note [Function types] in TyCoRep.
There are lots of consequences
* I made FunTy into a record, so that it'll be easier
  when we add a linearity field, something that is coming
  down the road.
* Lots of code gets touched in a routine way, simply because it
  pattern matches on FunTy.
* I wanted to make a pattern synonym for (FunTy2 arg res), which
  picks out just the argument and result type from the record. But
  alas the pattern-match overlap checker has a heart attack, and
  either reports false positives, or takes too long.  In the end
  I gave up on pattern synonyms.
  There's some commented-out code in TyCoRep that shows what I
  wanted to do.
* Much more clarity about predicate types, constraint types
  and (in particular) equality constraints in kinds.  See TyCoRep
  Note [Types for coercions, predicates, and evidence]
  and Note [Constraints in kinds].
  This made me realise that we need an AnonArgFlag on
  AnonTCB in a TyConBinder, something that was really plain
  wrong before. See TyCon Note [AnonTCB InivsArg]
* When building function types we must know whether we
  need VisArg (mkVisFunTy) or InvisArg (mkInvisFunTy).
  This turned out to be pretty easy in practice.
* Pretty-printing of types, esp in IfaceType, gets
  tidier, because we were already recording the (->)
  vs (=>) distinction in an ad-hoc way.  Death to
  IfaceFunTy.
* mkLamType needs to keep track of whether it is building
  (t1 -> t2) or (t1 => t2).  See Type
  Note [mkLamType: dictionary arguments]
Other minor stuff
* Some tidy-up in validity checking involving constraints;
  Trac #16263 | 
| | 
| 
| 
| | Also used ByteString in some other relevant places | 
| | 
| 
| 
| 
| 
| 
| | Support for Mac OS X on PowerPC has been dropped by Apple years ago. We
follow suit and remove PowerPC support for Darwin.
Fixes #16106. | 
| | |  | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Handle Int*QuotRemOP and Word*QuotRemOp in PPC NCG.
Refactor common code with remainder operation.
Test Plan: validate (I validated on Linux powerpc64le and x86_64)
Reviewers: erikd, hvr, bgamari, simonmar
Reviewed By: bgamari
Subscribers: rwbarton, carter
Differential Revision: https://phabricator.haskell.org/D5323 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Summary:
Currently, an AP thunk like `sat = f a b c` will not have its own entry
point and info pointer and will instead reuse a generic apply thunk
like `stg_ap_4_upd`.
That's great from a code size perspective, but if `f` is a known
function, a specialised entry point with a plain call can be much faster
than figuring out the arity and doing dynamic dispatch.
This looks at `f`s arity to figure out if it is a known function and if so, it
will not lower it to a generic apply function.
Benchmark results are encouraging: No changes to allocation, but 0.2% less
counted instructions.
Test Plan: Validates locally
Reviewers: simonmar, osa1, simonpj, bgamari
Reviewed By: simonpj
Subscribers: rwbarton, carter
GHC Trac Issues: #16005
Differential Revision: https://phabricator.haskell.org/D5414 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Summary:
Previously, the logic that checks whether a thunk has a counter or not
was duplicated in multiple functions.
This led to thunk enters being accounted to their enclosing functions in
`StgCmmTicky.tickyEnterThunk`, because the outer call to
`withNewTickyCounterThunk` didn't set the counter label for the thunk.
And rightly so! `tickyEnterThunk` should only account thunk enters to a
counter if `-ticky-dyn-thunk` is on.
This patch extracts the logic that was already present in its most
general form in `withNewTickyCounterThunk` into its own functions and
lets all other call sites checking for `-ticky-dyn-thunk` call this new
function named `thunkHasCounter` instead.
Reviewers: bgamari, simonmar
Reviewed By: simonmar
Subscribers: rwbarton, carter
Differential Revision: https://phabricator.haskell.org/D5392 | 
| | |  | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | After reverting Phab:D5358, Simon (Peyton Jones) asked for a Note
summarising why we want to keep the dead case binder check in `cgCase`.
Summary from mail conversation:
* Phab:D5324 means that we no longer /recompute/ dead-ness of case-binders in
  STG-land
* But TidyPgm preserves dead-ness info (see CoreTidy.tidyIdBndr)
* And so we can take advantage of it to avoid a redundant load. This load
  would be eliminated by CmmSink, but that only happens with -O | 
| | 
| 
| 
| 
| 
| | This reverts commit d13b7d60650cb84af11ee15b3f51c3511548cfdb.
(See discussion in D5358) | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Summary:
This implements a selective lambda-lifting pass late in the STG
pipeline.
Lambda lifting has the effect of avoiding closure allocation at the cost
of having to make former free vars available at call sites, possibly
enlarging closures surrounding call sites in turn.
We identify beneficial cases by means of an analysis that estimates
closure growth.
There's a Wiki page at
https://ghc.haskell.org/trac/ghc/wiki/LateLamLift.
Reviewers: simonpj, bgamari, simonmar
Reviewed By: simonpj
Subscribers: rwbarton, carter
GHC Trac Issues: #9476
Differential Revision: https://phabricator.haskell.org/D5224 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | This patch fixes a fairly long-standing bug (dating back to 2015) in
RdrName.bestImport, namely
   commit 9376249b6b78610db055a10d05f6592d6bbbea2f
   Author: Simon Peyton Jones <simonpj@microsoft.com>
   Date:   Wed Oct 28 17:16:55 2015 +0000
   Fix unused-import stuff in a better way
In that patch got the sense of the comparison back to front, and
thereby failed to implement the unused-import rules described in
  Note [Choosing the best import declaration] in RdrName
This led to Trac #13064 and #15393
Fixing this bug revealed a bunch of unused imports in libraries;
the ones in the GHC repo are part of this commit.
The two important changes are
* Fix the bug in bestImport
* Modified the rules by adding (a) in
     Note [Choosing the best import declaration] in RdrName
  Reason: the previosu rules made Trac #5211 go bad again.  And
  the new rule (a) makes sense to me.
In unravalling this I also ended up doing a few other things
* Refactor RnNames.ImportDeclUsage to use a [GlobalRdrElt] for the
  things that are used, rather than [AvailInfo]. This is simpler
  and more direct.
* Rename greParentName to greParent_maybe, to follow GHC
  naming conventions
* Delete dead code RdrName.greUsedRdrName
Bumps a few submodules.
Reviewers: hvr, goldfire, bgamari, simonmar, jrtc27
Subscribers: rwbarton, carter
Differential Revision: https://phabricator.haskell.org/D5312 | 
| | |  | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | In a previous patch we replaced some built-in literal constructors
(MachInt, MachWord, etc.) with a single LitNumber constructor.
In this patch we replace the `Mach` prefix of the remaining constructors
with `Lit` for consistency (e.g., LitChar, LitLabel, etc.).
Sadly the name `LitString` was already taken for a kind of FastString
and it would become misleading to have both `LitStr` (literal
constructor renamed after `MachStr`) and `LitString` (FastString
variant). Hence this patch renames the FastString variant `PtrString`
(which is more accurate) and the literal string constructor now uses the
least surprising `LitString` name.
Both `Literal` and `LitString/PtrString` have recently seen breaking
changes so doing this kind of renaming now shouldn't harm much.
Reviewers: hvr, goldfire, bgamari, simonmar, jrtc27, tdammers
Subscribers: tdammers, rwbarton, thomie, carter
Differential Revision: https://phabricator.haskell.org/D4881 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | D5339 (part of D5324) removed the dead case binder analysis done during
CoreToStg so this condition always holds now.
Test Plan: Validated locally.
Reviewers: sgraf, bgamari, simonmar
Subscribers: rwbarton, carter
Differential Revision: https://phabricator.haskell.org/D5358 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Summary:
Currently, `CoreToStg` annotates `StgRhsClosure`s with their set of non-global
free variables.  This free variable information is only needed in the final
code generation step (i.e. `StgCmm.codeGen`), which leads to transformations
such as `StgCse` and `StgUnarise` having to maintain this information.
This is tiresome and unnecessary, so this patch introduces a trees-to-grow-like
approach that only introduces the free variable set into the syntax tree in the
code gen pass, along with a free variable analysis on STG terms to generate
that information.
Fixes #15754.
Reviewers: simonpj, osa1, bgamari, simonmar
Reviewed By: osa1
Subscribers: rwbarton, carter
GHC Trac Issues: #15754
Differential Revision: https://phabricator.haskell.org/D5324 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | This builds off of D4475.
Bumps binary submodule.
Reviewers: carter, AndreasK, hvr, goldfire, bgamari, simonmar
Subscribers: rwbarton, thomie
Differential Revision: https://phabricator.haskell.org/D5006 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | - The StgBinderInfo type was never used in the code gen, so the type, related
  computation in CoreToStg, and some comments about it are removed. See #15770
  for more details.
- Simplified CoreToStg after removing the StgBinderInfo computation: removed
  StgBinderInfo arguments and mfix stuff.
The StgBinderInfo values were not used in the code gen, but I still run nofib
just to make sure: 0.0% change in allocations and binary sizes.
Test Plan: Validated locally
Reviewers: simonpj, simonmar, bgamari, sgraf
Reviewed By: sgraf
Subscribers: AndreasK, sgraf, rwbarton, carter
Differential Revision: https://phabricator.haskell.org/D5232 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | This fixes two isssues:
- Using bitcast for MO_XX_Conv
  Arguments to a bitcast must be of the same size. We should be using
  `trunc` and `zext` instead.
- Using unsupported MO_*_QuotRem for LLVM
  The two primops `MO_*_QuotRem` are not supported by the LLVM backend,
so
  we shouldn't use them for `Int8#`/`Word8#` (just as we do not use
them for
  `Int#`/`Word#`).
Signed-off-by: Michal Terepeta <michal.terepeta@gmail.com>
Test Plan: manually run tests with WAY=llvm
Reviewers: bgamari, simonmar
Reviewed By: bgamari
Subscribers: rwbarton, carter
GHC Trac Issues: #15864
Differential Revision: https://phabricator.haskell.org/D5304 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | This is the first step of implementing:
https://github.com/ghc-proposals/ghc-proposals/pull/74
The main highlights/changes:
    primops.txt.pp gets two new sections for two new primitive types for
    signed and unsigned 8-bit integers (Int8# and Word8 respectively) along
    with basic arithmetic and comparison operations. PrimRep/RuntimeRep get
    two new constructors for them. All of the primops translate into the
    existing MachOPs.
    For CmmCalls the codegen will now zero-extend the values at call
    site (so that they can be moved to the right register) and then truncate
    them back their original width.
    x86 native codegen needed some updates, since it wasn't able to deal
    with the new widths, but all the changes are quite localized. LLVM
    backend seems to just work.
This is the second attempt at merging this, after the first attempt in
D4475 had to be backed out due to regressions on i386.
Bumps binary submodule.
Signed-off-by: Michal Terepeta <michal.terepeta@gmail.com>
Test Plan: ./validate (on both x86-{32,64})
Reviewers: bgamari, hvr, goldfire, simonmar
Subscribers: rwbarton, carter
Differential Revision: https://phabricator.haskell.org/D5258 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Change some URLs from hackage.haskell.org/trac to ghc.haskell.org/trac
Test Plan: manually verify links work
Reviewers: bgamari, simonmar, mpickering
Reviewed By: bgamari, mpickering
Subscribers: mpickering, rwbarton, carter
GHC Trac Issues: #15733
Differential Revision: https://phabricator.haskell.org/D5257 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Summary:
Trac #9279 reminded us that the worker wrapper transformation copes
really badly with absent unlifted boxed bindings.
As `Note [Absent errors]` in WwLib.hs points out, we can't just use
`absentError` for unlifted bindings because there is no bottom to hide
the error in.
So instead, we synthesise a new `RubbishLit` of type
`forall (a :: TYPE 'UnliftedRep). a`, which code-gen may subsitute for
any boxed value. We choose `()`, so that there is a good chance that
the program crashes instead instead of leading to corrupt data, should
absence analysis have been too optimistic (#11126).
Reviewers: simonpj, hvr, goldfire, bgamari, simonmar
Reviewed By: simonpj
Subscribers: osa1, rwbarton, carter
GHC Trac Issues: #15627, #9279, #4306, #11126
Differential Revision: https://phabricator.haskell.org/D5153 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | See #15696 for more details. We now always enter dataToTag# argument (done in
generated Cmm, in StgCmmExpr). Any high-level optimisations on dataToTag#
applications are done by the simplifier. Looking at tag bits (instead of
reading the info table) for small types is left to another diff.
Incorrect test T14626 is removed. We no longer do this optimisation (see
comment:44, comment:45, comment:60).
Comments and notes about special cases around dataToTag# are removed. We no
longer have any special cases around it in Core.
Other changes related to evaluating primops (seq# and dataToTag#) will be
pursued in follow-up diffs.
Test Plan: Validates with three regression tests
Reviewers: simonpj, simonmar, hvr, bgamari, dfeuer
Reviewed By: simonmar
Subscribers: rwbarton, carter
GHC Trac Issues: #15696
Differential Revision: https://phabricator.haskell.org/D5201 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| | This unfortunately broke i386 support since it introduced references to
byte-sized registers that don't exist on that architecture.
Reverts binary submodule
This reverts commit 5d5307f943d7581d7013ffe20af22233273fba06. | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | This is the first step of implementing:
https://github.com/ghc-proposals/ghc-proposals/pull/74
The main highlights/changes:
- `primops.txt.pp` gets two new sections for two new primitive types
  for signed and unsigned 8-bit integers (`Int8#` and `Word8`
  respectively) along with basic arithmetic and comparison
  operations. `PrimRep`/`RuntimeRep` get two new constructors for
  them. All of the primops translate into the existing `MachOP`s.
- For `CmmCall`s the codegen will now zero-extend the values at call
  site (so that they can be moved to the right register) and then
  truncate them back their original width.
- x86 native codegen needed some updates, since it wasn't able to deal
  with the new widths, but all the changes are quite localized. LLVM
  backend seems to just work.
Bumps binary submodule.
Signed-off-by: Michal Terepeta <michal.terepeta@gmail.com>
Test Plan: ./validate with new tests
Reviewers: hvr, goldfire, bgamari, simonmar
Subscribers: Abhiroop, dfeuer, rwbarton, thomie, carter
Differential Revision: https://phabricator.haskell.org/D4475 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Write barriers for large array writes were added in D2525, as a part of #12469.
However it seems we forgot about small arrays. This patch adds the same write
barrier to small array writes.
Reviewers: simonmar, bgamari
Reviewed By: simonmar
Subscribers: rwbarton, carter
GHC Trac Issues: #12469
Differential Revision: https://phabricator.haskell.org/D5209 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Long ago, the stable name table and stable pointer tables were one.
Now, they are separate, and have significantly different
implementations. I believe the time has come to finish the split
that began in #7674.
* Divide `rts/Stable` into `rts/StableName` and `rts/StablePtr`.
* Give each table its own mutex.
* Add FFI functions `hs_lock_stable_ptr_table` and
`hs_unlock_stable_ptr_table` and document them.
  These are intended to replace the previously undocumented
`hs_lock_stable_tables` and `hs_lock_stable_tables`,
  which are now documented as deprecated synonyms.
* Make `eqStableName#` use pointer equality instead of unnecessarily
comparing stable name table indices.
Reviewers: simonmar, bgamari, erikd
Reviewed By: bgamari
Subscribers: rwbarton, carter
GHC Trac Issues: #15555
Differential Revision: https://phabricator.haskell.org/D5084 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| | Reviewers: hvr, bgamari, simonmar, jrtc27
Reviewed By: bgamari
Subscribers: alpmestan, rwbarton, thomie, carter
Differential Revision: https://phabricator.haskell.org/D5034 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | This patch adds foldl' to GhcPrelude and changes must occurences
of foldl to foldl'. This leads to better performance especially
for quick builds where GHC does not perform strictness analysis.
It does change strictness behaviour when we use foldl' to turn
a argument list into function applications. But this is only a
drawback if code looks ONLY at the last argument but not at the first.
And as the benchmarks show leads to fewer allocations in practice
at O2.
Compiler performance for Nofib:
O2 Allocations:
        -1 s.d.                -----            -0.0%
        +1 s.d.                -----            -0.0%
        Average                -----            -0.0%
O2 Compile Time:
        -1 s.d.                -----            -2.8%
        +1 s.d.                -----            +1.3%
        Average                -----            -0.8%
O0 Allocations:
        -1 s.d.                -----            -0.2%
        +1 s.d.                -----            -0.1%
        Average                -----            -0.2%
Test Plan: ci
Reviewers: goldfire, bgamari, simonmar, tdammers, monoidal
Reviewed By: bgamari, monoidal
Subscribers: tdammers, rwbarton, thomie, carter
Differential Revision: https://phabricator.haskell.org/D4929 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Summary:
This contains two commits:
----
Make GHC's code-base compatible w/ `MonadFail`
There were a couple of use-sites which implicitly used pattern-matches
in `do`-notation even though the underlying `Monad` didn't explicitly
support `fail`
This refactoring turns those use-sites into explicit case
discrimations and adds an `MonadFail` instance for `UniqSM`
(`UniqSM` was the worst offender so this has been postponed for a
follow-up refactoring)
---
Turn on MonadFail desugaring by default
This finally implements the phase scheduled for GHC 8.6 according to
https://prime.haskell.org/wiki/Libraries/Proposals/MonadFail#Transitionalstrategy
This also preserves some tests that assumed MonadFail desugaring to be
active; all ghc boot libs were already made compatible with this
`MonadFail` long ago, so no changes were needed there.
Test Plan: Locally performed ./validate --fast
Reviewers: bgamari, simonmar, jrtc27, RyanGlScott
Reviewed By: bgamari
Subscribers: bgamari, RyanGlScott, rwbarton, thomie, carter
Differential Revision: https://phabricator.haskell.org/D5028 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Add support for built-in Natural literals in Core.
- Replace MachInt,MachWord, LitInteger, etc. with a single LitNumber
  constructor with a LitNumType field
- Support built-in Natural literals
- Add desugar warning for negative literals
- Move Maybe(..) from GHC.Base to GHC.Maybe for module dependency
  reasons
This patch introduces only a few rules for Natural literals (compared
to Integer's rules). Factorization of the built-in rules for numeric
literals will be done in another patch as this one is already big to
review.
Test Plan:
  validate
  test build with integer-simple
Reviewers: hvr, bgamari, goldfire, Bodigrim, simonmar
Reviewed By: bgamari
Subscribers: phadej, simonpj, RyanGlScott, carter, hsyl20, rwbarton,
thomie
GHC Trac Issues: #14170, #14465
Differential Revision: https://phabricator.haskell.org/D4212 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | SMALL_MUT_ARR_PTRS_FROZEN0 -> SMALL_MUT_ARR_PTRS_FROZEN_DIRTY
SMALL_MUT_ARR_PTRS_FROZEN  -> SMALL_MUT_ARR_PTRS_FROZEN_CLEAN
MUT_ARR_PTRS_FROZEN0       -> MUT_ARR_PTRS_FROZEN_DIRTY
MUT_ARR_PTRS_FROZEN        -> MUT_ARR_PTRS_FROZEN_CLEAN
Naming is now consistent with other CLEAR/DIRTY objects (MVAR, MUT_VAR,
MUT_ARR_PTRS).
(alternatively we could rename MVAR_DIRTY/MVAR_CLEAN etc. to MVAR0/MVAR)
Removed a few comments in Scav.c about FROZEN0 being on the mut_list
because it's now clear from the closure type.
Reviewers: bgamari, simonmar, erikd
Reviewed By: simonmar
Subscribers: rwbarton, thomie, carter
Differential Revision: https://phabricator.haskell.org/D4784 | 
| | |  | 
| | 
| 
| 
| | Addressing review comments on D4637 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Summary:
The idea here is to save a little code size and some work in the GC,
by collapsing FUN_STATIC closures and their SRTs.
This is (4) in a series; see D4632 for more details.
There's a tradeoff here: more complexity in the compiler in exchange
for a modest code size reduction (probably around 0.5%).
Results:
* GHC binary itself (statically linked) is 1% smaller
* -0.2% binary sizes in nofib (-0.5% module sizes)
Full nofib results comparing D4634 with this: P177 (ignore runtimes,
these aren't stable on my laptop)
Test Plan: validate, nofib
Reviewers: bgamari, niteria, simonpj, erikd
Subscribers: thomie, carter
Differential Revision: https://phabricator.haskell.org/D4637 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Summary:
- Previously we would hvae a single big table of pointers per module,
  with a set of bitmaps to reference entries within it. The new
  representation is identical to a static constructor, which is much
  simpler for the GC to traverse, and we get to remove the complicated
  bitmap-traversal code from the GC.
- Rewrite all the code to generate SRTs in CmmBuildInfoTables, and
  document it much better (see Note [SRTs]). This has been something
  I've wanted to do since we moved to the new code generator, I
  finally had the opportunity to finish it while on a transatlantic
  flight recently :)
There are a series of 4 diffs:
1. D4632 (this one), which does the bulk of the changes
2. D4633 which adds support for smaller `CmmLabelDiffOff` constants
3. D4634 which takes advantage of D4632 and D4633 to save a word in
   info tables that have an SRT on x86_64. This is where most of the
   binary size improvement comes from.
4. D4637 which makes a further optimisation to merge some SRTs with
   static FUN closures.  This adds some complexity and the benefits
   are fairly modest, so it's not clear yet whether we should do this.
Results (after (3), on x86_64)
- GHC itself (staticaly linked) is 5.2% smaller
- -1.7% binary sizes in nofib, -2.9% module sizes. Full nofib results: P176
- I measured the overhead of traversing all the static objects in a
  major GC in GHC itself by doing `replicateM_ 1000 performGC` as the
  first thing in `Main.main`.  The new version was 5-10% faster, but
  the results did vary quite a bit.
- I'm not sure if there's a compile-time difference, the results are
  too unreliable.
Test Plan: validate
Reviewers: bgamari, michalt, niteria, simonpj, erikd, osa1
Subscribers: thomie, carter
Differential Revision: https://phabricator.haskell.org/D4632 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | This is mostly for congruence with 'subWordC#' and '{add,sub}IntC#'.
I found 'plusWord2#' while implementing this, which both lacks
documentation and has a slightly different specification than
'addWordC#', which means the generic implementation is unnecessarily
complex.
While I was at it, I also added lacking meta-information on PrimOps
and refactored 'subWordC#'s generic implementation to be branchless.
Reviewers: bgamari, simonmar, jrtc27, dfeuer
Reviewed By: bgamari, dfeuer
Subscribers: dfeuer, thomie, carter
Differential Revision: https://phabricator.haskell.org/D4592 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Reviewers: bgamari, simonmar
Reviewed By: bgamari
Subscribers: dfeuer, rwbarton, thomie, carter
GHC Trac Issues: #4442
Differential Revision: https://phabricator.haskell.org/D4488 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| | This subtle patch fixes Trac #5129 (again; comment:20
and following).
I took the opportunity to document seq# properly; see
Note [seq# magic] in PrelRules, and Note [seq# and expr_ok]
in CoreUtils. | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Summary:
get/setAllocationCounter didn't take into account allocations in the
current block. This was known at the time, but it turns out to be
important to have more accuracy when using these in a fine-grained
way.
Test Plan:
New unit test to test incrementally larger allocaitons.  Before I got
results like this:
```
+0
+0
+0
+0
+0
+4096
+0
+0
+0
+0
+0
+4064
+0
+0
+4088
+4056
+0
+0
+0
+4088
+4096
+4056
+4096
```
Notice how the results aren't always monotonically increasing.  After
this patch:
```
+344
+416
+488
+560
+632
+704
+776
+848
+920
+992
+1064
+1136
+1208
+1280
+1352
+1424
+1496
+1568
+1640
+1712
+1784
+1856
+1928
+2000
+2072
+2144
```
Reviewers: hvr, erikd, simonmar, jrtc27, trommler
Reviewed By: simonmar
Subscribers: trommler, jrtc27, rwbarton, thomie, carter
Differential Revision: https://phabricator.haskell.org/D4363 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | This removes a bunch of unnecessary includes of `HsVersions.h` along
with unnecessary CPP (e.g., due to checking for DEBUG which can be
achieved by looking at `debugIsOn`)
Signed-off-by: Michal Terepeta <michal.terepeta@gmail.com>
Test Plan: ./validate
Reviewers: bgamari, simonmar
Reviewed By: bgamari
Subscribers: rwbarton, thomie, carter
Differential Revision: https://phabricator.haskell.org/D4462 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | This was broken by D3746 and/or D3809, but unfortunately we didn't
notice because CI at the time wasn't building the profiling way.
Test Plan:
```
cd testsuite/test/profiling/should_run
make WAY=ghci-ext-prof
```
Reviewers: bgamari, michalt, hvr, erikd
Subscribers: rwbarton, thomie, carter
GHC Trac Issues: #14705
Differential Revision: https://phabricator.haskell.org/D4437 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | The pattern `threadCapability =<< myThreadId` is used a lot in code
that uses `hs_try_putmvar`, I want to make it cheaper.
Test Plan: validate
Reviewers: bgamari, erikd
Reviewed By: bgamari
Subscribers: rwbarton, thomie, carter
Differential Revision: https://phabricator.haskell.org/D4381 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Test Plan: validate
Reviewers: bgamari, erikd
Reviewed By: bgamari
Subscribers: rwbarton, thomie, carter
Differential Revision: https://phabricator.haskell.org/D4380 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | This is an obvious optimisation whose overhead is neglectable but
which significantly simplifies the common uses of `compareByteArrays#`
which would otherwise require to make *careful* use of
`reallyUnsafePtrEquality#` or (equally fragile) `byteArrayContents#`
which can result in less optimal assembler code being generated.
Test Plan: carefully examined generated cmm/asm code; validate via phab
Reviewers: alexbiehl, bgamari, simonmar
Reviewed By: bgamari, simonmar
Subscribers: rwbarton, thomie, carter
Differential Revision: https://phabricator.haskell.org/D4319 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | This adds support for the bit deposit and extraction operations provided
by the BMI and BMI2 instruction set extensions on modern amd64 machines.
Implement x86 code generator for pdep and pext.  Properly initialise
bmiVersion field.
pdep and pext test cases
Fix pattern match for pdep and pext instructions
Fix build of pdep and pext code for 32-bit architectures
Test Plan: Validate
Reviewers: austin, simonmar, bgamari, angerman
Reviewed By: bgamari
Subscribers: trommler, carter, angerman, thomie, rwbarton, newhoggy
GHC Trac Issues: #14206
Differential Revision: https://phabricator.haskell.org/D4236 | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| | Reviewers: bgamari, simonmar
Reviewed By: simonmar
Subscribers: alexbiehl, rwbarton, thomie, carter
Differential Revision: https://phabricator.haskell.org/D4309 |