summaryrefslogtreecommitdiff
path: root/compiler/nativeGen/BlockLayout.hs
Commit message (Collapse)AuthorAgeFilesLines
* Modules: CmmToAsm (#13009)Sylvain Henry2020-02-241-895/+0
|
* Modules: Driver (#13009)Sylvain Henry2020-02-211-1/+1
| | | | submodule updates: nofib, haddock
* Module hierarchy: Cmm (cf #13009)Sylvain Henry2020-01-251-5/+5
|
* PmCheck: Formulate as translation between Clause TreesSebastian Graf2020-01-251-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We used to check `GrdVec`s arising from multiple clauses and guards in isolation. That resulted in a split between `pmCheck` and `pmCheckGuards`, the implementations of which were similar, but subtly different in detail. Also the throttling mechanism described in `Note [Countering exponential blowup]` ultimately got quite complicated because it had to cater for both checking functions. This patch realises that pattern match checking doesn't just consider single guarded RHSs, but that it's always a whole set of clauses, each of which can have multiple guarded RHSs in turn. We do so by translating a list of `Match`es to a `GrdTree`: ```haskell data GrdTree = Rhs !RhsInfo | Guard !PmGrd !GrdTree -- captures lef-to-right match semantics | Sequence !GrdTree !GrdTree -- captures top-to-bottom match semantics | Empty -- For -XEmptyCase, neutral element of Sequence ``` Then we have a function `checkGrdTree` that matches a given `GrdTree` against an incoming set of values, represented by `Deltas`: ```haskell checkGrdTree :: GrdTree -> Deltas -> CheckResult ... ``` Throttling is isolated to the `Sequence` case and becomes as easy as one would expect: When the union of uncovered values becomes too big, just return the original incoming `Deltas` instead (which is always a superset of the union, thus a sound approximation). The returned `CheckResult` contains two things: 1. The set of values that were not covered by any of the clauses, for exhaustivity warnings. 2. The `AnnotatedTree` that enriches the syntactic structure of the input program with divergence and inaccessibility information. This is `AnnotatedTree`: ```haskell data AnnotatedTree = AccessibleRhs !RhsInfo | InaccessibleRhs !RhsInfo | MayDiverge !AnnotatedTree | SequenceAnn !AnnotatedTree !AnnotatedTree | EmptyAnn ``` Crucially, `MayDiverge` asserts that the tree may force diverging values, so not all of its wrapped clauses can be redundant. While the set of uncovered values can be used to generate the missing equations for warning messages, redundant and proper inaccessible equations can be extracted from `AnnotatedTree` by `redundantAndInaccessibleRhss`. For this to work properly, the interface to the Oracle had to change. There's only `addPmCts` now, which takes a bag of `PmCt`s. There's a whole bunch of `PmCt` variants to replace the different oracle functions from before. The new `AnnotatedTree` structure allows for more accurate warning reporting (as evidenced by a number of changes spread throughout GHC's code base), thus we fix #17465. Fixes #17646 on the go. Metric Decrease: T11822 T9233 PmSeriesS haddock.compiler
* Fix more typos, via an improved Levenshtein-style correctorBrian Wignall2020-01-121-2/+2
|
* Fix typos, via a Levenshtein-style correctorBrian Wignall2020-01-041-1/+1
|
* Fix more typosBrian Wignall2019-12-021-1/+1
|
* Fix typos, using Wikipedia list of common typosBrian Wignall2019-11-281-1/+1
|
* Fix typosBrian Wignall2019-11-231-2/+1
|
* Add loop level analysis to the NCG backend.klebinger.andreas@gmx.at2019-10-161-248/+387
| | | | | | | | | | | | | | | | | | | | | | | | | | | For backends maintaining the CFG during codegen we can now find loops and their nesting level. This is based on the Cmm CFG and dominator analysis. As a result we can estimate edge frequencies a lot better for methods, resulting in far better code layout. Speedup on nofib: ~1.5% Increase in compile times: ~1.9% To make this feasible this commit adds: * Dominator analysis based on the Lengauer-Tarjan Algorithm. * An algorithm estimating global edge frequences from branch probabilities - In CFG.hs A few static branch prediction heuristics: * Expect to take the backedge in loops. * Expect to take the branch NOT exiting a loop. * Expect integer vs constant comparisons to be false. We also treat heap/stack checks special for branch prediction to avoid them being treated as loops.
* Fix #17334 where NCG did not properly update the CFG.wip/andreask/17334Andreas Klebinger2019-10-131-12/+10
| | | | | | | | | | | | | Statements can change the basic block in which instructions are placed during instruction selection. We have to keep track of this switch of the current basic block as we need this information in order to properly update the CFG. This commit implements this change and fixes #17334. We do so by having stmtToInstr return the new block id if a statement changed the basic block.
* Make the C-- O and C types constructors with DataKindsJohn Ericson2019-09-051-2/+5
| | | | | The tightens up the kinds a bit. I use type synnonyms to avoid adding promotion ticks everywhere.
* Remove redundant OPTIONS_GHC in BlockLayout.hsAndreas Klebinger2019-08-271-3/+0
|
* Remove unused imports of the form 'import foo ()' (Fixes #17065)James Foster2019-08-151-1/+0
| | | | | | | | | | | These kinds of imports are necessary in some cases such as importing instances of typeclasses or intentionally creating dependencies in the build system, but '-Wunused-imports' can't detect when they are no longer needed. This commit removes the unused ones currently in the code base (not including test files or submodules), with the hope that doing so may increase parallelism in the build system by removing unnecessary dependencies.
* Update Wiki URLs to point to GitLabTakenobu Tani2019-03-251-1/+1
| | | | | | | | | | | | | | | | | | | | | | | This moves all URL references to Trac Wiki to their corresponding GitLab counterparts. This substitution is classified as follows: 1. Automated substitution using sed with Ben's mapping rule [1] Old: ghc.haskell.org/trac/ghc/wiki/XxxYyy... New: gitlab.haskell.org/ghc/ghc/wikis/xxx-yyy... 2. Manual substitution for URLs containing `#` index Old: ghc.haskell.org/trac/ghc/wiki/XxxYyy...#Zzz New: gitlab.haskell.org/ghc/ghc/wikis/xxx-yyy...#zzz 3. Manual substitution for strings starting with `Commentary` Old: Commentary/XxxYyy... New: commentary/xxx-yyy... See also !539 [1]: https://gitlab.haskell.org/bgamari/gitlab-migration/blob/master/wiki-mapping.json
* Small optimizations to BlockLayout.klebinger.andreas@gmx.at2019-01-311-39/+31
| | | | | | | | | | * Remove `takeL/R 1` occurences by lastOL/headOL. * Make BlockChain a OrdList newtype by removing the set of blocks. Initially BlockChain contained both, a set for membership test and a ordered list of blocks. The set is not used for any performance sensitive lookups so we get rid of it.
* Replace BlockSequence with OrdList in BlockLayout.hsklebinger.andreas@gmx.at2019-01-311-76/+23
| | | | | OrdList does the same thing and more so there is no reason to have both.
* NCG: New code layout algorithm.Andreas Klebinger2018-11-171-0/+819
Summary: This patch implements a new code layout algorithm. It has been tested for x86 and is disabled on other platforms. Performance varies slightly be CPU/Machine but in general seems to be better by around 2%. Nofib shows only small differences of about +/- ~0.5% overall depending on flags/machine performance in other benchmarks improved significantly. Other benchmarks includes at least the benchmarks of: aeson, vector, megaparsec, attoparsec, containers, text and xeno. While the magnitude of gains differed three different CPUs where tested with all getting faster although to differing degrees. I tested: Sandy Bridge(Xeon), Haswell, Skylake * Library benchmark results summarized: * containers: ~1.5% faster * aeson: ~2% faster * megaparsec: ~2-5% faster * xml library benchmarks: 0.2%-1.1% faster * vector-benchmarks: 1-4% faster * text: 5.5% faster On average GHC compile times go down, as GHC compiled with the new layout is faster than the overhead introduced by using the new layout algorithm, Things this patch does: * Move code responsilbe for block layout in it's own module. * Move the NcgImpl Class into the NCGMonad module. * Extract a control flow graph from the input cmm. * Update this cfg to keep it in sync with changes during asm codegen. This has been tested on x64 but should work on x86. Other platforms still use the old codelayout. * Assign weights to the edges in the CFG based on type and limited static analysis which are then used for block layout. * Once we have the final code layout eliminate some redundant jumps. In particular turn a sequences of: jne .foo jmp .bar foo: into je bar foo: .. Test Plan: ci Reviewers: bgamari, jmct, jrtc27, simonmar, simonpj, RyanGlScott Reviewed By: RyanGlScott Subscribers: RyanGlScott, trommler, jmct, carter, thomie, rwbarton GHC Trac Issues: #15124 Differential Revision: https://phabricator.haskell.org/D4726