summaryrefslogtreecommitdiff
path: root/testsuite/tests/cmm
Commit message (Collapse)AuthorAgeFilesLines
* generalize GHC.Cmm.Dataflow to work over any node typeNorman Ramsey2021-12-072-0/+26
| | | | | | See #20725. The commit includes source-code changes and a test case.
* testsuite: Specify expected word-size of machop testsBen Gamari2021-12-021-7/+8
| | | | These generally expect a particular word size.
* CmmToC: Always cast arguments as unsignedBen Gamari2021-12-023-0/+5
| | | | | | | As noted in Note [When in doubt, cast arguments as unsigned], we must ensure that arguments have the correct signedness since some operations (e.g. `%`) have different semantics depending upon signedness.
* testsuite: Add testcases for various machop issuesBen Gamari2021-12-0211-0/+72
| | | | There were found by the test-primops testsuite.
* Don't include types in test outputAndreas Klebinger2021-11-233-40/+27
|
* CmmSink: Be more aggressive in removing no-op assignments.Andreas Klebinger2021-11-233-0/+68
| | | | | | | | | | No-op assignments like R1 = R1 are not only wasteful. They can also inhibit other optimizations like inlining assignments that read from R1. We now check for assignments being a no-op before and after we simplify the RHS in Cmm sink which should eliminate most of these no-ops.
* Make Word64 use Word64# on every architectureSylvain Henry2021-11-061-56/+15
|
* Test non-native switch C-- with twos complimentJohn Ericson2021-08-174-1/+40
| | | | | | | We don't want regressions like e8f7734d8a052f99b03e1123466dc9f47b48c311 to regress. Co-Authored-By: Sylvain Henry <hsyl20@gmail.com>
* testsuite: Add test for #20142Ben Gamari2021-07-232-0/+31
|
* Fix array and cleanup conversion primops (#19026)Sylvain Henry2021-03-031-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The first change makes the array ones use the proper fixed-size types, which also means that just like before, they can be used without explicit conversions with the boxed sized types. (Before, it was Int# / Word# on both sides, now it is fixed sized on both sides). For the second change, don't use "extend" or "narrow" in some of the user-facing primops names for conversions. - Names like `narrowInt32#` are misleading when `Int` is 32-bits. - Names like `extendInt64#` are flat-out wrong when `Int is 32-bits. - `narrow{Int,Word}<N>#` however map a type to itself, and so don't suffer from this problem. They are left as-is. These changes are batched together because Alex happend to use the array ops. We can only use released versions of Alex at this time, sadly, and I don't want to have to have a release thatwon't work for the final GHC 9.2. So by combining these we get all the changes for Alex done at once. Bump hackage state in a few places, and also make that workflow slightly easier for the future. Bump minimum Alex version Bump Cabal, array, bytestring, containers, text, and binary submodules
* NCG: Fix 64bit int comparisons on 32bit x86Andreas Klebinger2020-11-044-0/+207566
| | | | | | | | | | | We no compare these by doing 64bit subtraction and checking the resulting flags. We used to do this differently but the old approach was broken when the high bits compared equal and the comparison was one of >= or <=. The new approach should be both correct and faster.
* Add tests for #17920Sylvain Henry2020-06-231-1/+1
| | | | | | Metric Decrease: T12150 T12234
* GHC.Cmm.Opt: Handle MO_XX_ConvBen Gamari2020-05-152-0/+18
| | | | | | | | | This MachOp was introduced by 2c959a1894311e59cd2fd469c1967491c1e488f3 but a wildcard match in cmmMachOpFoldM hid the fact that it wasn't handled. Ideally we would eliminate the match but this appears to be a larger task. Fixes #18141.
* testsuite/T16930: Don't rely on gnu grep specific --includeBen Gamari2020-02-142-12/+12
| | | | In BSD grep this flag only affects directory recursion.
* Module hierarchy: Cmm (cf #13009)Sylvain Henry2020-01-251-4/+4
|
* Handle TagToEnum in the same big case as the other primopsJohn Ericson2020-01-162-0/+44
| | | | | | | | | | | | | Before, it was a panic because it was handled above. But there must have been an error in my reasoning (another caller?) because #17442 reported the panic was hit. But, rather than figuring out what happened, I can just make it impossible by construction. By adding just a bit more bureaucracy in the return types, I can handle TagToEnum in the same case as all the others, so the big case is is now total, and the panic is removed. Fixes #17442
* Make the C-- O and C types constructors with DataKindsJohn Ericson2019-09-051-1/+3
| | | | | The tightens up the kinds a bit. I use type synnonyms to avoid adding promotion ticks everywhere.
* Change behaviour of -ddump-cmm-verbose to dump each Cmm pass output to a ↵nineonine2019-07-264-0/+32
| | | | separate file and add -ddump-cmm-verbose-by-proc to keep old behaviour (#16930)
* testsuite: Use makefile_testBen Gamari2019-01-301-1/+1
| | | | | This eliminates most uses of run_command in the testsuite in favor of the more structured makefile_test.
* Revert "Batch merge"Ben Gamari2019-01-301-1/+1
| | | | This reverts commit 76c8fd674435a652c75a96c85abbf26f1f221876.
* Batch mergeBen Gamari2019-01-301-1/+1
|
* NCG: New code layout algorithm.Andreas Klebinger2018-11-173-0/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This patch implements a new code layout algorithm. It has been tested for x86 and is disabled on other platforms. Performance varies slightly be CPU/Machine but in general seems to be better by around 2%. Nofib shows only small differences of about +/- ~0.5% overall depending on flags/machine performance in other benchmarks improved significantly. Other benchmarks includes at least the benchmarks of: aeson, vector, megaparsec, attoparsec, containers, text and xeno. While the magnitude of gains differed three different CPUs where tested with all getting faster although to differing degrees. I tested: Sandy Bridge(Xeon), Haswell, Skylake * Library benchmark results summarized: * containers: ~1.5% faster * aeson: ~2% faster * megaparsec: ~2-5% faster * xml library benchmarks: 0.2%-1.1% faster * vector-benchmarks: 1-4% faster * text: 5.5% faster On average GHC compile times go down, as GHC compiled with the new layout is faster than the overhead introduced by using the new layout algorithm, Things this patch does: * Move code responsilbe for block layout in it's own module. * Move the NcgImpl Class into the NCGMonad module. * Extract a control flow graph from the input cmm. * Update this cfg to keep it in sync with changes during asm codegen. This has been tested on x64 but should work on x86. Other platforms still use the old codelayout. * Assign weights to the edges in the CFG based on type and limited static analysis which are then used for block layout. * Once we have the final code layout eliminate some redundant jumps. In particular turn a sequences of: jne .foo jmp .bar foo: into je bar foo: .. Test Plan: ci Reviewers: bgamari, jmct, jrtc27, simonmar, simonpj, RyanGlScott Reviewed By: RyanGlScott Subscribers: RyanGlScott, trommler, jmct, carter, thomie, rwbarton GHC Trac Issues: #15124 Differential Revision: https://phabricator.haskell.org/D4726
* Check if both branches of an Cmm if have the same target.klebinger.andreas@gmx.at2018-06-074-0/+16
| | | | | | | | | | | | | | | | | | | | This for some reason or the other and makes it into the final binary. I've added the check to ContFlowOpt as that seems like a logical place for this. In a regular nofib run there were 30 occurences of this pattern. Test Plan: ci Reviewers: bgamari, simonmar, dfeuer, jrtc27, tdammers Reviewed By: bgamari, simonmar Subscribers: tdammers, dfeuer, rwbarton, thomie, carter GHC Trac Issues: #15188 Differential Revision: https://phabricator.haskell.org/D4740
* Hoopl: improve postorder calculationMichal Terepeta2018-03-195-0/+83
- Fix the naming and comments to indicate that we are calculating *reverse* postorder (and not the standard postorder). - Rewrite the calculation to avoid CPS code. I found it fairly difficult to understand and the new one seems faster (according to nofib, decreases compiler allocations by 0.2%) - Remove `LabelsPtr`, which seems unnecessary and could be *really* confusing. For instance, previously: `postorder_dfs_from <block with label X>` and `postorder_dfs_from <label X>` would actually mean quite different things (and give different results). - Change the `Dataflow` module to always use entry of the graph for reverse postorder calculation. This should be the only change in behavior of this commit. Previously, if the caller provided initial facts for some of the labels, we would use those labels for our postorder calculation. However, I don't think that's correct in general - if the initial facts did not contain the entry of the graph, we would never analyze the blocks reachable from the entry but unreachable from the labels provided with the initial facts. It seems that the only analysis that used this was proc-point analysis, which I think would always include the entry block (so I don't think there's any bug due to this). Signed-off-by: Michal Terepeta <michal.terepeta@gmail.com> Test Plan: ./validate Reviewers: bgamari, simonmar Reviewed By: simonmar Subscribers: rwbarton, thomie, carter Differential Revision: https://phabricator.haskell.org/D4464