summaryrefslogtreecommitdiff
path: root/compiler/codeGen/StgCmmPrim.hs
Commit message (Collapse)AuthorAgeFilesLines
* LLVM: Use generic code for small size quot-rem opsPeter Trommler2018-11-221-2/+2
|
* Introduce Int16# and Word16#Abhiroop Sarkar2018-11-171-0/+45
| | | | | | | | | | | | This builds off of D4475. Bumps binary submodule. Reviewers: carter, AndreasK, hvr, goldfire, bgamari, simonmar Subscribers: rwbarton, thomie Differential Revision: https://phabricator.haskell.org/D5006
* [LlvmCodeGen] Fixes for Int8#/Word8#Michal Terepeta2018-11-071-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | This fixes two isssues: - Using bitcast for MO_XX_Conv Arguments to a bitcast must be of the same size. We should be using `trunc` and `zext` instead. - Using unsupported MO_*_QuotRem for LLVM The two primops `MO_*_QuotRem` are not supported by the LLVM backend, so we shouldn't use them for `Int8#`/`Word8#` (just as we do not use them for `Int#`/`Word#`). Signed-off-by: Michal Terepeta <michal.terepeta@gmail.com> Test Plan: manually run tests with WAY=llvm Reviewers: bgamari, simonmar Reviewed By: bgamari Subscribers: rwbarton, carter GHC Trac Issues: #15864 Differential Revision: https://phabricator.haskell.org/D5304
* Add Int8# and Word8#Michal Terepeta2018-11-021-14/+60
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is the first step of implementing: https://github.com/ghc-proposals/ghc-proposals/pull/74 The main highlights/changes: primops.txt.pp gets two new sections for two new primitive types for signed and unsigned 8-bit integers (Int8# and Word8 respectively) along with basic arithmetic and comparison operations. PrimRep/RuntimeRep get two new constructors for them. All of the primops translate into the existing MachOPs. For CmmCalls the codegen will now zero-extend the values at call site (so that they can be moved to the right register) and then truncate them back their original width. x86 native codegen needed some updates, since it wasn't able to deal with the new widths, but all the changes are quite localized. LLVM backend seems to just work. This is the second attempt at merging this, after the first attempt in D4475 had to be backed out due to regressions on i386. Bumps binary submodule. Signed-off-by: Michal Terepeta <michal.terepeta@gmail.com> Test Plan: ./validate (on both x86-{32,64}) Reviewers: bgamari, hvr, goldfire, simonmar Subscribers: rwbarton, carter Differential Revision: https://phabricator.haskell.org/D5258
* Fix dataToTag# argument evaluationÖmer Sinan Ağacan2018-10-101-6/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | See #15696 for more details. We now always enter dataToTag# argument (done in generated Cmm, in StgCmmExpr). Any high-level optimisations on dataToTag# applications are done by the simplifier. Looking at tag bits (instead of reading the info table) for small types is left to another diff. Incorrect test T14626 is removed. We no longer do this optimisation (see comment:44, comment:45, comment:60). Comments and notes about special cases around dataToTag# are removed. We no longer have any special cases around it in Core. Other changes related to evaluating primops (seq# and dataToTag#) will be pursued in follow-up diffs. Test Plan: Validates with three regression tests Reviewers: simonpj, simonmar, hvr, bgamari, dfeuer Reviewed By: simonmar Subscribers: rwbarton, carter GHC Trac Issues: #15696 Differential Revision: https://phabricator.haskell.org/D5201
* Revert "Add Int8# and Word8#"Ben Gamari2018-10-091-60/+14
| | | | | | | | | This unfortunately broke i386 support since it introduced references to byte-sized registers that don't exist on that architecture. Reverts binary submodule This reverts commit 5d5307f943d7581d7013ffe20af22233273fba06.
* Add Int8# and Word8#Michal Terepeta2018-10-071-14/+60
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is the first step of implementing: https://github.com/ghc-proposals/ghc-proposals/pull/74 The main highlights/changes: - `primops.txt.pp` gets two new sections for two new primitive types for signed and unsigned 8-bit integers (`Int8#` and `Word8` respectively) along with basic arithmetic and comparison operations. `PrimRep`/`RuntimeRep` get two new constructors for them. All of the primops translate into the existing `MachOP`s. - For `CmmCall`s the codegen will now zero-extend the values at call site (so that they can be moved to the right register) and then truncate them back their original width. - x86 native codegen needed some updates, since it wasn't able to deal with the new widths, but all the changes are quite localized. LLVM backend seems to just work. Bumps binary submodule. Signed-off-by: Michal Terepeta <michal.terepeta@gmail.com> Test Plan: ./validate with new tests Reviewers: hvr, goldfire, bgamari, simonmar Subscribers: Abhiroop, dfeuer, rwbarton, thomie, carter Differential Revision: https://phabricator.haskell.org/D4475
* Add a missing write barrier to small array writesÖmer Sinan Ağacan2018-10-061-0/+1
| | | | | | | | | | | | | | | | Write barriers for large array writes were added in D2525, as a part of #12469. However it seems we forgot about small arrays. This patch adds the same write barrier to small array writes. Reviewers: simonmar, bgamari Reviewed By: simonmar Subscribers: rwbarton, carter GHC Trac Issues: #12469 Differential Revision: https://phabricator.haskell.org/D5209
* Finish stable splitDavid Feuer2018-08-291-8/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Long ago, the stable name table and stable pointer tables were one. Now, they are separate, and have significantly different implementations. I believe the time has come to finish the split that began in #7674. * Divide `rts/Stable` into `rts/StableName` and `rts/StablePtr`. * Give each table its own mutex. * Add FFI functions `hs_lock_stable_ptr_table` and `hs_unlock_stable_ptr_table` and document them. These are intended to replace the previously undocumented `hs_lock_stable_tables` and `hs_lock_stable_tables`, which are now documented as deprecated synonyms. * Make `eqStableName#` use pointer equality instead of unnecessarily comparing stable name table indices. Reviewers: simonmar, bgamari, erikd Reviewed By: bgamari Subscribers: rwbarton, carter GHC Trac Issues: #15555 Differential Revision: https://phabricator.haskell.org/D5084
* Fix precision of asinh/acosh/atanh by making them primopsArtem Pelenitsyn2018-08-211-0/+6
| | | | | | | | | | Reviewers: hvr, bgamari, simonmar, jrtc27 Reviewed By: bgamari Subscribers: alpmestan, rwbarton, thomie, carter Differential Revision: https://phabricator.haskell.org/D5034
* Turn on MonadFail desugaring by defaultHerbert Valerio Riedel2018-08-071-16/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This contains two commits: ---- Make GHC's code-base compatible w/ `MonadFail` There were a couple of use-sites which implicitly used pattern-matches in `do`-notation even though the underlying `Monad` didn't explicitly support `fail` This refactoring turns those use-sites into explicit case discrimations and adds an `MonadFail` instance for `UniqSM` (`UniqSM` was the worst offender so this has been postponed for a follow-up refactoring) --- Turn on MonadFail desugaring by default This finally implements the phase scheduled for GHC 8.6 according to https://prime.haskell.org/wiki/Libraries/Proposals/MonadFail#Transitionalstrategy This also preserves some tests that assumed MonadFail desugaring to be active; all ghc boot libs were already made compatible with this `MonadFail` long ago, so no changes were needed there. Test Plan: Locally performed ./validate --fast Reviewers: bgamari, simonmar, jrtc27, RyanGlScott Reviewed By: bgamari Subscribers: bgamari, RyanGlScott, rwbarton, thomie, carter Differential Revision: https://phabricator.haskell.org/D5028
* Rename some mutable closure types for consistencyÖmer Sinan Ağacan2018-06-051-8/+8
| | | | | | | | | | | | | | | | | | | | | | | SMALL_MUT_ARR_PTRS_FROZEN0 -> SMALL_MUT_ARR_PTRS_FROZEN_DIRTY SMALL_MUT_ARR_PTRS_FROZEN -> SMALL_MUT_ARR_PTRS_FROZEN_CLEAN MUT_ARR_PTRS_FROZEN0 -> MUT_ARR_PTRS_FROZEN_DIRTY MUT_ARR_PTRS_FROZEN -> MUT_ARR_PTRS_FROZEN_CLEAN Naming is now consistent with other CLEAR/DIRTY objects (MVAR, MUT_VAR, MUT_ARR_PTRS). (alternatively we could rename MVAR_DIRTY/MVAR_CLEAN etc. to MVAR0/MVAR) Removed a few comments in Scav.c about FROZEN0 being on the mut_list because it's now clear from the closure type. Reviewers: bgamari, simonmar, erikd Reviewed By: simonmar Subscribers: rwbarton, thomie, carter Differential Revision: https://phabricator.haskell.org/D4784
* Add 'addWordC#' PrimOpSebastian Graf2018-05-051-10/+62
| | | | | | | | | | | | | | | | | | | This is mostly for congruence with 'subWordC#' and '{add,sub}IntC#'. I found 'plusWord2#' while implementing this, which both lacks documentation and has a slightly different specification than 'addWordC#', which means the generic implementation is unnecessarily complex. While I was at it, I also added lacking meta-information on PrimOps and refactored 'subWordC#'s generic implementation to be branchless. Reviewers: bgamari, simonmar, jrtc27, dfeuer Reviewed By: bgamari, dfeuer Subscribers: dfeuer, thomie, carter Differential Revision: https://phabricator.haskell.org/D4592
* Add unaligned bytearray access primops. Fixes #4442.Reiner Pope2018-03-251-0/+53
| | | | | | | | | | | | Reviewers: bgamari, simonmar Reviewed By: bgamari Subscribers: dfeuer, rwbarton, thomie, carter GHC Trac Issues: #4442 Differential Revision: https://phabricator.haskell.org/D4488
* myThreadId# is trivial; make it an inline primopSimon Marlow2018-02-181-0/+3
| | | | | | | | | | | | | | | The pattern `threadCapability =<< myThreadId` is used a lot in code that uses `hs_try_putmvar`, I want to make it cheaper. Test Plan: validate Reviewers: bgamari, erikd Reviewed By: bgamari Subscribers: rwbarton, thomie, carter Differential Revision: https://phabricator.haskell.org/D4381
* Tidy up and consolidate canned CmmReg and CmmGlobalsSimon Marlow2018-02-181-9/+9
| | | | | | | | | | | | Test Plan: validate Reviewers: bgamari, erikd Reviewed By: bgamari Subscribers: rwbarton, thomie, carter Differential Revision: https://phabricator.haskell.org/D4380
* Add ptr-eq short-cut to `compareByteArrays#` primitiveHerbert Valerio Riedel2018-01-261-0/+43
| | | | | | | | | | | | | | | | | | This is an obvious optimisation whose overhead is neglectable but which significantly simplifies the common uses of `compareByteArrays#` which would otherwise require to make *careful* use of `reallyUnsafePtrEquality#` or (equally fragile) `byteArrayContents#` which can result in less optimal assembler code being generated. Test Plan: carefully examined generated cmm/asm code; validate via phab Reviewers: alexbiehl, bgamari, simonmar Reviewed By: bgamari, simonmar Subscribers: rwbarton, thomie, carter Differential Revision: https://phabricator.haskell.org/D4319
* Add new mbmi and mbmi2 compiler flagsJohn Ky2018-01-211-0/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | This adds support for the bit deposit and extraction operations provided by the BMI and BMI2 instruction set extensions on modern amd64 machines. Implement x86 code generator for pdep and pext. Properly initialise bmiVersion field. pdep and pext test cases Fix pattern match for pdep and pext instructions Fix build of pdep and pext code for 32-bit architectures Test Plan: Validate Reviewers: austin, simonmar, bgamari, angerman Reviewed By: bgamari Subscribers: trommler, carter, angerman, thomie, rwbarton, newhoggy GHC Trac Issues: #14206 Differential Revision: https://phabricator.haskell.org/D4236
* Get rid of some stuttering in comments and docsGabor Greif2017-12-191-1/+1
|
* Revert "Add new mbmi and mbmi2 compiler flags"Ben Gamari2017-11-221-78/+0
| | | | | | This broke the 32-bit build. This reverts commit f5dc8ccc29429d0a1d011f62b6b430f6ae50290c.
* Add new mbmi and mbmi2 compiler flagsJohn Ky2017-11-151-0/+78
| | | | | | | | | | | | | | | | | This adds support for the bit deposit and extraction operations provided by the BMI and BMI2 instruction set extensions on modern amd64 machines. Test Plan: Validate Reviewers: austin, simonmar, bgamari, hvr, goldfire, erikd Reviewed By: bgamari Subscribers: goldfire, erikd, trommler, newhoggy, rwbarton, thomie GHC Trac Issues: #14206 Differential Revision: https://phabricator.haskell.org/D4063
* Turn `compareByteArrays#` out-of-line primop into inline primopalexbiehl2017-10-291-1/+40
| | | | | | | | | | | | Depends on D4090 Reviewers: austin, bgamari, erikd, simonmar, alexbiehl Reviewed By: bgamari Subscribers: rwbarton, thomie Differential Revision: https://phabricator.haskell.org/D4091
* compiler: introduce custom "GhcPrelude" PreludeHerbert Valerio Riedel2017-09-191-2/+2
| | | | | | | | | | | | | | | | | | This switches the compiler/ component to get compiled with -XNoImplicitPrelude and a `import GhcPrelude` is inserted in all modules. This is motivated by the upcoming "Prelude" re-export of `Semigroup((<>))` which would cause lots of name clashes in every modulewhich imports also `Outputable` Reviewers: austin, goldfire, bgamari, alanz, simonmar Reviewed By: bgamari Subscribers: goldfire, rwbarton, thomie, mpickering, bgamari Differential Revision: https://phabricator.haskell.org/D3989
* Use lengthIs and friends in more placesRyan Scott2017-06-021-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | While investigating #12545, I discovered several places in the code that performed length-checks like so: ``` length ts == 4 ``` This is not ideal, since the length of `ts` could be much longer than 4, and we'd be doing way more work than necessary! There are already a slew of helper functions in `Util` such as `lengthIs` that are designed to do this efficiently, so I found every place where they ought to be used and did just that. I also defined a couple more utility functions for list length that were common patterns (e.g., `ltLength`). Test Plan: ./validate Reviewers: austin, hvr, goldfire, bgamari, simonmar Reviewed By: bgamari, simonmar Subscribers: goldfire, rwbarton, thomie Differential Revision: https://phabricator.haskell.org/D3622
* PPC NCG: Lower MO_*_Fabs as PowerPC fabs instructionPeter Trommler2017-05-011-2/+4
| | | | | | | | | | | | | | | | In Phab:D3265 we introduced MO_F32_Fabs and MO_F64_Fabs. This patch improves code generation by generating PowerPC fabs instructions. Test Plan: run numeric/should_run/numrun015 or validate Reviewers: austin, bgamari, hvr, simonmar, erikd Reviewed By: erikd Subscribers: rwbarton, thomie Differential Revision: https://phabricator.haskell.org/D3512
* PPC NCG: Implement callish prim opsPeter Trommler2017-04-251-8/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Provide PowerPC optimised implementations of callish prim ops. MO_?_QuotRem The generic implementation of quotient remainder prim ops uses a division and a remainder operation. There is no remainder on PowerPC and so we need to implement remainder "by hand" which results in a duplication of the divide operation when using the generic code. Avoid this duplication by implementing the prim op in the native code generator. MO_U_Mul2 Use PowerPC's instructions for long multiplication. Addition and subtraction Use PowerPC add/subtract with carry/overflow instructions MO_Clz and MO_Ctz Use PowerPC's CNTLZ instruction and implement count trailing zeros using count leading zeros MO_QuotRem2 Implement an algorithm given by Henry Warren in "Hacker's Delight" using PowerPC divide instruction. TODO: Use long division instructions when available (POWER7 and later). Test Plan: validate on AIX and 32-bit Linux Reviewers: simonmar, erikd, hvr, austin, bgamari Reviewed By: erikd, hvr, bgamari Subscribers: trofi, kgardas, thomie Differential Revision: https://phabricator.haskell.org/D2973
* Generate better fp abs for X86 and llvm with default cmm otherwiseDominic Steinitz2017-03-071-0/+34
| | | | | | | | | | | | | | | | | | | | | | | Currently we have this in libraries/base/GHC/Float.hs: ``` abs x | x == 0 = 0 -- handles (-0.0) | x > 0 = x | otherwise = negateFloat x ``` But 3-4 years ago it was noted that this was inefficient: https://mail.haskell.org/pipermail/libraries/2013-April/019690.html We can generate better code for X86 and llvm and for others generate some custom cmm code which is similar to what the compiler generates now. Reviewers: austin, simonmar, hvr, bgamari Reviewed By: bgamari Subscribers: dfeuer, thomie Differential Revision: https://phabricator.haskell.org/D3265
* Use newBlockId instead of newLabelCBen Gamari2016-11-291-1/+2
| | | | | | | | | | | | | | | This seems like a clearer name and the fewer functions that one needs to remember, the better. Test Plan: validate Reviewers: austin, simonmar, michalt Reviewed By: simonmar, michalt Subscribers: thomie Differential Revision: https://phabricator.haskell.org/D2735
* StgCmmPrim: Add missing write barrier.Peter Trommler2016-10-191-0/+4
| | | | | | | | | | | | | | | | | | | On architectures with weak memory consistency a write barrier is needed before the write to the pointer array. Fixes #12469 Test Plan: rebuilt Stackage nightly twice on powerpc64le Reviewers: hvr, rrnewton, erikd, austin, simonmar, bgamari Reviewed By: erikd, bgamari Subscribers: thomie Differential Revision: https://phabricator.haskell.org/D2525 GHC Trac Issues: #12469
* StgCmmPrim: Add missing MO_WriteBarrierBen Gamari2016-08-311-2/+5
| | | | | | | | | | | | | | Test Plan: Good question Reviewers: austin, trommler, simonmar, rrnewton Reviewed By: simonmar Subscribers: RyanGlScott, thomie Differential Revision: https://phabricator.haskell.org/D2495 GHC Trac Issues: #12469
* Remove StgRubbishArg and CmmArgÖmer Sinan Ağacan2016-08-101-13/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The idea behind adding special "rubbish" arguments was in unboxed sum types depending on the tag some arguments are not used and we don't want to move some special values (like 0 for literals and some special pointer for boxed slots) for those arguments (to stack locations or registers). "StgRubbishArg" was an indicator to the code generator that the value won't be used. During Stg-to-Cmm we were then not generating any move or store instructions at all. This caused problems in the register allocator because some variables were only initialized in some code paths. As an example, suppose we have this STG: (after unarise) Lib.$WT = \r [dt_sit] case case dt_sit of { Lib.F dt_siv [Occ=Once] -> (#,,#) [1# dt_siv StgRubbishArg::GHC.Prim.Int#]; Lib.I dt_siw [Occ=Once] -> (#,,#) [2# StgRubbishArg::GHC.Types.Any dt_siw]; } of dt_six { (#,,#) us_giC us_giD us_giE -> Lib.T [us_giC us_giD us_giE]; }; This basically unpacks a sum type to an unboxed sum with 3 fields, and then moves the unboxed sum to a constructor (`Lib.T`). This is the Cmm for the inner case expression (case expression in the scrutinee position of the outer case): ciN: ... -- look at dt_sit's tag if (_ciT::P64 != 1) goto ciS; else goto ciR; ciS: -- Tag is 2, i.e. Lib.F _siw::I64 = I64[_siu::P64 + 6]; _giE::I64 = _siw::I64; _giD::P64 = stg_RUBBISH_ENTRY_info; _giC::I64 = 2; goto ciU; ciR: -- Tag is 1, i.e. Lib.I _siv::P64 = P64[_siu::P64 + 7]; _giD::P64 = _siv::P64; _giC::I64 = 1; goto ciU; Here one of the blocks `ciS` and `ciR` is executed and then the execution continues to `ciR`, but only `ciS` initializes `_giE`, in the other branch `_giE` is not initialized, because it's "rubbish" in the STG and so we don't generate an assignment during code generator. The code generator then panics during the register allocations: ghc-stage1: panic! (the 'impossible' happened) (GHC version 8.1.20160722 for x86_64-unknown-linux): LocalReg's live-in to graph ciY {_giE::I64} (`_giD` is also "rubbish" in `ciS`, but it's still initialized because it's a pointer slot, we have to initialize it otherwise garbage collector follows the pointer to some random place. So we only remove assignment if the "rubbish" arg has unboxed type.) This patch removes `StgRubbishArg` and `CmmArg`. We now always initialize rubbish slots. If the slot is for boxed types we use the existing `absentError`, otherwise we initialize the slot with literal 0. Reviewers: simonpj, erikd, austin, simonmar, bgamari Reviewed By: erikd Subscribers: thomie Differential Revision: https://phabricator.haskell.org/D2446
* Implement unboxed sum primitive typeÖmer Sinan Ağacan2016-07-211-12/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This patch implements primitive unboxed sum types, as described in https://ghc.haskell.org/trac/ghc/wiki/UnpackedSumTypes. Main changes are: - Add new syntax for unboxed sums types, terms and patterns. Hidden behind `-XUnboxedSums`. - Add unlifted unboxed sum type constructors and data constructors, extend type and pattern checkers and desugarer. - Add new RuntimeRep for unboxed sums. - Extend unarise pass to translate unboxed sums to unboxed tuples right before code generation. - Add `StgRubbishArg` to `StgArg`, and a new type `CmmArg` for better code generation when sum values are involved. - Add user manual section for unboxed sums. Some other changes: - Generalize `UbxTupleRep` to `MultiRep` and `UbxTupAlt` to `MultiValAlt` to be able to use those with both sums and tuples. - Don't use `tyConPrimRep` in `isVoidTy`: `tyConPrimRep` is really wrong, given an `Any` `TyCon`, there's no way to tell what its kind is, but `kindPrimRep` and in turn `tyConPrimRep` returns `PtrRep`. - Fix some bugs on the way: #12375. Not included in this patch: - Update Haddock for new the new unboxed sum syntax. - `TemplateHaskell` support is left as future work. For reviewers: - Front-end code is mostly trivial and adapted from unboxed tuple code for type checking, pattern checking, renaming, desugaring etc. - Main translation routines are in `RepType` and `UnariseStg`. Documentation in `UnariseStg` should be enough for understanding what's going on. Credits: - Johan Tibell wrote the initial front-end and interface file extensions. - Simon Peyton Jones reviewed this patch many times, wrote some code, and helped with debugging. Reviewers: bgamari, alanz, goldfire, RyanGlScott, simonpj, austin, simonmar, hvr, erikd Reviewed By: simonpj Subscribers: Iceland_jack, ggreif, ezyang, RyanGlScott, goldfire, thomie, mpickering Differential Revision: https://phabricator.haskell.org/D2259
* Compact RegionsGiovanni Campagna2016-07-201-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | This brings in initial support for compact regions, as described in the ICFP 2015 paper "Efficient Communication and Collection with Compact Normal Forms" (Edward Z. Yang et.al.) and implemented by Giovanni Campagna. Some things may change before the 8.2 release, but I (Simon M.) wanted to get the main patch committed so that we can iterate. What documentation there is is in the Data.Compact module in the new compact package. We'll need to extend and polish the documentation before the release. Test Plan: validate (new test cases included) Reviewers: ezyang, simonmar, hvr, bgamari, austin Subscribers: vikraman, Yuras, RyanGlScott, qnikst, mboes, facundominguez, rrnewton, thomie, erikd Differential Revision: https://phabricator.haskell.org/D1264 GHC Trac Issues: #11493
* Drop pre-AMP compatibility CPP conditionalsHerbert Valerio Riedel2015-12-311-2/+0
| | | | | | | | | | | | Since GHC 8.1/8.2 only needs to be bootstrap-able by GHC 7.10 and GHC 8.0 (and GHC 8.2), we can now finally drop all that pre-AMP compatibility CPP-mess for good! Reviewers: austin, goldfire, bgamari Subscribers: goldfire, thomie, erikd Differential Revision: https://phabricator.haskell.org/D1724
* Add subWordC# on x86ishNikita Karetnikov2015-10-311-0/+17
| | | | | | | | | | | | | | | This adds a subWordC# primop which implements subtraction with overflow reporting. Reviewers: tibbe, goldfire, rwbarton, bgamari, austin, hvr Reviewed By: bgamari Subscribers: thomie Differential Revision: https://phabricator.haskell.org/D1334 GHC Trac Issues: #10962
* s/StgArrWords/StgArrBytes/Siddhanathan Shanmugam2015-09-111-4/+4
| | | | | | | | | | Rename StgArrWords to StgArrBytes (see Trac #8552) Reviewed By: austin Differential Revision: https://phabricator.haskell.org/D1233 GHC Trac Issues: #8552
* Fix trac #10413Ben Gamari2015-09-021-2/+5
| | | | | | | | | | | | | | Test Plan: Validate. Reviewers: austin, tibbe, bgamari Reviewed By: tibbe, bgamari Subscribers: thomie Differential Revision: https://phabricator.haskell.org/D1194 GHC Trac Issues: #10413
* Implement getSizeofMutableByteArrayOp primopBen Gamari2015-08-211-0/+5
| | | | | | | | | | | | | | | Now since ByteArrays are mutable we need to be more explicit about when the size is queried. Test Plan: Add testcase and validate Reviewers: goldfire, hvr, austin Subscribers: thomie Differential Revision: https://phabricator.haskell.org/D1139 GHC Trac Issues: #9447
* Support MO_U_QuotRem2 in LLVM backendMichal Terepeta2015-08-031-1/+2
| | | | | | | | | | | | | | | | | | | This adds support for MO_U_QuotRem2 in LLVM backend. Similarly to MO_U_Mul2 we use the standard LLVM instructions (in this case 'udiv' and 'urem') but do the computation on double the word width (e.g., for 64-bit we will do them on 128 registers). Test Plan: validate Reviewers: rwbarton, austin, bgamari Reviewed By: bgamari Subscribers: thomie Differential Revision: https://phabricator.haskell.org/D1100 GHC Trac Issues: #9430
* LlvmCodeGen: add support for MO_U_Mul2 CallishMachOpMichal Terepeta2015-07-201-1/+2
| | | | | | | | | | | | | | | | | | | This adds support MO_U_Mul2 to the LLVM backend by simply using 'mul' instruction but operating at twice the bit width (e.g., for 64 bit words we will generate mul that operates on 128 bits and then extract the two 64 bit values for the result of the CallishMachOp). Test Plan: validate Reviewers: rwbarton, austin, bgamari Reviewed By: bgamari Subscribers: thomie Differential Revision: https://phabricator.haskell.org/D1068 GHC Trac Issues: #9430
* Support MO_{Add,Sub}IntC and MO_Add2 in the LLVM backendMichal Terepeta2015-07-041-4/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | This includes: - Adding new LlvmType called LMStructP that represents an unpacked struct (this is necessary since LLVM's instructions the llvm.sadd.with.overflow.* return an unpacked struct). - Modifications to LlvmCodeGen.CodeGen to generate the LLVM instructions for the primops. - Modifications to StgCmmPrim to actually use those three instructions if we use the LLVM backend (so far they were only used for NCG). Test Plan: validate Reviewers: austin, rwbarton, bgamari Reviewed By: bgamari Subscribers: thomie, bgamari Differential Revision: https://phabricator.haskell.org/D991 GHC Trac Issues: #9430
* Encode alignment in MO_Memcpy and friendsBen Gamari2015-06-161-25/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Alignment needs to be a compile-time constant. Previously the code generators had to jump through hoops to ensure this was the case as the alignment was passed as a CmmExpr in the arguments list. Now we take care of this up front. This fixes #8131. Authored-by: Reid Barton <rwbarton@gmail.com> Dusted-off-by: Ben Gamari <ben@smart-cactus.org> Tests for T8131 Test Plan: Validate Reviewers: rwbarton, austin Reviewed By: rwbarton, austin Subscribers: bgamari, carter, thomie Differential Revision: https://phabricator.haskell.org/D624 GHC Trac Issues: #8131
* Changing prefetch primops to have a `seq`-like interfaceCarter Tazio Schonwald2014-12-151-29/+51
| | | | | | | | | | | | | | | | | | | | | | Summary: The current primops for prefetching do not properly work in pure code; namely, the primops are not 'hoisted' into the correct call sites based on when arguments are evaluated. Instead, they should use a `seq`-like interface, which will cause it to be evaluated when the needed term is. See #9353 for the full discussion. Test Plan: updated tests for pure prefetch in T8256 to reflect the design changes in #9353 Reviewers: simonmar, hvr, ekmett, austin Reviewed By: ekmett, austin Subscribers: merijn, thomie, carter, simonmar Differential Revision: https://phabricator.haskell.org/D350 GHC Trac Issues: #9353
* Make Applicative a superclass of MonadAustin Seipp2014-09-091-0/+4
| | | | | | | | | | | | | | | | | | | | | Summary: This includes pretty much all the changes needed to make `Applicative` a superclass of `Monad` finally. There's mostly reshuffling in the interests of avoid orphans and boot files, but luckily we can resolve all of them, pretty much. The only catch was that Alternative/MonadPlus also had to go into Prelude to avoid this. As a result, we must update the hsc2hs and haddock submodules. Signed-off-by: Austin Seipp <austin@well-typed.com> Test Plan: Build things, they might not explode horribly. Reviewers: hvr, simonmar Subscribers: simonmar Differential Revision: https://phabricator.haskell.org/D13
* Add MO_AddIntC, MO_SubIntC MachOps and implement in X86 backendReid Barton2014-08-231-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | Summary: These MachOps are used by addIntC# and subIntC#, which in turn are used in integer-gmp when adding or subtracting small Integers. The following benchmark shows a ~6% speedup after this commit on x86_64 (building GHC with BuildFlavour=perf). {-# LANGUAGE MagicHash #-} import GHC.Exts import Criterion.Main count :: Int -> Integer count (I# n#) = go n# 0 where go :: Int# -> Integer -> Integer go 0# acc = acc go n# acc = go (n# -# 1#) $! acc + 1 main = defaultMain [bgroup "count" [bench "100" $ whnf count 100]] Differential Revision: https://phabricator.haskell.org/D140
* Implement new CLZ and CTZ primops (re #9340)Herbert Valerio Riedel2014-08-141-0/+28
| | | | | | | | | | | | | | | | | | | | | | | | This implements the new primops clz#, clz32#, clz64#, ctz#, ctz32#, ctz64# which provide efficient implementations of the popular count-leading-zero and count-trailing-zero respectively (see testcase for a pure Haskell reference implementation). On x86, NCG as well as LLVM generates code based on the BSF/BSR instructions (which need extra logic to make the 0-case well-defined). Test Plan: validate and succesful tests on i686 and amd64 Reviewers: rwbarton, simonmar, ezyang, austin Subscribers: simonmar, relrod, ezyang, carter Differential Revision: https://phabricator.haskell.org/D144 GHC Trac Issues: #9340
* StgCmmPrim: add note to stop using fixed size signed types for sizesJohan Tibell2014-08-121-0/+5
| | | | | We use fixed size signed types to e.g. represent array sizes. This means that the size can overflow.
* shouldInlinePrimOp: Fix Int overflowJohan Tibell2014-08-121-22/+38
| | | | | | | | | | | | | | | | There were two overflow issues in shouldInlinePrimOp. The first one is due to a negative CmmInt literal being created if the array size was given as larger than 2^63-1 (on a 64-bit platform.) This meant that large array sizes could compare as being smaller than maxInlineAllocSize. The second issue is that we casted the Integer to an Int in the comparison, which again meant that large array sizes could compare as being smaller than maxInlineAllocSize. The attempt to allocate a large array inline then caused a segfault. Fixes #9416.
* Make IntAddCOp, IntSubCOp into GenericOpsReid Barton2014-08-101-57/+65
| | | | | | | | | ... in preparation for backend-specific implementations. No functional changes in this commit (except in panic messages for ill-formed Cmm). Differential Revision: https://phabricator.haskell.org/D138
* Re-add more primops for atomic ops on byte arraysJohan Tibell2014-06-301-0/+94
| | | | | | | | | | | | | | | | | | | | | | | This is the second attempt to add this functionality. The first attempt was reverted in 950fcae46a82569e7cd1fba1637a23b419e00ecd, due to register allocator failure on x86. Given how the register allocator currently works, we don't have enough registers on x86 to support cmpxchg using complicated addressing modes. Instead we fall back to a simpler addressing mode on x86. Adds the following primops: * atomicReadIntArray# * atomicWriteIntArray# * fetchSubIntArray# * fetchOrIntArray# * fetchXorIntArray# * fetchAndIntArray# Makes these pre-existing out-of-line primops inline: * fetchAddIntArray# * casIntArray#