| Commit message | Author | Age | Files | Lines |
|
On Linux/i386 the 64-bit `__builtin_ctzll()` intrinsic doesn't get
inlined by GCC; instead, a call to the short `__ctzdi2` runtime function is
inserted into the compiled object files where needed.
This causes failures for the four test-cases
TEST="T8639_api T8628 dynCompileExpr T5313"
with error messages of the kind
dynCompileExpr: .../libraries/ghc-prim/dist-install/build/libHSghcpr_BE58KUgBe9ELCsPXiJ1Q2r.a: unknown symbol `__ctzdi2'
dynCompileExpr: dynCompileExpr: unable to load package `ghc-prim'
This workaround forces GCC on 32-bit x86 to express `hs_ctz64` in
terms of the 32-bit `__builtin_ctz()` (this is no loss, as there's no
64-bit BSF instruction on i686 anyway) and thus avoids the problematic
out-of-line runtime function.
Note: `__builtin_ctzll()` is used since
e0c1767d0ea8d12e0a4badf43682a08784e379c6 (re #9340)
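A minimal sketch of the idea behind the workaround, not the actual ghc-prim source: the 64-bit count is assembled from two 32-bit `__builtin_ctz()` calls, so no `__ctzdi2` reference should be needed. The function name `sketch_ctz64` is hypothetical, and returning 64 for a zero input is an assumption made here so that the 0-case is well-defined.

```c
#include <stdint.h>

/* Hypothetical stand-in for hs_ctz64; assumes GCC or a compatible compiler. */
unsigned sketch_ctz64(uint64_t x)
{
    uint32_t lo = (uint32_t)x;          /* low 32 bits  */
    uint32_t hi = (uint32_t)(x >> 32);  /* high 32 bits */

    if (lo != 0)
        return (unsigned)__builtin_ctz(lo);        /* answer lies in the low half */
    if (hi != 0)
        return 32u + (unsigned)__builtin_ctz(hi);  /* low half is all zeros */
    return 64u;  /* assumption: zero input yields the bit width */
}
```

Each branch only ever passes a non-zero value to `__builtin_ctz()`, whose result is undefined for 0.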
|
This implements the new primops
clz#, clz32#, clz64#,
ctz#, ctz32#, ctz64#
which provide efficient implementations of the popular
count-leading-zeros and count-trailing-zeros operations, respectively
(see testcase for a pure Haskell reference implementation).
On x86, both the NCG and LLVM generate code based on the BSF/BSR
instructions (which need extra logic to make the 0-case well-defined).
Test Plan: validate and successful tests on i686 and amd64
Reviewers: rwbarton, simonmar, ezyang, austin
Subscribers: simonmar, relrod, ezyang, carter
Differential Revision: https://phabricator.haskell.org/D144
GHC Trac Issues: #9340
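For illustration, a bit-by-bit reference for the 32-bit variants can be sketched in C. This is not the Haskell reference implementation from the testcase. The names `ref_clz32` and `ref_ctz32` are hypothetical, and returning 32 for a zero input is an assumption about what "well-defined" means here.

```c
#include <stdint.h>

/* Count leading zeros by scanning from the most significant bit down. */
unsigned ref_clz32(uint32_t x)
{
    unsigned n = 0;
    for (uint32_t mask = UINT32_C(1) << 31; mask != 0 && (x & mask) == 0; mask >>= 1)
        n++;
    return n;  /* 32 when x == 0 (assumed convention) */
}

/* Count trailing zeros by scanning from the least significant bit up. */
unsigned ref_ctz32(uint32_t x)
{
    unsigned n = 0;
    for (uint32_t mask = 1; mask != 0 && (x & mask) == 0; mask <<= 1)
        n++;
    return n;  /* 32 when x == 0 (assumed convention) */
}
```

A slow reference like this is mainly useful for cross-checking a faster BSF/BSR-based implementation, including at the zero input where those instructions leave the result undefined.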
|
This is the second attempt to add this functionality. The first
attempt was reverted in 950fcae46a82569e7cd1fba1637a23b419e00ecd, due
to register allocator failure on x86. Given how the register
allocator currently works, we don't have enough registers on x86 to
support cmpxchg using complicated addressing modes. Instead we fall
back to a simpler addressing mode on x86.
Adds the following primops:
* atomicReadIntArray#
* atomicWriteIntArray#
* fetchSubIntArray#
* fetchOrIntArray#
* fetchXorIntArray#
* fetchAndIntArray#
Makes these pre-existing out-of-line primops inline:
* fetchAddIntArray#
* casIntArray#
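The operations listed above can be approximated in C with the GCC/Clang `__atomic` builtins. This is a sketch of the semantics only, not of the code GHC's backends emit. All `sketch_*` names are hypothetical, `intptr_t` stands in for a machine-word `Int#` element, and sequentially consistent ordering is an assumption.

```c
#include <stddef.h>
#include <stdint.h>

/* Atomic load and store of one word-sized array element. */
intptr_t sketch_atomic_read(intptr_t *arr, size_t i)
{
    return __atomic_load_n(&arr[i], __ATOMIC_SEQ_CST);
}

void sketch_atomic_write(intptr_t *arr, size_t i, intptr_t v)
{
    __atomic_store_n(&arr[i], v, __ATOMIC_SEQ_CST);
}

/* Fetch-and-modify: the builtins return the value held before the update. */
intptr_t sketch_fetch_sub(intptr_t *arr, size_t i, intptr_t v)
{
    return __atomic_fetch_sub(&arr[i], v, __ATOMIC_SEQ_CST);
}

intptr_t sketch_fetch_xor(intptr_t *arr, size_t i, intptr_t v)
{
    return __atomic_fetch_xor(&arr[i], v, __ATOMIC_SEQ_CST);
}

/* Compare-and-swap: returns the old value whether or not the swap happened. */
intptr_t sketch_cas(intptr_t *arr, size_t i, intptr_t expected, intptr_t desired)
{
    __atomic_compare_exchange_n(&arr[i], &expected, desired, 0,
                                __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
    return expected;  /* on failure the builtin stores the current value here */
}
```

The or and and variants would follow the same pattern with `__atomic_fetch_or` and `__atomic_fetch_and`.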
|
This commit caused the register allocator to fail on i386.
This reverts commit d8abf85f8ca176854e9d5d0b12371c4bc402aac3 and
04dd7cb3423f1940242fdfe2ea2e3b8abd68a177 (the second being a fix to
the first).
|
clang chose to not implement this function. See
http://llvm.org/bugs/show_bug.cgi?id=8842
|
Summary:
Add more primops for atomic ops on byte arrays
Adds the following primops:
* atomicReadIntArray#
* atomicWriteIntArray#
* fetchSubIntArray#
* fetchOrIntArray#
* fetchXorIntArray#
* fetchAndIntArray#
Makes these pre-existing out-of-line primops inline:
* fetchAddIntArray#
* casIntArray#
|
On 64-bit Mac OS, gcc 4.2 (which comes with Xcode 4.6) generates code
that assumes that an argument that is smaller than the register
it is passed in has been sign- or zero-extended. But ghc thinks
the types of the PopCnt*Op primops are Word# -> Word#, so it passes
the entire argument word to the hs_popcnt* function as though it was
declared to have an argument of type StgWord. Segfaults ensue.
The easiest fix is to sidestep all this zero-extension business
by declaring the hs_popcnt* functions to take a whole StgWord (when their
argument would fit in a register), thereby matching the list of primops.
Fixes #7684.
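A sketch of the approach described above, assuming the narrow variants simply mask off the bits they care about. The type `word_t` is a hypothetical stand-in for `StgWord`, the `sketch_*` names are illustrative, and the bit-clearing loop is a swapped-in technique rather than the implementation ghc-prim actually uses.

```c
#include <stdint.h>

typedef uintptr_t word_t;  /* hypothetical stand-in for StgWord */

/* Count set bits by clearing the lowest set bit until none remain. */
static word_t count_bits(word_t x)
{
    word_t n = 0;
    while (x != 0) {
        x &= x - 1;
        n++;
    }
    return n;
}

/* Each function takes a whole word and masks it itself, so it never
 * relies on the caller having zero- or sign-extended the argument. */
word_t sketch_popcnt8(word_t x)  { return count_bits(x & 0xFF); }
word_t sketch_popcnt16(word_t x) { return count_bits(x & 0xFFFF); }
```

Because the mask is applied inside the callee, garbage in the upper bits of the argument register cannot affect the result.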
|
In the previous patch I used Int64# as a return value for
comparison primops used on 32-bit machines for comparing Int64#
and Word64#. This obviously wasn't a good idea. This patch changes
the return type from an emulated Int64# to a native Int#.
|
For a detailed discussion of these changes please visit the wiki page:
http://hackage.haskell.org/trac/ghc/wiki/PrimBool
|
Patch from Vincent Hanquez
|
All the hs_popcntX functions should return an StgWord in accordance with
the primop types.
I'm not sure whether the argument should also become an StgWord
(except for the 64-bit versions): the primops have type Word# -> Word#,
but we have e.g. StgWord hs_popcnt8(StgWord8 x).
StgWord8 is an unsigned char (usually, at least), but I think arguments
are passed at least word-sized to C-functions, so it probably works.
For the moment it works and passes tests, I'll ask tibbe to be sure.
|
fallbacks are referred to by code generated by GHC.