| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
| |
Signed-off-by: Austin Seipp <austin@well-typed.com>
|
|
|
|
| |
Signed-off-by: Austin Seipp <austin@well-typed.com>
|
|
|
|
| |
Signed-off-by: Austin Seipp <austin@well-typed.com>
|
|
|
|
| |
Signed-off-by: Austin Seipp <austin@well-typed.com>
|
|
|
|
| |
Signed-off-by: Austin Seipp <austin@well-typed.com>
|
|
|
|
| |
Signed-off-by: Austin Seipp <austin@well-typed.com>
|
|
|
|
| |
Signed-off-by: Austin Seipp <austin@well-typed.com>
|
|
|
|
| |
Signed-off-by: Austin Seipp <austin@well-typed.com>
|
|
|
|
| |
Signed-off-by: Austin Seipp <austin@well-typed.com>
|
|
|
|
| |
Signed-off-by: Austin Seipp <austin@well-typed.com>
|
|
|
|
| |
Signed-off-by: Austin Seipp <austin@well-typed.com>
|
|
|
|
| |
Signed-off-by: Austin Seipp <austin@well-typed.com>
|
|
|
|
| |
Signed-off-by: Austin Seipp <austin@well-typed.com>
|
|
|
|
| |
Signed-off-by: Austin Seipp <austin@well-typed.com>
|
|
|
|
| |
Signed-off-by: Austin Seipp <austin@well-typed.com>
|
|
|
|
| |
Signed-off-by: Austin Seipp <austin@well-typed.com>
|
|
|
|
| |
Signed-off-by: Austin Seipp <austin@well-typed.com>
|
|
|
|
| |
Signed-off-by: Austin Seipp <austin@well-typed.com>
|
|
|
|
| |
Signed-off-by: Austin Seipp <austin@well-typed.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
PowerPC does not do delay slots and there is also no requirement
to put extra instructions between FP operations and branches.
Test Plan: None. Comment change only.
Reviewers: austin, simonmar
Reviewed By: austin, simonmar
Subscribers: simonmar, relrod, carter
Differential Revision: https://phabricator.haskell.org/D40
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is the second attempt to add this functionality. The first
attempt was reverted in 950fcae46a82569e7cd1fba1637a23b419e00ecd, due
to register allocator failure on x86. Given how the register
allocator currently works, we don't have enough registers on x86 to
support cmpxchg using complicated addressing modes. Instead we fall
back to a simpler addressing mode on x86.
Adds the following primops:
* atomicReadIntArray#
* atomicWriteIntArray#
* fetchSubIntArray#
* fetchOrIntArray#
* fetchXorIntArray#
* fetchAndIntArray#
Makes these pre-existing out-of-line primops inline:
* fetchAddIntArray#
* casIntArray#
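As a rough illustration of how these primops are called from Haskell, the sketch below uses three of them through `GHC.Exts`. The wrapper name `fetchAddDemo` and the fixed 8-byte allocation are assumptions made for the example, not part of the patch.

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}

import GHC.Exts
import GHC.IO (IO (..))

-- | Hypothetical helper: store 'start' in a fresh one-element array,
-- atomically add 'delta', and return
-- (value returned by the fetch-and-add, value read back afterwards).
fetchAddDemo :: Int -> Int -> IO (Int, Int)
fetchAddDemo (I# start) (I# delta) = IO (\s0 ->
  -- 8 bytes is enough for one Int on both 32-bit and 64-bit targets.
  case newByteArray# 8# s0 of
    (# s1, arr #) ->
      case atomicWriteIntArray# arr 0# start s1 of
        s2 ->
          case fetchAddIntArray# arr 0# delta s2 of
            (# s3, old #) ->
              case atomicReadIntArray# arr 0# s3 of
                (# s4, new #) -> (# s4, (I# old, I# new) #))
```

In GHCi, `fetchAddDemo 40 2` should yield `(40,42)`, since `fetchAddIntArray#` is documented to return the element's value before the addition.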
|
|
|
|
|
|
|
|
| |
This commit caused the register allocator to fail on i386.
This reverts commit d8abf85f8ca176854e9d5d0b12371c4bc402aac3 and
04dd7cb3423f1940242fdfe2ea2e3b8abd68a177 (the second being a fix to
the first).
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Add more primops for atomic ops on byte arrays
Adds the following primops:
* atomicReadIntArray#
* atomicWriteIntArray#
* fetchSubIntArray#
* fetchOrIntArray#
* fetchXorIntArray#
* fetchAndIntArray#
Makes these pre-existing out-of-line primops inline:
* fetchAddIntArray#
* casIntArray#
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We now emit
movq %rdi,16(%r14,%rsi,8)
instead of
leaq 16(%r14),%rax
movq %rdi,(%rax,%rsi,8)
This helps e.g. byte array indexing.
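The two instruction sequences compute the same address. The following sketch checks that arithmetic in plain Haskell; the function names are hypothetical and registers are modelled as ordinary `Int` values.

```haskell
-- | Effective address of an x86 operand written disp(base,index,scale).
effectiveAddress :: Int -> Int -> Int -> Int -> Int
effectiveAddress disp base index scale = base + index * scale + disp

-- | Old sequence: leaq 16(%r14),%rax, then the operand (%rax,%rsi,8).
twoInstructionAddress :: Int -> Int -> Int
twoInstructionAddress r14 rsi =
  let rax = effectiveAddress 16 r14 0 1
  in  effectiveAddress 0 rax rsi 8

-- | New sequence: the operand 16(%r14,%rsi,8) folded into the store.
foldedAddress :: Int -> Int -> Int
foldedAddress r14 rsi = effectiveAddress 16 r14 rsi 8
```

For example, with `%r14 = 1000` and `%rsi = 3`, both functions give 1040.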
|
|
|
|
|
|
|
| |
A previous fix for this (f5879acd018494b84233f26fba828ce376d0f81d) was
wrong and left some unreachable code behind. So rather than try to be clever
and do this at the same time as the strongly-connected-component analysis,
I'm doing a separate reachability pass first.
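The two-pass idea can be sketched with `Data.Graph` from the containers package. This is not the register allocator's code: the `Block` type, the function names, and the example blocks are all assumptions made for illustration.

```haskell
import Data.Graph (SCC, flattenSCCs, graphFromEdges, reachable, stronglyConnComp)
import Data.Maybe (maybeToList)

-- | A hypothetical basic block in the (node, key, successors) shape that
-- Data.Graph expects: a name, a numeric label, and the labels it jumps to.
type Block = (String, Int, [Int])

-- | Pass 1: keep only the blocks reachable from the entry label.
dropUnreachable :: Int -> [Block] -> [Block]
dropUnreachable entry blocks =
  [ fromVertex v
  | start <- maybeToList (toVertex entry)
  , v <- reachable graph start
  ]
  where
    (graph, fromVertex, toVertex) = graphFromEdges blocks

-- | Pass 2: run the SCC analysis on the surviving blocks only.
sccsOfReachable :: Int -> [Block] -> [SCC String]
sccsOfReachable entry = stronglyConnComp . dropUnreachable entry

-- | Names of all blocks that survive both passes.
survivingBlocks :: Int -> [Block] -> [String]
survivingBlocks entry = flattenSCCs . sccsOfReachable entry

exampleBlocks :: [Block]
exampleBlocks =
  [ ("entry",  0, [1])
  , ("loop",   1, [1, 2])
  , ("exit",   2, [])
  , ("orphan", 3, [2])   -- nothing jumps here, so it is unreachable from 0
  ]
```

Here `survivingBlocks 0 exampleBlocks` contains `"entry"`, `"loop"`, and `"exit"` but not `"orphan"`, so the later analysis never sees the orphaned block.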
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In some cases, the layout of the LANGUAGE/OPTIONS_GHC lines has been
reorganized to follow this convention:
- Place `{-# LANGUAGE #-}` pragmas at the top of the source file, before
any `{-# OPTIONS_GHC #-}` lines.
- If the list of language extensions fits into a single
`{-# LANGUAGE ... #-}` line (shorter than 80 characters), keep it on one
line. Otherwise, split it into one `{-# LANGUAGE ... #-}` line per
language extension. In both cases, try to keep the
enumeration alphabetically ordered.
(The latter layout is preferable as it's more diff-friendly.)
While at it, this also replaces obsolete `{-# OPTIONS ... #-}` pragma
occurrences with `{-# OPTIONS_GHC ... #-}` pragmas.
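A small file header laid out according to this convention might look like the sketch below. The chosen extensions, the `-Wall` option, and the `sumStrict` function are only illustrative and are not taken from the patch.

```haskell
{-# LANGUAGE BangPatterns, ScopedTypeVariables #-}
{-# OPTIONS_GHC -Wall #-}

-- When the list no longer fits within 80 characters, the convention
-- switches to one pragma per extension, still in alphabetical order:
--
--   {-# LANGUAGE BangPatterns #-}
--   {-# LANGUAGE ScopedTypeVariables #-}

-- | Strict sum; the bang pattern and the scoped type variable are only
-- here so that both extensions are actually used.
sumStrict :: forall a. Num a => [a] -> a
sumStrict = go 0
  where
    go :: a -> [a] -> a
    go !acc []       = acc
    go !acc (x : xs) = go (acc + x) xs
```

The LANGUAGE pragma comes first and the OPTIONS_GHC pragma follows it, which is the ordering the first bullet asks for.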
|
|
|
|
| |
Signed-off-by: Austin Seipp <austin@well-typed.com>
|
|
|
|
|
|
|
| |
This checks that all the required extensions are enabled for the
inferred type signature.
Updates binary and vector submodules.
|
|
|
|
| |
Signed-off-by: Austin Seipp <austin@well-typed.com>
|
|
|
|
|
|
|
| |
This adds -fmax-inline-memcpy-insns and -fmax-inline-memset-insns.
These flags control when we inline calls to memcpy/memset with
statically known arguments. The flag naming style is taken from GCC
and the same limit is used by both GCC and LLVM.
|
|
|
|
| |
Loads should now handle offsets of up to 32 bits.
|
| |
|
| |
|
| |
|
|
|
|
| |
Signed-off-by: Austin Seipp <austin@well-typed.com>
|
| |
|
|
|
|
|
|
| |
This way CPP conditionals can be avoided for the transition period.
Signed-off-by: Herbert Valerio Riedel <hvr@gnu.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds support for several new primitive operations which
support using processor-specific instructions to help guide data and
cache locality decisions. Locality levels range over [0..3].
For LLVM, we generate llvm.prefetch intrinsics at the proper locality
level (similar to GCC.)
For x86 we generate prefetch{NTA, t2, t1, t0} instructions. On SPARC and
PowerPC, the locality levels are ignored.
This closes #8256.
Authored-by: Carter Tazio Schonwald <carter.schonwald@gmail.com>
Signed-off-by: Austin Seipp <austin@well-typed.com>
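The mapping from locality level to x86 instruction can be pictured as a small lookup. The sketch below assumes the list `prefetch{NTA, t2, t1, t0}` is meant positionally against levels 0 to 3, which matches how GCC's `__builtin_prefetch` locality argument is usually described. The function name is hypothetical.

```haskell
-- | Assumed mapping from a locality level (0..3) to the x86 prefetch
-- mnemonic, read positionally from "prefetch{NTA, t2, t1, t0}".
-- Levels outside 0..3 have no instruction in this sketch.
x86PrefetchMnemonic :: Int -> Maybe String
x86PrefetchMnemonic 0 = Just "prefetchnta"  -- least temporal locality
x86PrefetchMnemonic 1 = Just "prefetcht2"
x86PrefetchMnemonic 2 = Just "prefetcht1"
x86PrefetchMnemonic 3 = Just "prefetcht0"   -- most temporal locality
x86PrefetchMnemonic _ = Nothing
```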
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The problem with unreachable code is that it might refer to undefined
registers. This happens accidentally: a block can be orphaned by an
optimisation, for example when the result of a comparison becomes
known.
The register allocator panics when it finds an undefined register,
because they shouldn't occur in generated code. So we need to also
discard unreachable code to prevent this panic being triggered by
optimisations.
The register allocator already does a strongly-connected component
analysis, so it ought to be easy to make it discard unreachable code
as part of that traversal. It turns out that we need a different
variant of the SCC algorithm to do that (see Digraph). However, the new
variant also generates slightly better code by putting the blocks
within a loop in a better order for register allocation.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
width and element type.
SIMD primops are now polymorphic in vector size and element type, but
only internally to the compiler. More specifically, utils/genprimopcode
has been extended so that it "knows" about SIMD vectors. This allows us
to, for example, write a single definition for the "add two vectors"
primop in primops.txt.pp and have it instantiated at many vector types.
This generates a primop in GHC.Prim for each vector type at which "add
two vectors" is instantiated, but only one data constructor for the
PrimOp data type, so the code generator is much, much simpler.
|
|
|
|
|
| |
Authored-by: David Luposchainsky <dluposchainsky@gmail.com>
Signed-off-by: Austin Seipp <austin@well-typed.com>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch encompasses most of the basic infrastructure for GHCJS. It
includes:
* A new extension, -XJavaScriptFFI
* A new architecture, ArchJavaScript
* Parser and lexer support for 'foreign import javascript', only
available under -XJavaScriptFFI, using ArchJavaScript.
* As a knock-on, there is also a new 'WayCustom' constructor in
DynFlags, so clients of the GHC API can add custom 'tags' to their
built files. This should be useful for other users as well.
The remaining changes are really just the resulting fallout, making sure
all the cases are handled appropriately for DynFlags and Platform.
Authored-by: Luite Stegeman <stegeman@gmail.com>
Signed-off-by: Austin Seipp <aseipp@pobox.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Exposes bSwap{,16,32,64}# primops
* Add a new machop: MO_BSwap
* Use a Stg implementation (hs_bswap{16,32,64}) for the other
NCG backends.
* Generate bswap in X86 NCG for 32 and 64 bits, and for 16 bits, bswap+shr
instead of using xchg.
* Generate llvm.bswap intrinsics in llvm codegen.
Authored-by: Vincent Hanquez <tab@snarc.org>
Signed-off-by: Austin Seipp <aseipp@pobox.com>
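The 16-bit "bswap+shr" trick can be checked from Haskell with the `byteSwap16` and `byteSwap32` functions from `Data.Word` (available in base 4.7 and later). The sketch below is not the code generator's implementation; `swap16ViaSwap32` and `agreesWithByteSwap16` are hypothetical names.

```haskell
import Data.Bits (shiftR)
import Data.Word (Word16, byteSwap16, byteSwap32)

-- | 16-bit byte swap built from a 32-bit swap plus a right shift,
-- mirroring the "bswap+shr" sequence described for the x86 NCG.
swap16ViaSwap32 :: Word16 -> Word16
swap16ViaSwap32 w = fromIntegral (byteSwap32 (fromIntegral w) `shiftR` 16)

-- | True when the composed version agrees with the library's 16-bit swap.
agreesWithByteSwap16 :: Word16 -> Bool
agreesWithByteSwap16 w = swap16ViaSwap32 w == byteSwap16 w
```

For instance, `swap16ViaSwap32 0x1234` is `0x3412`: widening gives `0x00001234`, the 32-bit swap gives `0x34120000`, and shifting right by 16 leaves the swapped half-word.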
|
|
|
|
|
|
| |
Clang doesn't like whitespace between macro and arguments.
Signed-off-by: Austin Seipp <aseipp@pobox.com>
|
|
|
|
| |
This reverts commit 1c5b0511a89488f5280523569d45ee61c0d09ffa.
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Exposes bSwap{,16,32,64}# primops
* Add a new machop: MO_BSwap
* Use a Stg implementation (hs_bswap{16,32,64}) for the other
NCG backends.
* Generate bswap in X86 NCG for 32 and 64 bits, and for 16 bits, bswap+shr
instead of using xchg.
* Generate llvm.bswap intrinsics in llvm codegen.
Patch from Vincent Hanquez.
|
|
|
|
| |
It doesn't actually use it yet.
|