| Commit message (Collapse) | Author | Age | Files | Lines |
| |
We occasionally need to reserve some temporary memory in a primop for
passing to a foreign function. We've been using the stack for this,
but when we moved to high-level Cmm it became quite fragile because
primops are in high-level Cmm and the stack is supposed to be under
the control of the Cmm pipeline.
So this change puts things on a firmer footing by adding a new Cmm
construct 'reserve'. e.g. in decodeFloat_Int#:
    reserve 2 = tmp {
        mp_tmp1  = tmp + WDS(1);
        mp_tmp_w = tmp;
        /* Perform the operation */
        ccall __decodeFloat_Int(mp_tmp1 "ptr", mp_tmp_w "ptr", arg);
        r1 = W_[mp_tmp1];
        r2 = W_[mp_tmp_w];
    }
reserve is described in CmmParse.y.
Unfortunately the argument to reserve must be a compile-time constant.
We might have to extend the parser to allow expressions with
arithmetic operators if this is too restrictive.
Note also that the return instruction for the procedure must be
outside the scope of the reserved stack area, so we have to extract
the values from the reserved area before we close the scope. This
means some more local variables (r1, r2 in the example above). The
generated code is more or less identical to what we had before though.
|
This bug only shows up when you are using proc-point splitting.
What was happening was:
* We generate a proc-point for the stack check
* And an info table
* We eliminate the stack check because it's redundant
* And the dangling info table caused a panic in
CmmBuildInfoTables.bundle
|
Signed-off-by: Herbert Valerio Riedel <hvr@gnu.org>
|
This reverts commit 2f5db98e90cf0cff1a11971c85f108a7480528ed.
|
Inlining global registers and constants made code slightly larger in
some cases. I finally got around to looking into why, and discovered
one reason: we weren't discarding dead code in some cases. This patch
fixes it.
|
Fixes #8456
|
Fixes #8456. The previous version of the control flow optimisations
did not update the list of block predecessors, leading to
unnecessary duplication of blocks in some cases. See Trac
and comments in the code for more details.
|
The only substantive change here is to change "==" into ">=" in
the Note [Always false stack check] code. This is semantically
correct, but won't have any practical impact.
|
Fix a bug introduced in 94125c97e49987e91fa54da6c86bc6d17417f5cf.
See Note [Always false stack check]
|
I am removing old loopification code that has been commented out
for a long, long time. We now have loopification implemented in
the code generator (see Note [Self-recursive tail calls]) so we
won't need to resurrect this old code.
|
When compiling a function we can determine how much stack space it will
use. We therefore need to perform only a single stack check at the beginning
of a function to see if we have enough stack space. Instead of referring
directly to Sp, as we used to do in the past, the code generator uses
(old + 0) in the stack check. The stack layout phase turns (old + 0) into Sp.
The idea here is that, while we need to perform only one stack check for
each function, we could in theory place more stack checks later in the
function. They would be redundant, but not incorrect (in the sense that they
should not change program behaviour). We need to make sure, however, that a
stack check inserted after incrementing the stack pointer checks for a
correspondingly smaller amount of stack space. This would not be the case if
the code generator produced direct references to Sp. By referencing (old + 0)
we make sure that we always check for the correct amount of stack: when
converting (old + 0) to Sp the stack layout phase takes into account changes
already made to the stack pointer. The idea for this change came from
observations made while debugging #8275.
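The arithmetic behind this can be sketched with a small Python model. It is an illustrative toy, not GHC's stack layout code: the names `lower_check` and `overflows` and all addresses are hypothetical, and it assumes a downward-growing stack in which a check fails when Sp minus the requested space falls below a limit.

```python
def lower_check(frame_bytes, sp_moved_bytes):
    """Bytes a check must still ask for once (old + 0) is rewritten to Sp.

    Assumes the stack grows downwards, so after Sp has been moved by
    sp_moved_bytes we have (old + 0) == Sp + sp_moved_bytes.
    """
    return frame_bytes - sp_moved_bytes


def overflows(sp, sp_lim, needed_bytes):
    """Toy overflow test: not enough room between Sp and the stack limit."""
    return sp - needed_bytes < sp_lim


OLD, SP_LIM, FRAME = 1000, 900, 96  # made-up addresses and frame size

# Check at function entry: Sp has not moved yet.
at_entry = overflows(OLD, SP_LIM, lower_check(FRAME, 0))

# Redundant check placed after Sp has already been moved by 32 bytes.
later = overflows(OLD - 32, SP_LIM, lower_check(FRAME, 32))

# What a direct reference to Sp would have asked for at that later point.
naive_later = overflows(OLD - 32, SP_LIM, FRAME)
```

In this model the entry check and the later check agree, while the check written directly against Sp asks for the full frame a second time and reports an overflow that is not there.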
|
Signed-off-by: Erik de Castro Lopo <erikd@mega-nerd.com>
|
This way CPP conditionals can be avoided for the transition period.
Signed-off-by: Herbert Valerio Riedel <hvr@gnu.org>
|
This patch adds support for several new primitive operations which
support using processor-specific instructions to help guide data and
cache locality decisions. Locality levels range from 0 to 3.
For LLVM, we generate llvm.prefetch intrinsics at the proper locality
level (similar to GCC.)
For x86 we generate prefetch{NTA, t2, t1, t0} instructions. On SPARC and
PowerPC, the locality levels are ignored.
This closes #8256.
Authored-by: Carter Tazio Schonwald <carter.schonwald@gmail.com>
Signed-off-by: Austin Seipp <austin@well-typed.com>
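As a rough illustration of the x86 case, the mapping from locality level to instruction can be written as a lookup table. This sketch assumes that the mnemonics in prefetch{NTA, t2, t1, t0} are listed in level order 0 to 3; the table and function names are hypothetical and are not taken from the code generator.

```python
# Locality level -> x86 mnemonic, assuming the order given in the message.
X86_PREFETCH = {
    0: "prefetchnta",
    1: "prefetcht2",
    2: "prefetcht1",
    3: "prefetcht0",
}


def x86_prefetch_mnemonic(level):
    """Return the x86 prefetch mnemonic for a locality level in 0..3."""
    if level not in X86_PREFETCH:
        raise ValueError(f"locality level must be in 0..3, got {level}")
    return X86_PREFETCH[level]
```

A backend that ignores the level, as described for SPARC and PowerPC, would simply not consult such a table.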
|
On 32-bit platforms, the bitmap should be an array of
32-bit words, not Word64s.
Signed-off-by: Austin Seipp <austin@well-typed.com>
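To see why the word size matters, here is a minimal Python sketch of packing a list of bits into platform-sized words. The function name `pack_bitmap` and the bit ordering (entry i goes to bit i of its word) are assumptions made for the example, not a description of the actual bitmap layout.

```python
def pack_bitmap(bits, word_bits):
    """Pack a list of booleans into words of word_bits bits each."""
    words = []
    for start in range(0, len(bits), word_bits):
        word = 0
        for i, bit in enumerate(bits[start:start + word_bits]):
            if bit:
                word |= 1 << i  # entry i of this chunk sets bit i
        words.append(word)
    return words


bits = [True] * 33  # 33 set entries: one more than fits in a 32-bit word

words32 = pack_bitmap(bits, 32)  # two 32-bit words
words64 = pack_bitmap(bits, 64)  # one 64-bit word
```

The same 33 entries produce two words when packed for a 32-bit platform and a single word when packed as a Word64, so the two encodings are not interchangeable.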
|
LLVM's GHC calling convention only allows 128-bit SIMD vectors to be passed in
machine registers on X86-64. This may change in LLVM 3.4; the hidden flag
-fllvm-pass-vectors-in-regs causes all SIMD vector widths to be passed in
registers on both X86-64 and on X86-32.
|
width and element type.
SIMD primops are now polymorphic in vector size and element type, but
only internally to the compiler. More specifically, utils/genprimopcode
has been extended so that it "knows" about SIMD vectors. This allows us
to, for example, write a single definition for the "add two vectors"
primop in primops.txt.pp and have it instantiated at many vector types.
This generates a primop in GHC.Prim for each vector type at which "add
two vectors" is instantiated, but only one data constructor for the
PrimOp data type, so the code generator is much, much simpler.
|
On x86-32, the C calling convention specifies that when SSE2 is enabled, vector
arguments are passed in xmm* registers; however, float and double arguments are
still passed on the stack. This patch allows us to make the same choice for
GHC. Even when SSE2 is enabled, we don't want to pass Float and Double arguments
in registers because this would change the ABI and break the ability to link
with code that was compiled without -msse2.
The next patch will enable passing vector arguments in xmm registers on x86-32.
|
This makes it consistent with the corresponding -cmm-sink flag
|
This commit does two things:
* Allows duplicating of global registers and literals by inlining
them. Previously we would inline a global register or literal only
if it was used exactly once.
* Changes the method of determining conflicts between a node and an
assignment. The new method has two advantages. It relies on the
DefinerOfRegs and UserOfRegs typeclasses, so if the set of registers
defined or used by a node should ever change, the `conflicts` function
will use the changed definition. The new definition also catches
more cases than the previous one (namely CmmCall and CmmForeignCall),
which is a step towards making it possible to run the sinking pass
before stack layout (currently this doesn't work).
This patch also adds a lot of comments that are the result of a roughly
two-week-long investigation of how the sinking pass works and why it does
what it does.
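The register part of such a conflict test can be sketched in Python. This is a simplified model under stated assumptions: the `Node` and `Assignment` classes are hypothetical stand-ins for the information that DefinerOfRegs and UserOfRegs provide, and memory (stack and heap) conflicts are left out entirely.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    defs: frozenset  # registers the node writes
    uses: frozenset  # registers the node reads


@dataclass(frozen=True)
class Assignment:
    reg: str             # register being assigned
    rhs_uses: frozenset  # registers read by the right-hand side


def conflicts(node, assignment):
    """True if the assignment cannot be moved past the node (registers only)."""
    # The node overwrites a register that the right-hand side reads.
    if node.defs & assignment.rhs_uses:
        return True
    # The node reads the register that the assignment defines.
    if assignment.reg in node.uses:
        return True
    return False
```

Because the test is phrased purely in terms of "defined" and "used" sets, any node type that reports those sets (a call, for instance) is handled without a special case.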
|
Authored-by: David Luposchainsky <dluposchainsky@gmail.com>
Signed-off-by: Austin Seipp <austin@well-typed.com>
|
On some architectures the stack layout pass may
invalidate the list of calculated proc-points by dropping some of
them. We fix this by checking whether each proc-point is still in the
graph at the beginning of proc-point analysis. This is a speculative
fix for #8205.
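The check amounts to filtering the proc-point set against the blocks that still exist. A minimal Python sketch, assuming the graph is represented as a dictionary from block label to successor labels (the representation and the name `prune_proc_points` are hypothetical):

```python
def prune_proc_points(proc_points, graph):
    """Keep only the proc-points whose label is still a block in the graph."""
    return {label for label in proc_points if label in graph}


# Block "c3" was dropped by an earlier pass but is still listed as a proc-point.
graph = {"c1": ["c2"], "c2": []}
stale = {"c1", "c3"}
valid = prune_proc_points(stale, graph)
```

Running the filter before the analysis means later stages never look up a label that no longer has a block.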
|
This patch encompasses most of the basic infrastructure for GHCJS. It
includes:
* A new extension, -XJavaScriptFFI
* A new architecture, ArchJavaScript
* Parser and lexer support for 'foreign import javascript', only
available under -XJavaScriptFFI, using ArchJavaScript.
* As a knock-on, there is also a new 'WayCustom' constructor in
DynFlags, so clients of the GHC API can add custom 'tags' to their
built files. This should be useful for other users as well.
The remaining changes are really just the resulting fallout, making sure
all the cases are handled appropriately for DynFlags and Platform.
Authored-by: Luite Stegeman <stegeman@gmail.com>
Signed-off-by: Austin Seipp <aseipp@pobox.com>
|
And update comments
|