| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
| |
The main change here is that the Cmm parser now allows high-level cmm
code with argument-passing and function calls. For example:

    foo ( gcptr a, bits32 b )
    {
      if (b > 0) {
         // we can make tail calls passing arguments:
         jump stg_ap_0_fast(a);
      }
      return (x,y);
    }

More details on the new cmm syntax are in Note [Syntax of .cmm files]
in CmmParse.y.

The old syntax is still more-or-less supported for those occasional
code fragments that really need to explicitly manipulate the stack.
However there are a couple of differences: it is now obligatory to
give a list of live GlobalRegs on every jump, e.g.

    jump %ENTRY_CODE(Sp(0)) [R1];

Again, more details in Note [Syntax of .cmm files].

I have rewritten most of the .cmm files in the RTS into the new
syntax, except for AutoApply.cmm which is generated by the genapply
program: this file could be generated in the new syntax instead and
would probably be better off for it, but I ran out of enthusiasm.

Some other changes in this batch:

 - The PrimOp calling convention is gone; primops now use the ordinary
   NativeNodeCall convention. This means that primops and "foreign
   import prim" code must be written in high-level cmm, but they can
   now take more than 10 arguments.

 - CmmSink now does constant-folding (should fix #7219).

 - .cmm files now go through the cmmPipeline, and as a result we
   generate better code in many cases. All the object files generated
   for the RTS .cmm files are now smaller. Performance should be
   better too, but I haven't measured it yet.

 - RET_DYN frames are removed from the RTS, so lots of code goes away.

 - We now have some more canned GC points to cover unboxed-tuples with
   2-4 pointers, which will reduce code size a little.
| |
The current fix is relatively dumb as far as where to add HpLim
checks: it will always perform a check unless we know that we're
returning from a closure or we are doing a non let-no-escape case
analysis. The performance impact on the nofib suite looks like this:

            Min          +5.7%     -0.0%     -6.5%     -6.4%    -50.0%
            Max          +6.3%     +5.8%     +5.0%     +5.5%     +0.8%
 Geometric Mean          +6.2%     +0.1%     +0.5%     +0.5%     -0.8%

Overall, the executable bloat is the biggest problem, so we keep the old
omit-yields optimization on by default. Remember that if you need an
interruptibility guarantee, you need to recompile all of your libraries
with -fno-omit-yields.

A better fix would involve only inserting the yields necessary to break
loops; this is left as future work.

Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
| |
StgWord is a newtyped Word64, as it needed to be something that
has a UArray instance.
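
As a rough illustration of the point about UArray, the sketch below wraps a
Word64 in a newtype and unwraps it again to store values in an unboxed array.
The helper names toStgWord, fromStgWord and packWords are assumptions made for
this example only; the definitions actually used in the compiler may differ.

```haskell
import Data.Array.Unboxed (UArray, listArray, (!))
import Data.Word (Word64)

-- A newtype over Word64: distinct at the type level, same representation.
newtype StgWord = StgWord Word64
  deriving (Eq, Show)

-- Hypothetical helper: build an StgWord from an Integer (wraps modulo 2^64).
toStgWord :: Integer -> StgWord
toStgWord = StgWord . fromInteger

-- Hypothetical helper: recover the underlying Word64.
fromStgWord :: StgWord -> Word64
fromStgWord (StgWord w) = w

-- Word64 has an unboxed-array instance, so a list of StgWords can be
-- stored compactly by unwrapping each element first.
packWords :: [StgWord] -> UArray Int Word64
packWords ws = listArray (0, length ws - 1) (map fromStgWord ws)

main :: IO ()
main = do
  let arr = packWords [toStgWord 1, toStgWord 255]
  print (arr ! 1)
```

The design point is only that the wrapped type must be one for which an
unboxed array representation exists, which Word64 satisfies.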
| |
It's now a newtyped Integer. Perhaps a newtyped Word32 would make more
sense, though.
| |
This frees wORD_SIZE up to be moved out of HaskellConstants
| |
I've switched to passing DynFlags rather than Platform, as (a) it's
simpler to not have to extract targetPlatform in so many places, and
(b) it may be useful to have DynFlags around in future.
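
The trade-off described above can be sketched as follows. The Platform and
DynFlags records here are heavily simplified, hypothetical stand-ins (the real
types have many more fields); only the name targetPlatform comes from the
message above, and platformWordSize, tickyEnabled, wordBitsP and wordBits are
assumptions for illustration.

```haskell
-- Hypothetical, simplified stand-ins for the compiler's real types.
data Platform = Platform
  { platformWordSize :: Int    -- word size in bytes (assumed field)
  }

data DynFlags = DynFlags
  { targetPlatform :: Platform
  , tickyEnabled   :: Bool     -- hypothetical extra setting
  }

-- Style 1: the helper takes a Platform, so every caller that holds
-- DynFlags has to extract targetPlatform first.
wordBitsP :: Platform -> Int
wordBitsP platform = 8 * platformWordSize platform

-- Style 2: the helper takes DynFlags and does the extraction itself,
-- and it can consult other settings later without a signature change.
wordBits :: DynFlags -> Int
wordBits dflags = 8 * platformWordSize (targetPlatform dflags)

main :: IO ()
main = do
  let dflags = DynFlags { targetPlatform = Platform 8, tickyEnabled = False }
  print (wordBitsP (targetPlatform dflags))  -- caller unpacks
  print (wordBits dflags)                    -- helper unpacks
```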
| |
We don't actually use it yet
| |
We need to make the SRT label external and unique when splitting,
because it is shared amongst all the functions in the module. Also
some SRT-related cleanup.
| |
(this change was previously done in the old codegen only)
| |
It's now just 'dopt Opt_Ticky'
| |
Fixes cgrun071 on recent Mac OS X versions.
This is the right fix at least until we have proper types for Word8#,
Word16# etc.
| |
HaskellMachRegs.h is no longer included in anything under compiler/.
Also, includes/CodeGen.Platform.hs now includes "stg/MachRegs.h"
rather than <stg/MachRegs.h>, which means that we always get the file
from the tree, rather than from the bootstrapping compiler.
| |
Simon Marlow spotted that we were #include'ing MachRegs.h several times,
but that doesn't work as (a) it uses ifdeffery to avoid being included
multiple times, and (b) even if we work around that, then the #define's
from previous inclusions are still defined when we #include it again.
So we now put the platform code for each platform in a separate .hs file.