| Commit message (Collapse) | Author | Age | Files | Lines |
| | |
We used to pass the list of top-level foreign exported bindings to the
code generator so that it could create StablePtrs for them in the
stginit code. Now we don't use stginit unless profiling, and the
StablePtrs are generated by C functions marked with
__attribute__((constructor)). This patch removes various bits associated
with the old way of doing things, which were previously left in place
in case we wanted to switch back, I presume. Also I refactored
dsForeigns to clean it up a bit.
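A minimal sketch of the constructor-based approach, assuming GCC or Clang.
The registry and the names register_export, foo_closure and
stginit_export_foo are hypothetical stand-ins, not the actual RTS API:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical registry standing in for the RTS stable-pointer table. */
#define MAX_EXPORTS 8
static const void *exported[MAX_EXPORTS];
static size_t n_exported = 0;

void register_export(const void *closure)
{
    assert(n_exported < MAX_EXPORTS);
    exported[n_exported++] = closure;
}

size_t exported_count(void) { return n_exported; }

const void *exported_at(size_t i)
{
    assert(i < n_exported);
    return exported[i];
}

/* Stand-in for a top-level foreign exported binding. */
int foo_closure = 42;

/* GCC/Clang extension: this function runs before main(), so the
   registration does not have to happen in per-module init code. */
__attribute__((constructor))
static void stginit_export_foo(void)
{
    register_export(&foo_closure);
}
```

By the time main() starts, the registry already holds one entry for
foo_closure, without any explicit initialisation call.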
|
This was causing us to try to jump to the address of an infotable when
unregisterised, leading to a segfault.
|
Consistently make the type and description in the info table an offset
or a pointer, depending on whether tables are next to code or not.
|
This patch should have no effect; it's mainly comments and layout,
plus this constructor name change.
|
Defaulting makes compilation of multiple modules more complicated (re: #1405).
The defaulting here was all local to functions, not caused by the module-level
monomorphism restriction, but it's better to be clear about what's meant anyway.
I changed some that were defaulting to Integer to explicit Int, where Int
seemed more appropriate than Integer.
|
Instead of attaching the information whether a Label is going to be
accessed dynamically or not (distinction between IdLabel/DynLabel and
additional flags in ModuleInitLabel and PlainModuleInitLabel), we hand
dflags through the CmmOpt monad and the NatM monad. Before calling
labelDynamic in PositionIndependentCode, we extract thisPackage from
dflags and supply the current package to labelDynamic, so it can take
this information into account instead of extracting it from the labels
themselves. This simplifies a lot of code in codeGen that just hands
through this_pkg.
|
This patch implements pointer tagging as per our ICFP'07 paper "Faster
laziness using dynamic pointer tagging". It improves performance by
10-15% for most workloads, including GHC itself.
The original patches were by Alexey Rodriguez Yakushev
<mrchebas@gmail.com>, with additions and improvements by me. I've
re-recorded the development as a single patch.
The basic idea is this: we use the low 2 bits of a pointer to a heap
object (3 bits on a 64-bit architecture) to encode some information
about the object pointed to. For a constructor, we encode the "tag"
of the constructor (e.g. True vs. False), for a function closure its
arity. This enables some decisions to be made without dereferencing
the pointer, which speeds up some common operations. In particular it
enables us to avoid costly indirect jumps in many cases.
More information in the commentary:
http://hackage.haskell.org/trac/ghc/wiki/Commentary/Rts/HaskellExecution/PointerTagging
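The basic idea can be sketched in C as follows. This is an illustration
only: the helper names tag_ptr, get_tag and untag_ptr are made up here and
are not the RTS macros, and the sketch assumes heap objects are word-aligned
so that the low bits of their addresses are zero:

```c
#include <assert.h>
#include <stdint.h>

/* Low bits available for tagging: 2 on a 32-bit target, 3 on 64-bit. */
#define TAG_BITS (sizeof(void *) == 8 ? 3 : 2)
#define TAG_MASK (((uintptr_t)1 << TAG_BITS) - 1)

/* Attach a small tag to a word-aligned pointer. */
void *tag_ptr(void *p, uintptr_t tag)
{
    assert(((uintptr_t)p & TAG_MASK) == 0);   /* pointer must be aligned */
    assert(tag <= TAG_MASK);                  /* tag must fit in the low bits */
    return (void *)((uintptr_t)p | tag);
}

/* Read the tag without dereferencing the pointer. */
uintptr_t get_tag(const void *p)
{
    return (uintptr_t)p & TAG_MASK;
}

/* Strip the tag to recover the real address before dereferencing. */
void *untag_ptr(void *p)
{
    return (void *)((uintptr_t)p & ~TAG_MASK);
}
```

A consumer can branch on get_tag(p) first and only call untag_ptr(p) when it
actually needs the object's fields, which is where the saving comes from.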
|
* Fix code output order when printing C so things are defined before
they are used.
* Generate _ret rather than _entry functions for INFO_TABLE_RET.
* Use "ASSIGN_BaseReg" rather than "BaseReg =".
|
This is needed because CgForeign and parts of the CPS pass now use
'callerSaveVolatileRegs' and not all platforms have access to the NCG.
|
Now, if a single module *anywhere* on the module tree is built with
-fhpc, the binary will enable reading/writing of <bin>.tix.
Previously, you needed to compile Main to allow coverage to operate.
This changes the file format for .hi files; you will need to recompile every library.
|
We store .mix files in
.hpc/<package>/<the.module.name>.mix
The package main is the empty package (aka other naming passes),
so Main is just stored in
.hpc/Main.tix
This change is backwards compatible.
|
mapAccumL and mapAccumR are in Data.List now.
mapAccumB is unused.
|
(This required a bit of refactoring of CmmInfo.)
|
These include:
- Stack size detection now includes function arguments.
- Stack size detection now avoids stack checks just because of
the GC block.
- A CmmCall followed by a CmmBranch will no longer generate an extra
continuation consisting just of the branch.
- Multiple CmmCall/CmmBranch pairs that all go to the same place
will try to use the same continuation. If they can't (because
the return value signature is different), adaptor blocks are built.
- Function entry statements are now in a separate block.
(Fixed bug with branches to the entry block having unintended effects.)
- Other changes that I can't recall right now.
|
This eliminates one of the panics introduced by
the previous patch:
'First pass at implementing info tables for CPS'
The other panic introduced by that patch still remains.
It was due to the need to convert from a
ContinuationInfo to a CmmInfo.
(codeGen/CgInfoTbls.hs:emitClosureCodeAndInfoTable)
(codeGen/CgInfoTbls.hs:emitReturnTarget)
|
This is a fairly complete implementation, however
two 'panic's have been placed in the critical path
where the implementation is still a bit lacking so
do not expect it to run quite yet.
One call to panic is because we still need to create
a GC block for procedures that don't have them yet.
(cmm/CmmCPS.hs:continuationToProc)
The other is due to the need to convert from a
ContinuationInfo to a CmmInfo.
(codeGen/CgInfoTbls.hs:emitClosureCodeAndInfoTable)
(codeGen/CgInfoTbls.hs:emitReturnTarget)
|
This version should compile but is still incomplete as it introduces
potential bugs at the places marked 'TODO FIXME NOW'.
It is being recorded to help keep track of changes.
|
The return values were getting put in a LocalReg
but that LocalReg needed to be stored into the enclosing
expression's return register (e.g. R1).
|
This frees the Cmm data type from keeping a list of live global registers
in CmmCall which helps prepare for the CPS conversion phase.
CPS conversion does its own liveness analysis and takes input that should
not directly refer to parameter registers (e.g. R1, F5, D3, L2). Since
these are the only things which could occur in the live global register
list, CPS conversion makes that field of the CmmCall constructor obsolete.
Once the CPS conversion pass is fully implemented, global register saving
will move from codeGen into the CPS pass. Until then, this patch
is worth scrutinizing and testing to ensure it doesn't cause any performance
or correctness problems, as the code passed to the backends by CPS
conversion will look very similar to the code that this patch makes codeGen
pass to the backend.
|
Since a CmmCall returns CmmFormals which may include
global registers (and indeed one place in the code
returns the results of a CmmCall into BaseReg) and
since CPS conversion will change those return slots
into formal arguments for the continuation of the call,
CmmProc has to have CmmFormals for the formal arguments.
Oddly, the old code never made use of procedure arguments
so this change only affects the types and not any of the code.
(Because [] is both of type [LocalReg] and CmmFormals.)
|
When the con_desc field of an info table was made into a relative
reference, this had the side effect of making the profiling fields
(closure_desc and closure_type) also relative, but only when compiling
via C, and the heap profiler was still treating them as absolute,
leading to crashes when profiling with -hd or -hy.
This patch fixes up the story to be consistent: these fields really
should be relative (otherwise we couldn't make shared versions of the
profiling libraries), so I've made them relative and fixed up the RTS
to know about this.
|
This has been a long-standing ToDo.
|
- .tix files are now a list of MixModule, which contain a hash of the contents of the .mix file.
- .mix files now have (the same) hash number.
These changes allow different binaries that use the same module compiled in the same way
to share coverage information.
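As a rough illustration of why the hash helps, here is a sketch of merging
tick counts from two binaries. The struct layout, the 32-bit hash width and
the name tix_merge are assumptions made for this example and do not describe
the real .tix format:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_TICKS 16

/* Hypothetical in-memory form of one module's entry in a .tix file. */
struct tix_module {
    uint32_t mix_hash;           /* hash of the matching .mix contents */
    size_t   n_ticks;            /* number of tick boxes in the module */
    uint64_t ticks[MAX_TICKS];   /* per-box entry counts */
};

/* Add src's counts into dst, but only if both entries refer to the same
   .mix contents. Returns 1 on success, 0 if they are incompatible. */
int tix_merge(struct tix_module *dst, const struct tix_module *src)
{
    assert(dst != NULL && src != NULL);
    if (dst->mix_hash != src->mix_hash || dst->n_ticks != src->n_ticks)
        return 0;
    for (size_t i = 0; i < dst->n_ticks; i++)
        dst->ticks[i] += src->ticks[i];
    return 1;
}
```

Two runs of different binaries can then be combined module by module, and a
module that was compiled differently is rejected instead of being mixed in.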
|
The ticky StgEntCounter structure was trying to be clever by using a
fixed-width 32-bit field for the registeredp value. But the code generators
are not up to handling structures packed tightly like this (on a 64-bit
architecture); the result is a seg-fault on 64-bit.
Really there should be some complaint from the code generators, not simply
a seg fault.
Anyway I switched to using native words for StgEntCounter fields, and
now at least it works.
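To illustrate the layout issue, here is a small sketch. The two structs are
hypothetical and much smaller than the real StgEntCounter, and the idea that
generated code addresses fields at word-sized offsets is an assumption made
for the example:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uintptr_t word;   /* native machine word */

/* Hypothetical counter with a fixed-width 32-bit first field. On a 64-bit
   target its layout depends on the C compiler's padding rules, so code
   that assumes word-sized fields has to know about them. */
struct counter_fixed {
    uint32_t registeredp;
    word     arity;
    word     entry_count;
};

/* The same counter using native words throughout: field number i lives
   at byte offset i * sizeof(word), with nothing else to work out. */
struct counter_words {
    word registeredp;
    word arity;
    word entry_count;
};

/* Offset that word-indexed generated code would use for field `index`. */
size_t word_field_offset(size_t index)
{
    return index * sizeof(word);
}
```

With counter_words the offsets from offsetof and from word_field_offset
agree for every field, which is the property the fix relies on.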
|
Info tables, like everything else in the text section, MUST NOT contain
pointers. A pointer is, by definition, position dependent and is therefore
fundamentally incompatible with generating position independent code.
Therefore, we have to store an offset from the info label to the string
instead of the pointer, just as we already did for other things referred
to by the info table (SRTs, large bitmaps, etc.)
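A simplified sketch of the offset scheme in C. The struct names and the
info_desc helper are invented for illustration, and real info tables carry
far more than one field:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, much-simplified info table: it records where its
   description string lives as an offset from the table itself. */
struct info_table {
    intptr_t desc_offset;
};

/* An info table laid out together with the string it refers to. */
struct con_info {
    struct info_table info;
    char desc[16];
};

/* The initialiser holds only a constant offset, so no absolute address
   is stored and nothing needs fixing up when the data is moved. */
const struct con_info just_info = {
    { (intptr_t)offsetof(struct con_info, desc) },
    "Just"
};

/* Recover the string by adding the stored offset to the table address. */
const char *info_desc(const struct info_table *info)
{
    assert(info != NULL);
    return (const char *)info + info->desc_offset;
}
```

Because the stored value is relative, a byte-for-byte copy of just_info at a
different address still resolves to its own copy of the string.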
|
We recently discovered that they aren't a win any more, and just cost
code size.
|
This patch adds data constructor names into their info tables.
This is useful in the ghci debugger. It replaces the old scheme which
was based on tracking data con names in the linker.
|
The following changes restore ticky-ticky profiling to functionality
from its formerly bit-rotted state. Sort of. (It got bit-rotted as part
of the switch to the C-- back-end.)
The way that ticky-ticky is supposed to work is documented in Section 5.7
of the GHC manual (though the manual doesn't mention that it hasn't worked
since sometime around 6.0, alas). Changes from this are as follows (which
I'll document on the wiki):
* In the past, you had to build all of the libraries with way=t in order to
use ticky-ticky, because it entailed a different closure layout. No longer.
You still need to do make way=t in rts/ in order to build the ticky RTS,
but you should now be able to mix ticky and non-ticky modules.
* Some of the counters that worked in the past aren't implemented yet.
I was originally just trying to get entry counts to work, so those should
be correct. The list of counters was never documented in the first place,
so I hope it's not too much of a disaster that some don't appear anymore.
Someday, someone (perhaps me) should document all the counters and what
they do. For now, all of the counters are either accurate (or at least as
accurate as they always were), zero, or missing from the ticky profiling
report altogether.
This hasn't been particularly well-tested, but these changes shouldn't
affect anything except when compiling with -fticky-ticky (famous last
words...)
Implementation details:
I got rid of StgTicky.h, which in the past had the macros and declarations
for all of the ticky counters. Now, those macros are defined in Cmm.h.
StgTicky.h was still there for inclusion in C code. Now, any remaining C
code simply cannot call the ticky macros -- or rather, they do call those
macros, but from the perspective of C code, they're defined as no-ops.
(This shouldn't be too big a problem.)
I added a new file TickyCounter.h that has all the declarations for ticky
counters, as well as dummy macros for use in C code. Someday, these
declarations should really be automatically generated, since they need
to be kept consistent with the macros defined in Cmm.h.
Other changes include getting rid of the header that was getting added to
closures before, and getting rid of various code having to do with eager
blackholing and permanent indirections (the changes under compiler/
and rts/Updates.*).
|
In the generated code for case-of-variable, test the tag of the
scrutinee closure and only enter if it is unevaluated. Also turn
*off* vectored returns.
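The shape of the generated test can be sketched in C. The closure struct,
its evaluated flag and the function names are hypothetical simplifications
and not the code GHC emits:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical closure: `evaluated` stands in for the tag test and
   `entry` for the closure's entry code. */
struct closure {
    int evaluated;
    int value;
    int (*entry)(struct closure *self);
    int enter_count;   /* bookkeeping for the example only */
};

/* Example entry code: compute the value once and mark the closure. */
int thunk_entry(struct closure *self)
{
    self->enter_count++;
    self->value = 6 * 7;
    self->evaluated = 1;
    return self->value;
}

/* Scrutinise a variable: skip the indirect call when the closure is
   already evaluated, and enter it only otherwise. */
int scrutinise(struct closure *c)
{
    assert(c != NULL);
    if (c->evaluated)
        return c->value;
    return c->entry(c);
}
```

Scrutinising the same closure twice enters it only the first time; the
second call takes the cheap path.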
|
In the generated code for case-of-variable, test the tag of the
scrutinee closure and only enter if it is unevaluated. Also turn
*off* vectored returns.
|
Only affects -fasm: gcc makes its own decisions about jump tables
|