| Commit message | Author | Age | Files | Lines |
|
|
|
| |
Return type was correct when TABLES_NEXT_TO_CODE was defined.
|
|
|
|
|
|
|
|
| |
All the wibbles seem to have cancelled out, and (non-debug) object sizes
are back to where they started.
I'm not 100% sure that the types are optimal, but at least now the
functions have types and we can fix them if necessary.
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
| |
This caused a couple of .o files to change size. I had a look at one,
and it seems to be caused by the difference in size of these two
instructions:
    49 8b 5d 08        mov 0x8(%r13),%rbx
    49 8b 5c 24 08     mov 0x8(%r12),%rbx
The %r12 form is one byte longer because %r12, like %rsp, needs an
extra SIB byte in its addressing encoding. (A few nops are also added
or removed later in the file, presumably for alignment reasons.)
|
|
|
|
|
|
|
|
|
|
|
|
| |
This has several advantages:
* It can be called from gdb
* There is more type information for the user, and type checking
for the compiler
* Less opportunity for things to go wrong, e.g. due to missing
parentheses or repeated execution
The sizes of the non-debug .o files haven't changed (other than
Inlines.o), so I'm pretty sure the compiled code is identical.
|
| |
|
| |
|
|
|
|
|
|
|
|
|
| |
Simon Marlow spotted that we were #include'ing MachRegs.h several times,
but that doesn't work as (a) it uses ifdeffery to avoid being included
multiple times, and (b) even if we work around that, then the #define's
from previous inclusions are still defined when we #include it again.
So we now put the platform code for each platform in a separate .hs file.
|
| |
|
|
|
|
|
|
|
|
| |
This means that we now generate the same code whatever platform we are
on, which should help avoid changes on one platform breaking the build
on another.
It's also another step towards full cross-compilation.
|
|
|
|
| |
No functional differences yet
|
|
|
|
|
|
| |
We weren't defining it in the other places that MachRegs.h gets
included, which seems a little suspicious. And if it's not defined
then it defaults to 4 anyway, so this define doesn't seem necessary.
|
| |
|
| |
|
| |
|
|\ |
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
new file: ghc/ghc-cross.wrapper
new file: includes/mkDerivedConstants.cross.awk
new file: includes/mkSizeMacros.cross.awk
new file: rules/cross-compiling.mk
These are expected to sit quietly in the tree until
the rest of the machinery matures on an (upcoming)
branch. Reviews will begin to make sense as soon as
that has happened. Anyway, comments are welcome. See
<http://www.haskell.org/pipermail/cvs-ghc/2012-July/074456.html>
for background.
Disclaimer: these source files are not (yet) up to the
quality standards set by the rest of the tree.
Cleanups, move-arounds and rewrites (i.e. .awk -> .hs), as
well as additional comments and documentation will happen
as soon as the basic functionality of a cross-compiler is
working reliably.
|
| | |
|
| | |
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Also removed an unnecessary 'struct' tag (since the struct is not
recursive); this is in line with the other struct definitions.
Fixed a typo and updated the copyright.
It remains to remove the tabs and align the structure members
accordingly.
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Based on initial patches by Mikolaj Konarski <mikolaj@well-typed.com>
These new eventlog events are to let profiling tools keep track of all
the OS threads that belong to an RTS capability at any moment in time.
In the RTS, OS threads correspond to the Task abstraction, so that is
what we track. There are events for tasks being created, migrated
between capabilities and deleted. In particular the task creation event
also records the kernel thread id which lets us match up the OS thread
with data collected by other tools (in the initial use case with
Linux's perf tool, but in principle also with DTrace).
|
|/
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
On most platforms the userspace thread type (e.g. pthread_t) and kernel
thread id are different. Normally we don't care about kernel thread Ids,
but some system tools for tracing/profiling etc report kernel ids.
For example Solaris and OSX's DTrace and Linux's perf tool report kernel
thread ids. To be able to match these up with RTS's OSThread we need a
way to get at the kernel thread id, so we add a new function to do just
that (the implementation is system-dependent).
Additionally, strictly speaking the OSThreadId type, used as task ids,
is not a serialisable representation. On Unix, OSThreadId is a typedef for
pthread_t, but pthread_t is not guaranteed to be a numeric type.
Indeed on some systems pthread_t is a pointer and in principle it
could be a structure type. So we add another new function to get a
serialisable representation of an OSThreadId. This is only for use
in log files. We use the function to serialise an id of a task,
with the extra feature that it works in non-threaded builds
by always returning 1.
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
|\ |
|
| | |
|
| | |
|
| | |
|
| |
| |
| |
| |
| | |
It turns out that we can use %zu and %llu on Win32, provided we
include PosixSource everywhere we want to use them.
|
| |
| |
| |
| |
| | |
OS X doesn't understand 'gnu_printf', so we need to use it only
conditionally.
|
| |
| |
| |
| |
| | |
On Win32 it's not recognised, so we unfortunately can't use it
unconditionally.
|
| | |
|
| |
| |
| |
| |
| |
| | |
Mostly this meant getting pointer<->int conversions to use the right
sizes. lnat is now size_t, rather than unsigned long, as that seems a
better match for how it's used.
|
| | |
|
| | |
|
|/
|
|
|
| |
On Windows, gcc thinks that printf means ms_printf, which is not the
case when we #define _POSIX_SOURCE 1.
|
|
|
|
|
|
|
| |
Firstly, we were rounding up too much, such that the smallest delay
was 20ms. Secondly, there is no need to use millisecond resolution on
a 64-bit machine where we have room in the TSO to use the normal
nanosecond resolution that we use elsewhere in the RTS.
|
|
|
|
|
|
|
|
|
| |
Quoting design rationale by dcoutts: The event indicates that we're doing
a stop-the-world GC and all other HECs should be between their GC_START
and GC_END events at that moment. We don't want to use GC_STATS_GHC
for that, because GC_STATS_GHC is for extra GHC-specific info,
not something we have to rely on to be able to match the GC pauses
across HECs to a particular global GC.
|
|
|
|
|
|
|
| |
The EventLogFormat.h described the spark counter fields in a different
order to that which ghc emits (the GC'd and fizzled fields were
reversed). At this stage it is easier to fix the ghc-events lib and
to have ghc continue to emit them in the current order.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
They cover much the same info as is available via the GHC.Stats module
or via the '+RTS -s' textual output, but via the eventlog and with a
better sampling frequency.
We have three new generic heap info events and two very GHC-specific
ones. (The hope is the general ones are usable by other implementations
that use the same eventlog system, or indeed not so sensitive to changes
in GHC itself.)
The general ones are:
* total heap mem allocated since prog start, on a per-HEC basis
* current size of the heap (MBlocks reserved from OS for the heap)
* current size of live data in the heap
Currently these are all emitted by GHC at GC time (live data only at
major GC).
The GHC-specific ones are:
* an event giving various static heap parameters:
    * number of generations (usually 2)
    * max size, if any
    * nursery size
    * MBlock and block sizes
* an event emitted on each GC containing:
    * GC generation (usually just 0,1)
    * total bytes copied
    * bytes lost to heap slop and fragmentation
    * the number of threads in the parallel GC (1 for serial)
    * the maximum number of bytes copied by any par GC thread
    * the total number of bytes copied by all par GC threads
(these last three can be used to calculate an estimate of the
work balance in parallel GCs)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Also rename internal variables to make the names match what they hold.
The parallel GC work balance is calculated using the total amount of
memory copied by all GC threads, and the maximum copied by any
individual thread. You have serial GC when the max is the same as
copied, and perfectly balanced GC when total/max == n_caps.
Previously we presented this as the ratio total/max and told users
that the serial value was 1 and the ideal value N, for N caps, e.g.
Parallel GC work balance: 1.05 (4045071 / 3846774, ideal 2)
The downside of this is that the user always has to keep in mind the
number of cores being used. Our new presentation uses a normalised
0--1 scale, shown as a percentage: 0% means completely serial and 100%
is perfect balance, e.g.
Parallel GC work balance: 4.56% (serial 0%, perfect 100%)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Now that we can adjust the number of capabilities on the fly, we need
this reflected in the eventlog. Previously the eventlog had a single
startup event that declared a static number of capabilities. Obviously
that's no good anymore.
For compatibility we're keeping the EVENT_STARTUP but adding new
EVENT_CAP_CREATE/DELETE. The EVENT_CAP_DELETE is actually just the old
EVENT_SHUTDOWN but renamed and extended (using the existing mechanism
to extend eventlog events in a compatible way). So we now emit both
EVENT_STARTUP and EVENT_CAP_CREATE. One day we will drop EVENT_STARTUP.
Since reducing the number of capabilities at runtime does not really
delete them but just disables them, we also have new events for
disable/enable.
The old EVENT_SHUTDOWN was in the scheduler class of events. The new
EVENT_CAP_* events are in the unconditional class, along with the
EVENT_CAPSET_* ones. Knowing when capabilities are created and deleted
is crucial to making sense of eventlogs; you always want those events.
In any case, they're extremely low volume.
|
|\ |
|
| |
| |
| |
| |
| | |
We were comparing ALIGNMENT_DOUBLE to ALIGNMENT_LONG, but really
we cared about W_ values, and sizeof(long) /= sizeof(void *) on Win64
|
| | |
|