| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
| |
This patch provides implementation of ARMv7 specific memory barriers.
It uses the dmb sy instruction (dmb for short) for store/load and load/load
barriers, and the dmb st instruction for the store/store barrier.
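The barrier mapping described above can be sketched roughly as follows. This is only an illustration: the macro and function names are hypothetical (not taken from the patch), and on non-ARMv7 targets the sketch assumes the GCC/Clang builtin __sync_synchronize() as a stand-in so that it still compiles.

```c
/* Hypothetical barrier helpers; the names are illustrative, not the RTS's. */
#if defined(__GNUC__) && defined(__arm__) && defined(__ARM_ARCH) && __ARM_ARCH >= 7
/* ARMv7: full-system data memory barrier. */
#define SKETCH_FULL_BARRIER()  __asm__ __volatile__ ("dmb sy" : : : "memory")
/* ARMv7: barrier that only orders stores against other stores. */
#define SKETCH_STORE_BARRIER() __asm__ __volatile__ ("dmb st" : : : "memory")
#else
/* Assumption: elsewhere, fall back to the compiler's full barrier builtin. */
#define SKETCH_FULL_BARRIER()  __sync_synchronize()
#define SKETCH_STORE_BARRIER() __sync_synchronize()
#endif

void sketch_store_load_barrier(void)  { SKETCH_FULL_BARRIER(); }
void sketch_load_load_barrier(void)   { SKETCH_FULL_BARRIER(); }
void sketch_store_store_barrier(void) { SKETCH_STORE_BARRIER(); }

int sketch_payload;
volatile int sketch_ready;

/* Publish a value, then a flag; the store/store barrier keeps that order. */
void sketch_publish(int v)
{
    sketch_payload = v;
    sketch_store_store_barrier();
    sketch_ready = 1;
}
```

A reader of sketch_ready would pair this with sketch_load_load_barrier() before reading sketch_payload.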
| |
This patch adds a mapping for the STG floating-point registers
using ARM VFPv3. Since I'm using just d8-d11, processors that
implement only VFPv3-D16 should also work (e.g. NVidia Tegra2,
Marvell Dove).
|
| |
|
| |
| |
This is Stephen Blackheath's GHC/ARM registerised port,
which uses a modified version of LLVM and provides
basic registerised build functionality.
| |
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
| |
We add a new RTS flag -T for collecting statistics but not giving any
new inputs. There is one new struct in rts/storage/GC.h: GCStats. We
add two new global counters current_residency and current_slop, which
are useful for in-program GC statistics.
See GHC.Stats in base for a Haskell interface to this functionality.
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
|
| |
|
| |
|
| |
| |
Replaces the existing EVENT_RUN/STEAL_SPARK events with 7 new events
covering all stages of the spark lifecycle:
create, dud, overflow, run, steal, fizzle, gc
The sampled spark events are still available. There are now two event
classes for sparks, the sampled and the fully accurate. They can be
enabled/disabled independently. By default +RTS -l includes the sampled
but not the full-detail spark events. Use +RTS -lf-p to enable the detailed
'f' class and disable the sampled 'p' class.
Includes work by Mikolaj <mikolaj.konarski@gmail.com>
| |
Previously GC was included in the scheduler trace class. It can be
enabled specifically with +RTS -vg or -lg, though note that both -v
and -l on their own now default to a sensible set of trace classes,
currently: scheduler, gc and sparks.
|
| |
| |
A new eventlog event containing 7 spark counters/statistics: sparks
created, dud, overflowed, converted, GC'd, fizzled and remaining.
These are maintained and logged separately for each capability.
We log them at startup, on each GC (minor and major) and on shutdown.
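As an illustration of keeping the seven counters separately per capability, a minimal sketch follows. The struct, the function name and the log line layout are all hypothetical; they are not the RTS definitions or the eventlog format.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical per-capability record holding the seven spark counters. */
typedef struct sketch_spark_counters {
    uint64_t created, dud, overflowed, converted, gcd, fizzled, remaining;
} sketch_spark_counters;

/* Format one capability's counters into buf (invented text layout).
 * Returns what snprintf returns. */
int sketch_format_spark_counters(char *buf, size_t len, unsigned cap,
                                 const sketch_spark_counters *c)
{
    assert(buf != NULL && c != NULL);
    return snprintf(buf, len,
        "cap %u: sparks: %" PRIu64 " created, %" PRIu64 " dud, "
        "%" PRIu64 " overflowed, %" PRIu64 " converted, %" PRIu64 " GC'd, "
        "%" PRIu64 " fizzled, %" PRIu64 " remaining",
        cap, c->created, c->dud, c->overflowed, c->converted,
        c->gcd, c->fizzled, c->remaining);
}
```

Calling this once per capability at startup, after each GC and at shutdown would mirror the logging points listed above.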
| |
to using MD5 hashes to identify TypeReps in the Typeable library.
| |
The process ID, parent process ID, rts name and version
The program arguments and environment.
| |
setupRtsFlags(), rather than sharing the memory. Previously if the
caller of hs_init() passed in dynamically-allocated memory and then
freed it, random crashes could happen later (#5177).
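The general idea of taking a private copy of the argument vector, rather than keeping pointers into the caller's memory, can be sketched like this. The helper names are hypothetical and this is not the RTS code, just a minimal illustration of a deep copy.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: deep-copy argv so the copy stays valid even if the
 * caller later frees or overwrites the original strings. */
char **sketch_copy_argv(int argc, char *argv[])
{
    assert(argc >= 0);
    char **copy = malloc(((size_t)argc + 1) * sizeof *copy);
    if (copy == NULL) return NULL;
    for (int i = 0; i < argc; i++) {
        size_t len = strlen(argv[i]) + 1;
        copy[i] = malloc(len);
        if (copy[i] == NULL) {
            /* Undo the partial copy on allocation failure. */
            while (i-- > 0) free(copy[i]);
            free(copy);
            return NULL;
        }
        memcpy(copy[i], argv[i], len);
    }
    copy[argc] = NULL;  /* keep the conventional NULL terminator */
    return copy;
}

/* Release a vector obtained from sketch_copy_argv(). */
void sketch_free_argv(int argc, char **copy)
{
    if (copy == NULL) return;
    for (int i = 0; i < argc; i++) free(copy[i]);
    free(copy);
}
```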
| |
in the future.
| |
Coutts."
This reverts commit 58532eb46041aec8d4cbb48b054cb5b001edb43c.
Turns out it didn't work on Windows and it'll need some non-trivial changes
to make it work on Windows. We'll get it in later once that's sorted out.
|
| |
|
| |
|
| |
| |
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
|
| |
| |
This is more pleasant than having the C generator check whether the
function it's calling is cas, and not generate a prototype if so.
| |
Previously the code generator generated small code fragments labelled
with __stginit_M for each module M, and these performed whatever
initialisation was necessary for that module and recursively invoked
the initialisation functions for imported modules. This approach had
drawbacks:
- FFI users had to call hs_add_root() to ensure the correct
initialisation routines were called. This is a non-standard,
and ugly, API.
- unless we were using -split-objs, the __stginit dependencies would
entail linking the whole transitive closure of modules imported,
whether they were actually used or not. In an extreme case (#4387,
#4417), a module from GHC might be imported for use in Template
Haskell or an annotation, and that would force the whole of GHC to
be needlessly linked into the final executable.
So now instead we do our initialisation with C functions marked with
__attribute__((constructor)), which are automatically invoked at
program startup time (or DSO load-time). The C initialisers are
emitted into the stub.c file. This means that every time we compile
with -prof or -hpc, we now get a stub file, but thanks to #3687 that
is now invisible to the user.
There are some refactorings in the RTS (particularly for HPC) to
handle the fact that initialisers now get run earlier than they did
before.
The __stginit symbols are still generated, and the hs_add_root()
function still exists (but does nothing), for backwards compatibility.
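For readers unfamiliar with the mechanism, here is a minimal sketch of a C initialiser marked with __attribute__((constructor)) (a GCC/Clang extension). The registry and all names are hypothetical; the stub code GHC actually emits is not shown here.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_MODULES 16

/* Hypothetical registry that per-module initialisers add themselves to. */
static const char *sketch_modules[SKETCH_MAX_MODULES];
static size_t sketch_module_count;

void sketch_register_module(const char *name)
{
    assert(sketch_module_count < SKETCH_MAX_MODULES);
    sketch_modules[sketch_module_count++] = name;
}

/* Runs automatically before main() (or when the shared object is loaded),
 * so nobody has to call it explicitly. */
__attribute__((constructor))
static void sketch_init_Main(void)
{
    sketch_register_module("Main");
}

size_t sketch_registered_modules(void) { return sketch_module_count; }

const char *sketch_module_name(size_t i)
{
    assert(i < sketch_module_count);
    return sketch_modules[i];
}
```

By the time main() starts, sketch_registered_modules() already reports the module, and only modules actually linked into the program contribute a constructor.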
| |
This code has accumulated a great deal of cruft over the years. This
pass cleans up much of the surrounding cruft but leaves the actual
argument processing alone, so there's still more that could be done.
Bug fixed:
- ghc_rts_opts should not be subject to the --rtsopts setting. If
the programmer explicitly declares options with ghc_rts_opts, they
shouldn't also have to accept command-line RTS options to make them
work.
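For context, ghc_rts_opts is set from a small C file linked into the program. A minimal sketch, where the particular option string is only an example value:

```c
/* Linked into the final program alongside the Haskell code.
 * The option string here is just an example. */
char *ghc_rts_opts = "-H128m -K1m";
```

With the fix described above, these options apply regardless of the --rtsopts setting.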
|
| |
|
| |
| |
This isn't important, but it stops us getting [...]/./[...] in the paths
in bindists.
|
| |
|
| |
|
| |
| |
Now we keep any partially-full blocks in the gc_thread[] structs after
each GC, rather than moving them to the generation. This should give
us slightly better locality (though I wasn't able to measure any
difference).
Also in this patch: better sanity checking with THREADED.
| |
Store the *number* of the destination generation in the Bdescr struct,
so that in evacuate() we don't have to deref gen to get it.
This is another improvement ported over from my GC branch.
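The idea of caching the generation number in the block descriptor can be sketched as below. These structs are hypothetical, heavily simplified stand-ins, not the real RTS definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the RTS structures. */
typedef struct sketch_generation {
    uint16_t no;               /* generation number */
} sketch_generation;

typedef struct sketch_bdescr {
    sketch_generation *dest;   /* destination generation for live objects */
    uint16_t dest_no;          /* cached copy of dest->no */
} sketch_bdescr;

void sketch_set_dest(sketch_bdescr *bd, sketch_generation *gen)
{
    assert(bd != NULL && gen != NULL);
    bd->dest = gen;
    bd->dest_no = gen->no;     /* keep the cached number in sync */
}

/* Before: an extra pointer dereference on every lookup. */
uint16_t sketch_dest_no_via_gen(const sketch_bdescr *bd)
{
    return bd->dest->no;
}

/* After: read the number straight from the block descriptor. */
uint16_t sketch_dest_no_cached(const sketch_bdescr *bd)
{
    return bd->dest_no;
}
```

The cached field trades two bytes per descriptor for one fewer memory access on a hot path.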
|
| |
|
| |
| |
Now that we use the per-capability mutable lists exclusively.
| |
It is still (silently) accepted for backwards compatibility.
| |
So we can now get these in ThreadScope:
19487000: cap 1: stopping thread 6 (blocked on black hole owned by thread 4)
Note: needs an update to ghc-events. Older ThreadScopes will just
ignore the new information.
| |
Note that some things depending on the rts/includes header files now
depend on more files: They used to depend on includes/*.h, but
now they also depend on header files in subdirectories. As far as I can
see this was a bug.
|
| |
| |
They're a little nicer now, and a regression in the cygwin build is
fixed (the $i in the destination wasn't surviving being passed through
cygpath).
| |
cygwin's /bin/install doesn't set file modes correctly if the
destination path is a C: style path:
$ /bin/install -c -m 644 foo /cygdrive/c/cygwin/home/ian/foo2
$ /bin/install -c -m 644 foo c:/cygwin/home/ian/foo3
$ ls -l foo*
-rw-r--r-- 1 ian None 0 2011-01-06 18:28 foo
-rw-r--r-- 1 ian None 0 2011-01-06 18:29 foo2
-rwxrwxrwx 1 ian None 0 2011-01-06 18:29 foo3
This causes problems for bindisttest/checkBinaries.sh which then
thinks that e.g. the userguide HTML files are binaries.
We therefore use a /cygdrive path if we are on cygwin
| |
The allocation stats (+RTS -s etc.) used to count the slop at the end
of each nursery block (except the last) as allocated space; now we
count the allocated words accurately. This should make allocation
figures more predictable, too.
This has the side effect of reducing the apparent allocations by a
small amount (~1%), so remember to take this into account when looking
at nofib results.
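The difference between the two ways of counting can be sketched with a toy model of nursery blocks. The struct and function names are hypothetical and the block sizes in the usage note are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy model of a nursery block. */
typedef struct sketch_block {
    size_t size_words;  /* capacity of the block */
    size_t used_words;  /* words actually allocated in it */
} sketch_block;

/* Old behaviour: every block except the last is counted in full,
 * so the unused slop at its end is reported as allocation. */
size_t sketch_count_with_slop(const sketch_block *blocks, size_t n)
{
    assert(blocks != NULL || n == 0);
    size_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += (i + 1 < n) ? blocks[i].size_words : blocks[i].used_words;
    return total;
}

/* New behaviour: count only the words that were really allocated. */
size_t sketch_count_accurate(const sketch_block *blocks, size_t n)
{
    assert(blocks != NULL || n == 0);
    size_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += blocks[i].used_words;
    return total;
}
```

For two 512-word blocks holding 500 and 100 words, the old count is 512 + 100 = 612 words and the accurate count is 600.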
|
| |
| |
This patch makes two changes to the way stacks are managed:
1. The stack is now stored in a separate object from the TSO.
This means that it is easier to replace the stack object for a thread
when the stack overflows or underflows; we don't have to leave behind
the old TSO as an indirection any more. Consequently, we can remove
ThreadRelocated and deRefTSO(), which were a pain.
This is obviously the right thing, but the last time I tried to do it
it made performance worse. This time I seem to have cracked it.
2. Stacks are now represented as a chain of chunks, rather than
a single monolithic object.
The big advantage here is that individual chunks are marked clean or
dirty according to whether they contain pointers to the young
generation, and the GC can avoid traversing clean stack chunks during
a young-generation collection. This means that programs with deep
stacks will see a big saving in GC overhead when using the default GC
settings.
A secondary advantage is that there is much less copying involved as
the stack grows. Programs that quickly grow a deep stack will see big
improvements.
In some ways the implementation is simpler, as nothing special needs
to be done to reclaim stack as the stack shrinks (the GC just recovers
the dead stack chunks). On the other hand, we have to manage stack
underflow between chunks, so there's a new stack frame
(UNDERFLOW_FRAME), and we now have separate TSO and STACK objects.
The total amount of code is probably about the same as before.
There are new RTS flags:
-ki<size> Sets the initial thread stack size (default 1k), e.g. -ki4k -ki2m
-kc<size> Sets the stack chunk size (default 32k)
-kb<size> Sets the stack chunk buffer size (default 1k)
-ki was previously called just -k, and the old name is still accepted
for backwards compatibility. These new options are documented.
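The clean/dirty chunk idea from point 2 can be sketched as follows. This is a toy model with hypothetical names, not the RTS's STACK object, and it assumes (for illustration only) that a chunk becomes clean once a young-generation pass has scanned it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model of one stack chunk in a chain. */
typedef struct sketch_stack_chunk {
    struct sketch_stack_chunk *next;  /* next (older) chunk, or NULL */
    bool dirty;                       /* may point into the young generation */
} sketch_stack_chunk;

/* Mutator side: any write into a chunk marks it dirty. */
void sketch_mark_dirty(sketch_stack_chunk *c)
{
    assert(c != NULL);
    c->dirty = true;
}

/* Young-generation pass: skip clean chunks entirely.
 * Returns the number of chunks that had to be scanned. */
size_t sketch_scan_young(sketch_stack_chunk *top)
{
    size_t scanned = 0;
    for (sketch_stack_chunk *c = top; c != NULL; c = c->next) {
        if (!c->dirty)
            continue;      /* no young pointers here, nothing to do */
        /* ... a real collector would walk the frames of this chunk ... */
        c->dirty = false;  /* assumption: scanning leaves the chunk clean */
        scanned++;
    }
    return scanned;
}
```

With a deep stack where only the topmost chunk is being written to, each young-generation pass touches one chunk instead of the whole stack.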
|
| |
| |
This is a temporary measure until we fix the bug properly (which is
somewhat tricky, and we think might be easier in the new code
generator).
For now we get:
ghc-stage2: sorry! (unimplemented feature or known bug)
(GHC version 7.1 for i386-unknown-linux):
Trying to allocate more than 1040384 bytes.
See: http://hackage.haskell.org/trac/ghc/ticket/4550
Suggestion: read data from a file instead of having large static data
structures in the code.
|
| |
|