| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
User visible changes
====================
Profilng
--------
Flags renamed (the old ones are still accepted for now):
OLD NEW
--------- ------------
-auto-all -fprof-auto
-auto -fprof-exported
-caf-all -fprof-cafs
New flags:
-fprof-auto Annotates all bindings (not just top-level
ones) with SCCs
-fprof-top Annotates just top-level bindings with SCCs
-fprof-exported Annotates just exported bindings with SCCs
-fprof-no-count-entries Do not maintain entry counts when profiling
(can make profiled code go faster; useful with
heap profiling where entry counts are not used)
Cost-centre stacks have a new semantics, which should in most cases
result in more useful and intuitive profiles. If you find this not to
be the case, please let me know. This is the area where I have been
experimenting most, and the current solution is probably not the
final version, however it does address all the outstanding bugs and
seems to be better than GHC 7.2.
Stack traces
------------
+RTS -xc now gives more information. If the exception originates from
a CAF (as is common, because GHC tends to lift exceptions out to the
top-level), then the RTS walks up the stack and reports the stack in
the enclosing update frame(s).
Result: +RTS -xc is much more useful now - but you still have to
compile for profiling to get it. I've played around a little with
adding 'head []' to GHC itself, and +RTS -xc does pinpoint the problem
quite accurately.
I plan to add more facilities for stack tracing (e.g. in GHCi) in the
future.
Coverage (HPC)
--------------
* derived instances are now coloured yellow if they weren't used
* likewise record field names
* entry counts are more accurate (hpc --fun-entry-count)
* tab width is now correct (markup was previously off in source with
tabs)
Internal changes
================
In Core, the Note constructor has been replaced by
Tick (Tickish b) (Expr b)
which is used to represent all the kinds of source annotation we
support: profiling SCCs, HPC ticks, and GHCi breakpoints.
Depending on the properties of the Tickish, different transformations
apply to Tick. See CoreUtils.mkTick for details.
Tickets
=======
This commit closes the following tickets, test cases to follow:
- Close #2552: not a bug, but the behaviour is now more intuitive
(test is T2552)
- Close #680 (test is T680)
- Close #1531 (test is result001)
- Close #949 (test is T949)
- Close #2466: test case has bitrotted (doesn't compile against current
version of vector-space package)
|
| |
| |
| |
| | |
Enables people to turn them on/off. Defaults to on.
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Eventlog timestamps are elapsed times (in nanoseconds) relative to the
process start. To be able to merge eventlogs from multiple processes we
need to be able to align their timelines. If they share a clock domain
(or a user judges that their clocks are sufficiently closely
synchronised) then it is sufficient to know how the eventlog timestamps
match up with the clock.
The EVENT_WALL_CLOCK_TIME contains the clock time with (up to)
nanosecond precision. It is otherwise an ordinary event and so contains
the usual timestamp for the same moment in time. It therefore enables
us to match up all the eventlog timestamps with clock time.
|
|\ \
| | |
| | |
| | |
| | | |
Conflicts:
compiler/utils/Platform.hs
|
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
On second thoughts, hs_free_stable_ptr() is the official way to free a
StablePtr.
This reverts commit ae583f2949570755c8a03f68a71416c0fd7f257c.
|
|/ / |
|
| | |
|
|/
|
|
| |
See Note [atomic CAFs] in rts/sm/Storage.c
|
|
|
|
|
|
|
|
|
|
|
| |
This parameter controls the allowed depth of reasoning in the
type constraint solver. Perfectly well-behaved programs can
use deep stacks, and 20 is obviously too small. (Indeed, if
you don't have UndecidableInstances, the constraint solver
is supposed to terminate, so no limit should be needed.)
Responding to Trac #5395 this patch increases the default
to 200.
|
| |
|
|
|
|
| |
(#5402)
|
|
|
|
|
|
|
| |
When the bootstrap compiler does not include this patch, you must add this line
to mk/build.mk, otherwise the ARM architecture cannot be detected due to a
-undef option given to the C pre-processor.
SRC_HC_OPTS = -pgmP 'gcc -E -traditional'
|
|
|
|
|
|
| |
This patch fixes RTS' xchg and cas functions. On ARMv7 it is recommended
to add memory barrier after using ldrex/strex for implementing atomic
lock or operation.
|
|
|
|
|
|
| |
This patch provides implementation of ARMv7 specific memory barriers.
It uses dmb sy isn (or shortly dmb) for store/load and load/load barriers
and dmb st isn for store/store barrier.
|
|
|
|
|
|
|
| |
This patch adds mapping for STG floating point registers
using ARM VFPv3. Since I'm using just d8-d11 also processors
with just VFPv3-D16 implemented should work (e.g. NVidia Tegra2,
Marvell Dove)
|
| |
|
| |
|
|
|
|
|
|
| |
This is the Stephen Blackheath's GHC/ARM registerised port
which is using modified version of LLVM and which provides
basic registerised build functionality
|
|
|
|
| |
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
|
|
|
|
|
|
|
|
|
|
|
| |
We add a new RTS flag -T for collecting statistics but not giving any
new inputs. There is one new struct in rts/storage/GC.h: GCStats. We
add two new global counters current_residency and current_slop, which
are useful for in-program GC statistics.
See GHC.Stats in base for a Haskell interface to this functionality.
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Replaces the existing EVENT_RUN/STEAL_SPARK events with 7 new events
covering all stages of the spark lifcycle:
create, dud, overflow, run, steal, fizzle, gc
The sampled spark events are still available. There are now two event
classes for sparks, the sampled and the fully accurate. They can be
enabled/disabled independently. By default +RTS -l includes the sampled
but not full detail spark events. Use +RTS -lf-p to enable the detailed
'f' and disable the sampled 'p' spark.
Includes work by Mikolaj <mikolaj.konarski@gmail.com>
|
|
|
|
|
|
|
| |
Previously GC was included in the scheduler trace class. It can be
enabled specifically with +RTS -vg or -lg, though note that both -v
and -l on their own now default to a sensible set of trace classes,
currently: scheduler, gc and sparks.
|
| |
|
|
|
|
|
|
|
| |
A new eventlog event containing 7 spark counters/statistics: sparks
created, dud, overflowed, converted, GC'd, fizzled and remaining.
These are maintained and logged separately for each capability.
We log them at startup, on each GC (minor and major) and on shutdown.
|
|
|
|
| |
to using MD5 hashes to identify TypeReps in the Typeable library.
|
|
|
|
|
| |
The process ID, parent process ID, rts name and version
The program arguments and environment.
|
|
|
|
|
|
| |
setupRtsFlags(), rather than sharing the memory. Previously if the
caller of hs_init() passed in dynamically-allocated memory and then
freed it, random crashes could happen later (#5177).
|
|
|
|
| |
in the future.
|
|
|
|
|
|
|
|
| |
Coutts."
This reverts commit 58532eb46041aec8d4cbb48b054cb5b001edb43c.
Turns out it didn't work on Windows and it'll need some non-trivial changes
to make it work on Windows. We'll get it in later once that's sorted out.
|
| |
|
| |
|
| |
|
|
|
|
| |
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
|
| |
|
|
|
|
|
| |
This is more pleasant than having the C generator check whether the
function it's calling is cas, and not generate a prototype if so.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously the code generator generated small code fragments labelled
with __stginit_M for each module M, and these performed whatever
initialisation was necessary for that module and recursively invoked
the initialisation functions for imported modules. This appraoch had
drawbacks:
- FFI users had to call hs_add_root() to ensure the correct
initialisation routines were called. This is a non-standard,
and ugly, API.
- unless we were using -split-objs, the __stginit dependencies would
entail linking the whole transitive closure of modules imported,
whether they were actually used or not. In an extreme case (#4387,
#4417), a module from GHC might be imported for use in Template
Haskell or an annotation, and that would force the whole of GHC to
be needlessly linked into the final executable.
So now instead we do our initialisation with C functions marked with
__attribute__((constructor)), which are automatically invoked at
program startup time (or DSO load-time). The C initialisers are
emitted into the stub.c file. This means that every time we compile
with -prof or -hpc, we now get a stub file, but thanks to #3687 that
is now invisible to the user.
There are some refactorings in the RTS (particularly for HPC) to
handle the fact that initialisers now get run earlier than they did
before.
The __stginit symbols are still generated, and the hs_add_root()
function still exists (but does nothing), for backwards compatibility.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This code has accumulated a great deal of cruft over the years, this
pass cleans up a lot of the surrounding cruft but leaves the actual
argument processing alone - so there's still more that could be done.
Bug fixed:
- ghc_rts_opts should not be subject to the --rtsopts setting. If
the programmer explicitly declares options with ghc_rts_opts, they
shouldn't also have to accept command-line RTS options to make them
work.
|
| |
|
| |
|
|
|
|
|
| |
This isn't important, but it stops us getting [...]/./[...] in the paths
in bindists.
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
|
|
| |
Now we keep any partially-full blocks in the gc_thread[] structs after
each GC, rather than moving them to the generation. This should give
us slightly better locality (though I wasn't able to measure any
difference).
Also in this patch: better sanity checking with THREADED.
|
|
|
|
|
|
| |
Store the *number* of the destination generation in the Bdescr struct,
so that in evacuate() we don't have to deref gen to get it.
This is another improvement ported over from my GC branch.
|
| |
|
| |
|