| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We were keeping around the Task struct (216 bytes) for every worker we
ever created, even though we only keep a maximum of 6 workers per
Capability. These Task structs accumulate and cause a space leak in
programs that do lots of safe FFI calls; this patch frees the Task
struct as soon as a worker exits.
One reason we were keeping the Task structs around is because we print
out per-Task timing stats in +RTS -s, but that isn't terribly useful.
What is sometimes useful is knowing how *many* Tasks there were. So
now I'm printing a single-line summary, this is for the program in
TASKS: 2001 (1 bound, 31 peak workers (2000 total), using -N1)
So although we created 2k tasks overall, there were only 31 workers
active at any one time (which is exactly what we expect: the program
makes 30 safe FFI calls concurrently).
This also gives an indication of how many capabilities were being
used, which is handy if you use +RTS -N without an explicit number.
|
|
|
|
|
| |
At present the number of capabilities can only be *increased*, not
decreased. The latter presents a few more challenges!
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Terminology cleanup: the type "Ticks" has been renamed "Time", which
is an StgWord64 in units of TIME_RESOLUTION (currently nanoseconds).
The terminology "tick" is now used consistently to mean the interval
between timer signals.
The ticker now always ticks in realtime (actually CLOCK_MONOTONIC if
we have it). Before it used CPU time in the non-threaded RTS and
realtime in the threaded RTS, but I've discovered that the CPU timer
has terrible resolution (at least on Linux) and isn't much use for
profiling. So now we always use realtime. This should also fix
The default tick interval is now 10ms, except when profiling where we
drop it to 1ms. This gives more accurate profiles without affecting
runtime too much (<1%).
Lots of cleanups - the resolution of Time is now in one place
only (Rts.h) rather than having calculations that depend on the
resolution scattered all over the RTS. I hope I found them all.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
LLVM does not support the __thread attribute for thread
local storage and may generate incorrect code for global
register variables. We want to allow building the runtime with
LLVM-based compilers such as llvm-gcc and clang,
particularly for MacOS.
This patch changes the gct variable used by the garbage
collector to use pthread_getspecific() for thread local
storage when an llvm based compiler is used to build the
runtime.
|
|
|
|
|
|
|
|
|
|
|
| |
Based on a patch from David Terei.
Some parts are a little ugly (e.g. defining things that only ASSERTs
use only when DEBUG is defined), so we might want to tweak things a
little.
I've also turned off -Werror for didn't-inline warnings, as we now
get a few such warnings.
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a port of some of the changes from my private local-GC branch
(which is still in darcs, I haven't converted it to git yet). There
are a couple of small functional differences in the GC stats: first,
per-thread GC timings should now be more accurate, and secondly we now
report average and maximum pause times. e.g. from minimax +RTS -N8 -s:
Tot time (elapsed) Avg pause Max pause
Gen 0 2755 colls, 2754 par 13.16s 0.93s 0.0003s 0.0150s
Gen 1 769 colls, 769 par 3.71s 0.26s 0.0003s 0.0059s
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The bug in this case was that we had a worker thread making a foreign
call which invoked a callback (in this case it was performGC, I
think). When the callback ended, boundTaskExiting() was setting
task->stopped, but the Task is now per-OS-thread, so it is shared by
the worker that made the original foreign call. When the foreign call
returned, because task->stopped was set, the worker was not placed on
the queue of spare workers. Somehow the worker woke up again, and
found the spare_workers queue empty, which lead to a crash.
Two bugs here: task->stopped should not have been set by
boundTaskExiting (this broke when I split the Task and InCall structs,
in 6.12.2), and releaseCapabilityAndQueueWorker() should not be
testing task->stopped anyway, because it should only ever be called
when task->stopped is false (this is now an assertion).
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is patch that adds support for interruptible FFI calls in the form
of a new foreign import keyword 'interruptible', which can be used
instead of 'safe' or 'unsafe'. Interruptible FFI calls act like safe
FFI calls, except that the worker thread they run on may be interrupted.
Internally, it replaces BlockedOnCCall_NoUnblockEx with
BlockedOnCCall_Interruptible, and changes the behavior of the RTS
to not modify the TSO_ flags on the event of an FFI call from
a thread that was interruptible. It also modifies the bytecode
format for foreign call, adding an extra Word16 to indicate
interruptibility.
The semantics of interruption vary from platform to platform, but the
intent is that any blocking system calls are aborted with an error code.
This is most useful for making function calls to system library
functions that support interrupting. There is no support for pre-Vista
Windows.
There is a partner testsuite patch which adds several tests for this
functionality.
|
|
|
|
| |
Which entailed fixing an incorrect #ifdef in Task.c
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In GHC 6.12.x I found a rare deadlock caused by this
lock-order-reversal:
AQ cap->lock
startWorkerTask
newTask
AQ sched_mutex
scheduleCheckBlackHoles
AQ sched_mutex
unblockOne_
wakeupThreadOnCapabilty
AQ cap->lock
so sched_mutex and cap->lock are taken in a different order in two
places.
This doesn't happen in the HEAD because we don't have
scheduleCheckBlackHoles, but I thought it would be prudent to make
this less likely to happen in the future by using a different mutex in
newTask. We can clearly see that the all_tasks mutex cannot be
involved in a deadlock, becasue we never call anything else while
holding it.
|
|
|
|
|
| |
Broken by "Split part of the Task struct into a separate struct
InCall".
|
|
|
|
|
| |
Fixes a bug reported by Lennart Augustsson, whereby we could get an
incorrect error from the RTS about re-entry from a finalizer,
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The idea is that this leaves Tasks and OSThread in one-to-one
correspondence. The part of a Task that represents a call into
Haskell from C is split into a separate struct InCall, pointed to by
the Task and the TSO bound to it. A given OSThread/Task thus always
uses the same mutex and condition variable, rather than getting a new
one for each callback. Conceptually it is simpler, although there are
more types and indirections in a few places now.
This improves callback performance by removing some of the locks that
we had to take when making in-calls. Now we also keep the current Task
in a thread-local variable if supported by the OS and gcc (currently
only Linux).
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The first phase of this tidyup is focussed on the header files, and in
particular making sure we are exposinng publicly exactly what we need
to, and no more.
- Rts.h now includes everything that the RTS exposes publicly,
rather than a random subset of it.
- Most of the public header files have moved into subdirectories, and
many of them have been renamed. But clients should not need to
include any of the other headers directly, just #include the main
public headers: Rts.h, HsFFI.h, RtsAPI.h.
- All the headers needed for via-C compilation have moved into the
stg subdirectory, which is self-contained. Most of the headers for
the rest of the RTS APIs have moved into the rts subdirectory.
- I left MachDeps.h where it is, because it is so widely used in
Haskell code.
- I left a deprecated stub for RtsFlags.h in place. The flag
structures are now exposed by Rts.h.
- Various internal APIs are no longer exposed by public header files.
- Various bits of dead code and declarations have been removed
- More gcc warnings are turned on, and the RTS code is more
warning-clean.
- More source files #include "PosixSource.h", and hence only use
standard POSIX (1003.1c-1995) interfaces.
There is a lot more tidying up still to do, this is just the first
pass. I also intend to standardise the names for external RTS APIs
(e.g use the rts_ prefix consistently), and declare the internal APIs
as hidden for shared libraries.
|
| |
|
|
|
|
|
|
|
| |
There were races between workerTaskStop() and freeTaskManager(): we
need to be sure that all Tasks have exited properly before we start
tearing things down. This isn't completely straighforward, see
comments for details.
|
|
|
|
|
| |
This is just good practice to avoid placing two structures heavily
accessed by different CPUs on the same cache line
|
|
|
|
| |
avoids an assertion failure in newBoundTask()
|
| |
|
|
|
|
| |
See conc059.
|
| |
|
| |
|
|
|
|
| |
Also move closeMutex() etc. into freeTaskManager, this is a free-ish thing
|
|
|
|
|
|
|
|
|
| |
We were freeing the tasks in exitScheduler (stopTaskManager) before
exitStorage (stat_exit), but the latter needs to walk down the list
printing stats. Resulted in segfaults with commands like
ghc -v0 -e main q.hs -H32m -H32m +RTS -Sstderr
(where q.hs is trivial), but very sensitive to exact commandline and
libc version or something.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In preparation for parallel GC, split up the monolithic GC.c file into
smaller parts. Also in this patch (and difficult to separate,
unfortunatley):
- Don't include Stable.h in Rts.h, instead just include it where
necessary.
- consistently use STATIC_INLINE in source files, and INLINE_HEADER
in header files. STATIC_INLINE is now turned off when DEBUG is on,
to make debugging easier.
- The GC no longer takes the get_roots function as an argument.
We weren't making use of this generalisation.
|
| |
|
| |
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
See #805. This was here to catch bugs that resulted in an infinite
number of worker threads being created. However, we can't put a
reasonable bound on the number of worker threads, because legitimate
programs may need to create large numbers of (probably blocked) worker
threads. Furthermore, the OS probably has a bound on the number of
threads that a process can create in any case.
|
|
|
|
|
| |
Patch mostly from Lennart Augustsson in #803, with additions to
Task.c by me.
|
| |
|
| |
|
|
|
|
|
|
|
|
| |
A simple interface for generating trace messages with timestamps and
thread IDs attached to them. Most debugging output goes through this
interface now, so it is straightforward to get timestamped debugging
traces with +RTS -vt. Also, we plan to use this to generate
parallelism profiles from the trace output.
|
|
|
|
|
| |
Previously we did this just for workers, now we do it for the main
thread and for forkOS threads too.
|
|
Most of the other users of the fptools build system have migrated to
Cabal, and with the move to darcs we can now flatten the source tree
without losing history, so here goes.
The main change is that the ghc/ subdir is gone, and most of what it
contained is now at the top level. The build system now makes no
pretense at being multi-project, it is just the GHC build system.
No doubt this will break many things, and there will be a period of
instability while we fix the dependencies. A straightforward build
should work, but I haven't yet fixed binary/source distributions.
Changes to the Building Guide will follow, too.
|