| Commit message (Collapse) | Author | Age | Files | Lines |
|
Most of the other users of the fptools build system have migrated to
Cabal, and with the move to darcs we can now flatten the source tree
without losing history, so here goes.
The main change is that the ghc/ subdir is gone, and most of what it
contained is now at the top level. The build system now makes no
pretense at being multi-project; it is just the GHC build system.
No doubt this will break many things, and there will be a period of
instability while we fix the dependencies. A straightforward build
should work, but I haven't yet fixed binary/source distributions.
Changes to the Building Guide will follow, too.
|
We shouldn't call closeCondition() on the condition in discardTask();
we're just freeing the Task for later use.
|
use getThreadCPUTime, not getProcessTimes
|
Improvements to time-measurement and stats:
- move all the platform-dependent timing related stuff into
posix/GetTime.c and win32/GetTime.c, with the machine-independent
interface specified in GetTime.h. This is now used by
Stats.c.
- On Unix, use gettimeofday() and getrusage() by default, falling
back to time() if one of these isn't available.
- try to implement thread-specific CPU-time measurement using
clock_gettime() on Unix. Doesn't work reliably on Linux, because
the implementation tries to use the processor TSC, which on an
SMP machine goes wrong when the thread moves between CPUs. However,
it's slightly less bogus than before, and hopefully will improve
in the future.
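
The fallback order described above might look roughly like the sketch below. It is not the code in posix/GetTime.c; cpu_time_ns() and time_busy_loop() are hypothetical names, and the choice to report only user time in the getrusage() branch is an assumption made to keep the example short.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/resource.h>
#include <sys/time.h>
#include <time.h>

/* Hypothetical helper (not the GetTime.h interface): CPU time in
 * nanoseconds, per-thread where the platform offers a thread clock,
 * otherwise per-process, otherwise wall-clock seconds from time(). */
static uint64_t cpu_time_ns(void)
{
#if defined(CLOCK_THREAD_CPUTIME_ID)
    struct timespec ts;
    if (clock_gettime(CLOCK_THREAD_CPUTIME_ID, &ts) == 0) {
        return (uint64_t)ts.tv_sec * UINT64_C(1000000000)
             + (uint64_t)ts.tv_nsec;
    }
#endif
    /* Fallback 1: user CPU time of the whole process. */
    struct rusage ru;
    if (getrusage(RUSAGE_SELF, &ru) == 0) {
        return (uint64_t)ru.ru_utime.tv_sec * UINT64_C(1000000000)
             + (uint64_t)ru.ru_utime.tv_usec * UINT64_C(1000);
    }
    /* Fallback 2: coarse wall-clock time. */
    return (uint64_t)time(NULL) * UINT64_C(1000000000);
}

/* Example use: CPU time consumed by a busy loop on the calling thread. */
static uint64_t time_busy_loop(unsigned long iterations)
{
    volatile unsigned long sink = 0;
    uint64_t start = cpu_time_ns();
    for (unsigned long i = 0; i < iterations; i++) {
        sink += i;
    }
    uint64_t end = cpu_time_ns();
    assert(end >= start);  /* the clock is expected not to run backwards */
    return end - start;
}
```

Keeping the platform checks inside one function is what lets a caller such as Stats.c stay free of #ifdefs.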
|
Big re-hash of the threaded/SMP runtime
This is a significant reworking of the threaded and SMP parts of
the runtime. There are two overall goals here:
- To push down the scheduler lock, reducing contention and allowing
more parts of the system to run without locks. In particular,
the scheduler does not require a lock any more in the common case.
- To improve affinity, so that running Haskell threads stick to the
same OS threads as much as possible.
At this point we have the basic structure working, but there are some
pieces missing. I believe it's reasonably stable - the important
parts of the testsuite pass in all the (normal,threaded,SMP) ways.
In more detail:
- Each capability now has a run queue, instead of one global run
queue. The Capability and Task APIs have been completely
rewritten; see Capability.h and Task.h for the details.
- Each capability has its own pool of worker Tasks. Hence, Haskell
threads on a Capability's run queue will run on the same worker
Task(s). As long as the OS is doing something reasonable, this
should mean they usually stick to the same CPU. Another way to
look at this is that we're assuming each Capability is associated
with a fixed CPU.
- What used to be StgMainThread is now part of the Task structure.
Every OS thread in the runtime has an associated Task, and it
can ask for its current Task at any time with myTask().
- removed RTS_SUPPORTS_THREADS symbol, use THREADED_RTS instead
(it is now defined for SMP too).
- The RtsAPI has had to change; we must explicitly pass a Capability
around now. The previous interface assumed some global state.
SchedAPI has also changed a lot.
- The OSThreads API now supports thread-local storage, used to
implement myTask(), although it could be done more efficiently
using gcc's __thread extension when available.
- I've moved some POSIX-specific stuff into the posix subdirectory,
moving in the direction of separating out platform-specific
implementations.
- lots of lock-debugging and assertions in the runtime. In particular,
when DEBUG is on, we catch multiple ACQUIRE_LOCK()s, and there is
also an ASSERT_LOCK_HELD() call.
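
As an illustration of the thread-local storage point above, a myTask() lookup could be built on POSIX thread-specific data roughly as follows. This is a sketch only: the Task struct here is a stand-in, and setMyTask(), make_task_key() and worker() are hypothetical names rather than the OSThreads or Task API.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Stand-in for the RTS Task structure (assumption for illustration). */
typedef struct Task_ { int id; } Task;

static pthread_key_t  task_key;
static pthread_once_t task_key_once = PTHREAD_ONCE_INIT;

/* Create the key exactly once, whichever thread gets here first. */
static void make_task_key(void)
{
    int rc = pthread_key_create(&task_key, NULL);
    assert(rc == 0);
    (void)rc;
}

/* Hypothetical setter: record the Task owned by the calling OS thread. */
static void setMyTask(Task *task)
{
    pthread_once(&task_key_once, make_task_key);
    int rc = pthread_setspecific(task_key, task);
    assert(rc == 0);
    (void)rc;
}

/* Return the calling OS thread's Task, or NULL if none was set. */
static Task *myTask(void)
{
    pthread_once(&task_key_once, make_task_key);
    return pthread_getspecific(task_key);
}

/* Demo thread body: installs its own Task and reports the id it sees. */
static void *worker(void *arg)
{
    Task mine = { .id = 2 };
    setMyTask(&mine);
    *(int *)arg = myTask()->id;
    return NULL;
}
```

With gcc's __thread extension the key and the two library calls would collapse into a plain `static __thread Task *my_task;` variable, which is the cheaper variant the message alludes to.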
What's missing so far:
- I have almost certainly broken the Win32 build, will fix soon.
- any kind of thread migration or load balancing. This is high up
the agenda, though.
- various performance tweaks to do
- throwTo and forkProcess still do not work in SMP mode
|
changes to exitScheduler(): instead of waiting for all the tasks to
stop, which is unreasonable, we just wait for the run queue to drain.
This is much quicker, but not ideal (see comments).
|
expandTaskTable: we need to update the hash table too
(found by: Valgrind :-)
initTaskManager: take into account -N flag when sizing the initial
task table.
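
The kind of bug fixed here can be shown with a small sketch. It assumes, for illustration only, that the lookup table stores pointers into a growable array; growing the array with realloc() may move every entry, so the index has to be rebuilt. All names below (TaskInfo, INDEX_SIZE, add_task, find_task and so on) are hypothetical and are not the RTS code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct { unsigned long thread_id; } TaskInfo;

#define INDEX_SIZE 64  /* hypothetical fixed-size open-addressing index */

static TaskInfo *task_table;              /* growable array of tasks */
static size_t    task_cap, task_count;
static TaskInfo *task_index[INDEX_SIZE];  /* thread_id -> entry pointer */

static void index_insert(TaskInfo *t)
{
    size_t i = t->thread_id % INDEX_SIZE;
    while (task_index[i] != NULL) {
        i = (i + 1) % INDEX_SIZE;         /* linear probing */
    }
    task_index[i] = t;
}

static void index_rebuild(void)
{
    memset(task_index, 0, sizeof task_index);
    for (size_t i = 0; i < task_count; i++) {
        index_insert(&task_table[i]);
    }
}

/* Size the initial table from a caller-supplied hint, e.g. the -N value. */
static void init_task_table(size_t initial)
{
    task_cap = initial ? initial : 1;
    task_count = 0;
    task_table = malloc(task_cap * sizeof *task_table);
    assert(task_table != NULL);
    index_rebuild();
}

static void expand_task_table(void)
{
    size_t new_cap = task_cap * 2;
    TaskInfo *moved = realloc(task_table, new_cap * sizeof *moved);
    assert(moved != NULL);
    task_table = moved;
    task_cap = new_cap;
    index_rebuild();  /* old pointers in the index may now be stale */
}

static TaskInfo *add_task(unsigned long id)
{
    assert(task_count < INDEX_SIZE - 1);  /* keep one empty probe slot */
    if (task_count == task_cap) {
        expand_task_table();
    }
    TaskInfo *t = &task_table[task_count++];
    t->thread_id = id;
    index_insert(t);
    return t;
}

static TaskInfo *find_task(unsigned long id)
{
    size_t i = id % INDEX_SIZE;
    while (task_index[i] != NULL) {
        if (task_index[i]->thread_id == id) {
            return task_index[i];
        }
        i = (i + 1) % INDEX_SIZE;
    }
    return NULL;
}
```

Dropping the index_rebuild() call in expand_task_table() reproduces the class of error a tool like Valgrind reports: lookups dereference pointers into the freed old array.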
|
stopTaskManager: don't complain too loudly if we can't stop all the
tasks. The IO manager thread turns out to be an offender here;
perhaps we should start sending signals to threads if they don't stop
when they're told to.
|
- emit a debug message when we're yielding at shutdown time
|
Revamp the Task API: now we use the same implementation for threaded
and SMP. We also keep per-task timing stats in the threaded RTS now,
which makes the output of +RTS -sstderr more useful.
|
Some multi-processor hackery, including
- Don't hang blocked threads off BLACKHOLEs any more, instead keep
them all on a separate queue which is checked periodically for
threads to wake up.
This is good because (a) we don't have to worry about locking the
closure in SMP mode when we want to block on it, and (b) it means
the standard update code doesn't need to wake up any threads or
check for a BLACKHOLE_BQ, simplifying the update code.
The downside is that if there are lots of threads blocked on
BLACKHOLEs, we might have to do a lot of repeated list traversal.
We don't expect this to be common, though. conc023 goes slower
with this change, but we expect most programs to benefit from the
shorter update code.
- Fixing up the Capability code to handle multiple capabilities (SMP
mode), and related changes to get the SMP mode at least building.
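
The separate blocked queue described in the first item can be pictured with the following sketch. The types are deliberately simplified stand-ins (a closure is reduced to a tag, a thread to a list node), and block_on_blackhole() and check_blackholes() are hypothetical names, not the scheduler's functions.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { BLACKHOLE, VALUE } ClosureKind;  /* simplified tags */
typedef struct { ClosureKind kind; } Closure;

typedef struct Thread_ {
    int             id;
    Closure        *blocked_on;
    struct Thread_ *next;
} Thread;

static Thread *blackhole_queue;  /* all threads blocked on BLACKHOLEs */
static Thread *run_queue;        /* threads ready to run */

/* Blocking touches only the queue, never the closure itself. */
static void block_on_blackhole(Thread *t, Closure *c)
{
    assert(c->kind == BLACKHOLE);
    t->blocked_on = c;
    t->next = blackhole_queue;
    blackhole_queue = t;
}

/* Periodic check: move every thread whose closure has been updated
 * onto the run queue. Returns the number of threads woken. */
static int check_blackholes(void)
{
    int woken = 0;
    Thread **link = &blackhole_queue;
    while (*link != NULL) {
        Thread *t = *link;
        if (t->blocked_on->kind != BLACKHOLE) {
            *link = t->next;       /* unlink from the blocked queue */
            t->blocked_on = NULL;
            t->next = run_queue;   /* push onto the run queue */
            run_queue = t;
            woken++;
        } else {
            link = &t->next;
        }
    }
    return woken;
}
```

The update path is then a plain store to the closure, and the cost moves to check_blackholes(), which walks the whole list each time. That traversal is the repeated work the message names as the downside.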
|
Cleanup: all (well, most) messages from the RTS now go through the
functions in RtsUtils: barf(), debugBelch() and errorBelch(). The
latter two were previously called belch() and prog_belch()
respectively. See the comments for the right usage of these message
functions.
One reason for doing this is so that we can avoid spurious uses of
stdout/stderr by Haskell apps on platforms where we shouldn't be using
them (eg. non-console apps on Windows).
|
Threaded RTS: Fix a deadlock situation
The flag startingWorkerThread that is used by startSchedulerTaskIfNecessary
(in Schedule.c) has to be reset if startTask (in Task.c) decides not to
start another task after all (if a task is already waiting).
When the flag isn't reset, this leads to a deadlock the next time a new
worker thread is actually needed.
MERGE TO STABLE
|
ANSIfy
|
Threaded RTS:
Don't start new worker threads earlier than necessary.
After this commit, a Haskell program that uses neither forkOS nor forkIO is
really single-threaded (rather than using two OS threads internally).
Some details:
Worker threads are now only created when a capability is released, and
only when
(there are no worker threads)
&& (there are runnable Haskell threads ||
there are Haskell threads blocked on IO or threadDelay)
awaitEvent can now be called from bound thread scheduling loops
(so that we don't have to create a worker thread just to run awaitEvent)
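
The worker-creation condition given above translates directly into a predicate. The sketch below uses a hypothetical SchedState struct with plain counters in place of the scheduler's real queues, and release_capability() here is a simplified stand-in for the point at which a capability is released.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical summary of scheduler state (assumption for illustration). */
typedef struct {
    int worker_threads;    /* existing worker threads */
    int runnable_threads;  /* runnable Haskell threads */
    int blocked_on_io;     /* Haskell threads blocked on IO or threadDelay */
} SchedState;

/* (no workers) && (runnable threads || threads blocked on IO/threadDelay) */
static bool should_start_worker(const SchedState *s)
{
    assert(s->worker_threads >= 0);
    return s->worker_threads == 0
        && (s->runnable_threads > 0 || s->blocked_on_io > 0);
}

/* Called when a capability is released; returns true if a worker
 * was (notionally) created. */
static bool release_capability(SchedState *s)
{
    if (should_start_worker(s)) {
        s->worker_threads++;
        return true;
    }
    return false;
}
```

A program that never calls forkIO or forkOS keeps both counters at zero while its bound main thread holds the capability, so no worker is ever started.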
|
This commit fixes many bugs and limitations in the threaded RTS.
There are still some issues remaining, though.
The following bugs should have been fixed:
- [+] "safe" calls could cause crashes
- [+] yieldToReturningWorker/grabReturnCapability
- It used to deadlock.
- [+] couldn't wake blocked workers
- Calls into the RTS could go unanswered for a long time, and
that includes ordinary callbacks in some circumstances.
- [+] couldn't block on an MVar and expect to be woken up by a signal
handler
- Depending on the exact situation, the RTS shut down or
blocked forever and ignored the signal.
- [+] The locking scheme in RtsAPI.c didn't work
- [+] run_thread label in wrong place (schedule())
- [+] Deadlock in GHC.Handle
- if a signal arrived at the wrong time, an MVar was never
filled again
- [+] Signals delivered to the "wrong" thread were ignored or handled
too late.
Issues:
*) If GC can move TSO objects (I don't know - can it?), then ghci
will occasionally crash when calling foreign functions, because the
parameters are stored on the TSO stack.
*) There is still a race condition lurking in the code
(both threaded and non-threaded RTS are affected):
If a signal arrives after the check for pending signals in
schedule(), but before the call to select() in awaitEvent(),
select() will be called anyway. The signal handler will be
executed much later than expected.
*) For Win32, GHC doesn't yet support non-blocking IO, so while a
thread is waiting for IO, no call-ins can happen. If the RTS is
blocked in awaitEvent, it uses a polling loop on Win32, so call-ins
should work (although the polling loop looks ugly).
*) Deadlock detection is disabled for the threaded RTS, because I
don't know how to do it properly in the presence of foreign call-ins
from foreign threads.
This causes the tests conc031, conc033 and conc034 to fail.
*) "safe" is currently treated as "threadsafe". Implementing "safe" in
a way that blocks other Haskell threads is more difficult than was
thought at first. I think it could be done with a few additional lines
of code, but personally, I'm strongly in favour of abolishing the
distinction.
*) Running finalizers at program termination is inefficient - there
are two OS threads passing messages back and forth for every finalizer
that is run. Also (just as in the non-threaded case) the finalizers
are run in parallel to any remaining Haskell threads and to any
foreign call-ins that might still happen.
|
stopTaskManager(): no seppuku, please.
|
wibble
|
removed taskNotAvailable(), taskAvailable() and getTaskCount() - simplified away
|
- in the threaded case, keep track of the number of
tasks/threads that are currently waiting to enter
the RTS.
- taskStart():
+ only start up a new thread/task if there aren't
any already waiting to gain RTS access.
+ honour thread/task limits (if any).
|
Factor out the task handling into a separate 'module'.
[Tasks represent native threads that execute STG code, with this
module providing the API which the Scheduler uses to control
their creation and destruction.]
|