| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Most of the other users of the fptools build system have migrated to
Cabal, and with the move to darcs we can now flatten the source tree
without losing history, so here goes.
The main change is that the ghc/ subdir is gone, and most of what it
contained is now at the top level. The build system now makes no
pretense at being multi-project, it is just the GHC build system.
No doubt this will break many things, and there will be a period of
instability while we fix the dependencies. A straightforward build
should work, but I haven't yet fixed binary/source distributions.
Changes to the Building Guide will follow, too.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Big re-hash of the threaded/SMP runtime
This is a significant reworking of the threaded and SMP parts of
the runtime. There are two overall goals here:
- To push down the scheduler lock, reducing contention and allowing
more parts of the system to run without locks. In particular,
the scheduler does not require a lock any more in the common case.
- To improve affinity, so that running Haskell threads stick to the
same OS threads as much as possible.
At this point we have the basic structure working, but there are some
pieces missing. I believe it's reasonably stable - the important
parts of the testsuite pass in all the (normal,threaded,SMP) ways.
In more detail:
- Each capability now has a run queue, instead of one global run
queue. The Capability and Task APIs have been completely
rewritten; see Capability.h and Task.h for the details.
- Each capability has its own pool of worker Tasks. Hence, Haskell
threads on a Capability's run queue will run on the same worker
Task(s). As long as the OS is doing something reasonable, this
should mean they usually stick to the same CPU. Another way to
look at this is that we're assuming each Capability is associated
with a fixed CPU.
- What used to be StgMainThread is now part of the Task structure.
Every OS thread in the runtime has an associated Task, and it
can ask for its current Task at any time with myTask().
- removed RTS_SUPPORTS_THREADS symbol, use THREADED_RTS instead
(it is now defined for SMP too).
- The RtsAPI has had to change; we must explicitly pass a Capability
around now. The previous interface assumed some global state.
SchedAPI has also changed a lot.
- The OSThreads API now supports thread-local storage, used to
implement myTask(), although it could be done more efficiently
using gcc's __thread extension when available.
- I've moved some POSIX-specific stuff into the posix subdirectory,
moving in the direction of separating out platform-specific
implementations.
- lots of lock-debugging and assertions in the runtime. In particular,
when DEBUG is on, we catch multiple ACQUIRE_LOCK()s, and there is
also an ASSERT_LOCK_HELD() call.
What's missing so far:
- I have almost certainly broken the Win32 build, will fix soon.
- any kind of thread migration or load balancing. This is high up
the agenda, though.
- various performance tweaks to do
- throwTo and forkProcess still do not work in SMP mode
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Some multi-processor hackery, including
- Don't hang blocked threads off BLACKHOLEs any more, instead keep
them all on a separate queue which is checked periodically for
threads to wake up.
This is good because (a) we don't have to worry about locking the
closure in SMP mode when we want to block on it, and (b) it means
the standard update code doesn't need to wake up any threads or
check for a BLACKHOLE_BQ, simplifying the update code.
The downside is that if there are lots of threads blocked on
BLACKHOLEs, we might have to do a lot of repeated list traversal.
We don't expect this to be common, though. conc023 goes slower
with this change, but we expect most programs to benefit from the
shorter update code.
- Fixing up the Capability code to handle multiple capabilities (SMP
mode), and related changes to get the SMP mode at least building.
|
|
|
|
|
|
|
|
|
|
| |
* Some preprocessors don't like the C99/C++ '//' comments after a
directive, so use '/* */' instead. For consistency, a lot of '//' in
the include files were converted, too.
* UnDOSified libraries/base/cbits/runProcess.c.
* My favourite sport: Killed $Id$s.
|
|
|
|
|
|
|
|
|
|
|
|
| |
Tidy up a couple of unportable coding issues:
- conditionally use empty structs.
- use GNU attributes only if supported.
- 'long long' usage
- use of 'inline' in declarations and definitions.
Upshot of these changes is that MSVC is now capable of compiling
the non-.hc portions of the RTS.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Bound Threads
=============
Introduce a way to use foreign libraries that rely on thread local state
from multiple threads (mainly affects the threaded RTS).
See the file threads.tex in CVS at haskell-report/ffi/threads.tex
(not entirely finished yet) for a definition of this extension. A less formal
description is also found in the documentation of Control.Concurrent.
The changes mostly affect the THREADED_RTS (./configure --enable-threaded-rts),
except for saving & restoring errno on a per-TSO basis, which is also necessary
for the non-threaded RTS (a bugfix).
Detailed list of changes
------------------------
- errno is saved in the TSO object and restored when necessary:
ghc/includes/TSO.h, ghc/rts/Interpreter.c, ghc/rts/Schedule.c
- rts_mainLazyIO is no longer needed, main is no special case anymore
ghc/includes/RtsAPI.h, ghc/rts/RtsAPI.c, ghc/rts/Main.c, ghc/rts/Weak.c
- passCapability: a new function that releases the capability and "passes"
it to a specific OS thread:
ghc/rts/Capability.h ghc/rts/Capability.c
- waitThread(), scheduleWaitThread() and schedule() get an optional
Capability *initialCapability passed as an argument:
ghc/includes/SchedAPI.h, ghc/rts/Schedule.c, ghc/rts/RtsAPI.c
- Bound Thread scheduling (that's what this is all about):
ghc/rts/Schedule.h, ghc/rts/Schedule.c
- new Primop isCurrentThreadBound#:
ghc/compiler/prelude/primops.txt.pp, ghc/includes/PrimOps.h, ghc/rts/PrimOps.hc,
ghc/rts/Schedule.h, ghc/rts/Schedule.c
- a simple function, rtsSupportsBoundThreads, that returns true if THREADED_RTS
is defined:
ghc/rts/Schedule.h, ghc/rts/Schedule.c
- a new implementation of forkProcess (the old implementation stays in place
for the non-threaded case). Partially broken; works for the standard
fork-and-exec case, but not for much else. A proper forkProcess is
really next to impossible to implement:
ghc/rts/Schedule.c
- Library support for bound threads:
Control.Concurrent.
rtsSupportsBoundThreads, isCurrentThreadBound, forkOS,
runInBoundThread, runInUnboundThread
libraries/base/Control/Concurrent.hs, libraries/base/Makefile,
libraries/base/include/HsBase.h, libraries/base/cbits/forkOS.c (new file)
|
|
|
|
|
|
| |
Nuked prototype for non-existent function 'RevertCAFs'. Even if this
should be the same as ghc/rts/Storage.h's 'revertCAFs', it would still
be a strange place...
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Merge the eval-apply-branch on to the HEAD
------------------------------------------
This is a change to GHC's evaluation model in order to ultimately make
GHC more portable and to reduce complexity in some areas.
At some point we'll update the commentary to describe the new state of
the RTS. Pending that, the highlights of this change are:
- No more Su. The Su register is gone, update frames are one
word smaller.
- Slow-entry points and arg checks are gone. Unknown function calls
are handled by automatically-generated RTS entry points (AutoApply.hc,
generated by the program in utils/genapply).
- The stack layout is stricter: there are no "pending arguments" on
the stack any more, the stack is always strictly a sequence of
stack frames.
This means that there's no need for LOOKS_LIKE_GHC_INFO() or
LOOKS_LIKE_STATIC_CLOSURE() any more, and GHC doesn't need to know
how to find the boundary between the text and data segments (BIG WIN!).
- A couple of nasty hacks in the mangler caused by the neet to
identify closure ptrs vs. info tables have gone away.
- Info tables are a bit more complicated. See InfoTables.h for the
details.
- As a side effect, GHCi can now deal with polymorphic seq. Some bugs
in GHCi which affected primitives and unboxed tuples are now
fixed.
- Binary sizes are reduced by about 7% on x86. Performance is roughly
similar, some programs get faster while some get slower. I've seen
GHCi perform worse on some examples, but haven't investigated
further yet (GHCi performance *should* be about the same or better
in theory).
- Internally the code generator is rather better organised. I've moved
info-table generation from the NCG into the main codeGen where it is
shared with the C back-end; info tables are now emitted as arrays
of words in both back-ends. The NCG is one step closer to being able
to support profiling.
This has all been fairly thoroughly tested, but no doubt I've messed
up the commit in some way.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When handling external call-ins (via the RTS API) in
the multi-threaded case, add the StgMainThread that
the external thread is going to block waiting on
to the main_threads list prior to scheduling the new
worker thread.
Do this by having the scheduler provide a new entry
point, scheduleWaitThread().
Fixes a bug/race condition spotted by Wolfgang Thaller
(see scheduleWaitThread() comment) + enables a little
tidier interface between RtsAPI and Schedule.
|
|
|
|
|
|
|
|
|
|
|
| |
Distinguish between the scheduling of a new thread from within
the RTS (e.g., via forkIO, running finalizers etc) and scheduling
of a thread that's created via the RtsAPI -- the latter
now uses scheduleExtThread(), the rest scheduleThread().
Why the distinction? Because the former will in threaded builds create
a worker OS thread, while the latter won't. (There's an added
wrinkle -- main() will also use scheduleThread()).
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
-*- outline -*-
Time-stamp: <Thu Mar 22 2001 03:50:16 Stardate: [-30]6365.79 hwloidl>
This commit covers changes in GHC to get GUM (way=mp) and GUM/GdH (way=md)
working. It is a merge of my working version of GUM, based on GHC 4.06,
with GHC 4.11. Almost all changes are in the RTS (see below).
GUM is reasonably stable, we used the 4.06 version in large-ish programs for
recent papers. Couple of things I want to change, but nothing urgent.
GUM/GdH has just been merged and needs more testing. Hope to do that in the
next weeks. It works in our working build but needs tweaking to run.
GranSim doesn't work yet (*sigh*). Most of the code should be in, but needs
more debugging.
ToDo: I still want to make the following minor modifications before the release
- Better wrapper skript for parallel execution [ghc/compiler/main]
- Update parallel docu: started on it but it's minimal [ghc/docs/users_guide]
- Clean up [nofib/parallel]: it's a real mess right now (*sigh*)
- Update visualisation tools (minor things only IIRC) [ghc/utils/parallel]
- Add a Klingon-English glossary
* RTS:
Almost all changes are restricted to ghc/rts/parallel and should not
interfere with the rest. I only comment on changes outside the parallel
dir:
- Several changes in Schedule.c (scheduling loop; createThreads etc);
should only affect parallel code
- Added ghc/rts/hooks/ShutdownEachPEHook.c
- ghc/rts/Linker.[ch]: GUM doesn't know about Stable Names (ifdefs)!!
- StgMiscClosures.h: END_TSO_QUEUE etc now defined here (from StgMiscClosures.hc)
END_ECAF_LIST was missing a leading stg_
- SchedAPI.h: taskStart now defined in here; it's only a wrapper around
scheduleThread now, but might use some init, shutdown later
- RtsAPI.h: I have nuked the def of rts_evalNothing
* Compiler:
- ghc/compiler/main/DriverState.hs
added PVM-ish flags to the parallel way
added new ways for parallel ticky profiling and distributed exec
- ghc/compiler/main/DriverPipeline.hs
added a fct run_phase_MoveBinary which is called with way=mp after linking;
it moves the bin file into a PVM dir and produces a wrapper script for
parallel execution
maybe cleaner to add a MoveBinary phase in DriverPhases.hs but this way
it's less intrusive and MoveBinary makes probably only sense for mp anyway
* Nofib:
- nofib/spectral/Makefile, nofib/real/Makefile, ghc/tests/programs/Makefile:
modified to skip some tests if HWL_NOFIB_HACK is set; only tmp to record
which test prgs cause problems in my working build right now
|
|
|
|
| |
merge recent changes from before-ghci-branch onto the HEAD
|
|
|
|
|
|
| |
Change the names of several RTS symbols so they don't potentially
clash with user code. All the symbols that may clash now have an
"stg_" prefix.
|
|
|
|
|
|
| |
Clean up the runtime heap before deleting modules (and, currently, after
every evaluation) so that the combined system can safely throw away
modules and info tables without creating dangling refs from the heap.
|
|
|
|
|
| |
Merged GUM-4-04 branch into the main trunk. In particular merged GUM and
SMP code. Most of the GranSim code in GUM-4-04 still has to be carried over.
|
|
|
|
|
|
| |
In hugs, implement ThreadId(..), instance Eq/Ord ThreadId,
and forkIO. Add deleteAllThreads() to scheduler so Hugs can
clean up after evaluation.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit adds in the current state of our SMP support. Notably,
this allows the new way 's' to be built, providing support for running
multiple Haskell threads simultaneously on top of any pthreads
implementation, the idea being to take advantage of commodity SMP
boxes.
Don't expect to get much of a speedup yet; due to the excessive
locking required to synchronise access to mutable heap objects, you'll
see a slowdown in most cases, even on a UP machine. The best I've
seen is a 1.6-1.7 speedup on an example that did no locking (two
optimised nfibs in parallel).
- new RTS -N flag specifies how many pthreads to start.
- new driver -smp flag, tells the driver to use way 's'.
- new compiler -fsmp option (not for user comsumption)
tells the compiler not to generate direct jumps to
thunk entry code.
- largely rewritten scheduler
- _ccall_GC is now done by handing back a "token" to the
RTS before executing the ccall; it should now be possible
to execute blocking ccalls in the current thread while
allowing the RTS to continue running Haskell threads as
normal.
- you can only call thread-safe C libraries from a way 's'
build, of course.
Pthread support is still incomplete, and weird things (including
deadlocks) are likely to happen.
|
|
|
|
|
|
| |
Redo previous commit to cut down on the use of COMPILING_RTS where
possible - SchedAPI.h is now an RTS internal header file which
RtsAPI.h no longer includes.
|
|
|
|
|
|
| |
New RTS entry point, shutdownHaskellAndExit(), which does what the name
implies - used when you want to exit from within Haskell code (e.g.,
System.exitWith.)
|
|
|
|
| |
suppress needless warning
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Made rts_evalIO() stricter, i.e.,
rts_evalIO( action );
will now essentially cause `action' to be applied
to the following (imaginary) defn of `evalIO':
evalIO :: IO a -> IO a
evalIO action = action >>= \ x -> x `seq` return x
instead of just
evalIO :: IO a -> IO a
evalIO action = action >>= \ x -> return x
The old, lazier behaviour is now available via rts_evalLazyIO().
|
|
Move 4.01 onto the main trunk.
|