| Commit message |
| |
Most of the other users of the fptools build system have migrated to
Cabal, and with the move to darcs we can now flatten the source tree
without losing history, so here goes.
The main change is that the ghc/ subdir is gone, and most of what it
contained is now at the top level. The build system now makes no
pretense at being multi-project; it is just the GHC build system.
No doubt this will break many things, and there will be a period of
instability while we fix the dependencies. A straightforward build
should work, but I haven't yet fixed binary/source distributions.
Changes to the Building Guide will follow, too.
| |
Now, the threaded RTS also includes SMP support. The -smp flag is a
synonym for -threaded. The performance implications of this are small
to negligible, and it results in a code cleanup and reduces the number
of combinations we have to test.
| |
- fix a mixup in Capability.c regarding signals: signals_pending() is not
used in THREADED_RTS
- some cleanups and warning removal while I'm here
| |
oops, undo previous (SMP.h is already included)
| |
#include SMP.h
| |
Big re-hash of the threaded/SMP runtime
This is a significant reworking of the threaded and SMP parts of
the runtime. There are two overall goals here:
- To push down the scheduler lock, reducing contention and allowing
more parts of the system to run without locks. In particular,
the scheduler does not require a lock any more in the common case.
- To improve affinity, so that running Haskell threads stick to the
same OS threads as much as possible.
At this point we have the basic structure working, but there are some
pieces missing. I believe it's reasonably stable - the important
parts of the testsuite pass in all the (normal,threaded,SMP) ways.
In more detail:
- Each capability now has a run queue, instead of one global run
queue. The Capability and Task APIs have been completely
rewritten; see Capability.h and Task.h for the details.
- Each capability has its own pool of worker Tasks. Hence, Haskell
threads on a Capability's run queue will run on the same worker
Task(s). As long as the OS is doing something reasonable, this
should mean they usually stick to the same CPU. Another way to
look at this is that we're assuming each Capability is associated
with a fixed CPU.
- What used to be StgMainThread is now part of the Task structure.
Every OS thread in the runtime has an associated Task, and it
can ask for its current Task at any time with myTask().
- removed RTS_SUPPORTS_THREADS symbol, use THREADED_RTS instead
(it is now defined for SMP too).
- The RtsAPI has had to change; we must explicitly pass a Capability
around now. The previous interface assumed some global state.
SchedAPI has also changed a lot.
- The OSThreads API now supports thread-local storage, used to
implement myTask(), although it could be done more efficiently
using gcc's __thread extension when available.
- I've moved some POSIX-specific stuff into the posix subdirectory,
moving in the direction of separating out platform-specific
implementations.
- lots of lock-debugging and assertions in the runtime. In particular,
when DEBUG is on, we catch multiple ACQUIRE_LOCK()s, and there is
also an ASSERT_LOCK_HELD() call.
What's missing so far:
- I have almost certainly broken the Win32 build, will fix soon.
- any kind of thread migration or load balancing. This is high up
the agenda, though.
- various performance tweaks to do
- throwTo and forkProcess still do not work in SMP mode
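The per-capability run queue described above can be pictured with a minimal sketch. All names here (`SketchTSO`, `SketchCapability`, `sketch_append_to_run_queue`, `sketch_pop_run_queue`) are hypothetical and are not taken from Capability.h or Task.h; the point is only that each capability owns its own FIFO queue, so the common-case queue operations touch no shared state and need no global scheduler lock.

```c
#include <stddef.h>

/* Hypothetical stand-in for a Haskell thread object. */
typedef struct SketchTSO {
    int id;
    struct SketchTSO *link;   /* next thread on the same run queue */
} SketchTSO;

/* Hypothetical capability: it owns a private run queue. */
typedef struct SketchCapability {
    SketchTSO *run_queue_hd;
    SketchTSO *run_queue_tl;
} SketchCapability;

/* Append a thread to the tail of this capability's queue.
 * Only the owning capability touches the queue, so no lock is taken. */
static void sketch_append_to_run_queue(SketchCapability *cap, SketchTSO *tso)
{
    tso->link = NULL;
    if (cap->run_queue_hd == NULL) {
        cap->run_queue_hd = tso;
    } else {
        cap->run_queue_tl->link = tso;
    }
    cap->run_queue_tl = tso;
}

/* Remove and return the thread at the head, or NULL if the queue is empty. */
static SketchTSO *sketch_pop_run_queue(SketchCapability *cap)
{
    SketchTSO *tso = cap->run_queue_hd;
    if (tso != NULL) {
        cap->run_queue_hd = tso->link;
        if (cap->run_queue_hd == NULL) {
            cap->run_queue_tl = NULL;
        }
        tso->link = NULL;
    }
    return tso;
}
```

Thread migration and load balancing, listed as missing above, would be the cases where one capability has to touch another's queue; the sketch does not attempt them.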
| |
Move RTS_SUPPORTS_THREADS into RtsConfig.h
| |
Two SMP-related changes:
- New storage manager interface:
bdescr *allocateLocal(StgRegTable *reg, nat words)
which allocates from the current thread's nursery (being careful
not to clash with the heap pointer). It can do this without
taking any locks; the lock only has to be taken if a block needs
to be allocated. allocateLocal() is now used instead of allocate()
in a few PrimOps.
This removes locks from most Integer operations, cutting down
the overhead for SMP a bit more.
To make this work, we have to be able to grab the current thread's
Capability out of thin air (i.e. when called from GMP), so the
Capability subsystem needs to keep a hash from thread IDs to
Capabilities.
- Small MVar optimisation: instead of taking the global
storage-manager lock, do our own locking of MVars with a bit of
inline assembly (x86 only for now).
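The idea behind allocateLocal() can be sketched as a bump allocator over a thread-private block. This is not the real storage-manager code: the types and names below (`SketchNursery`, `sketch_allocate_local`, `sketch_new_block`) are hypothetical, and a counter stands in for taking the storage-manager lock. It only illustrates that the fast path touches private state and that the lock is needed only when a fresh block is required.

```c
#include <stddef.h>

/* Hypothetical thread-private nursery: one current block, bump-allocated. */
typedef struct SketchNursery {
    size_t used;         /* words already handed out from the current block */
    size_t block_words;  /* capacity of a block, in words */
} SketchNursery;

/* Counts how often the slow path ran; stands in for lock acquisitions. */
static unsigned sketch_lock_acquisitions = 0;

/* Slow path: in a real system this would take the storage-manager lock
 * and fetch a new block. Here it only resets the bump pointer. */
static void sketch_new_block(SketchNursery *n)
{
    sketch_lock_acquisitions++;
    n->used = 0;
}

/* Fast path: returns the word offset of the allocation inside the current
 * block. Assumes words <= n->block_words; larger requests are not handled. */
static size_t sketch_allocate_local(SketchNursery *n, size_t words)
{
    size_t at;
    if (n->used + words > n->block_words) {
        sketch_new_block(n);
    }
    at = n->used;
    n->used += words;
    return at;
}
```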
| |
assertion failures should go through the RtsMessages layer, so they
get a pop-up box in a Windows app.
| |
Add __attribute__((used)) to static functions, as gcc 3.4 -O2 is in the
habit of throwing them away.
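As a small illustration of the attribute mentioned here: a static function that has no visible C caller (for example because it is only referenced from hand-written assembly) can be marked so that gcc keeps it. The macro and function names below are hypothetical, and the fallback to an empty macro on non-GNU compilers is an assumption about how one might keep the code portable.

```c
/* Hypothetical wrapper macro: expands to the GNU attribute only when the
 * compiler advertises GNU extensions, and to nothing otherwise. */
#if defined(__GNUC__)
#define SKETCH_USED __attribute__((used))
#else
#define SKETCH_USED
#endif

/* Without the attribute, an optimising compiler is free to discard a static
 * function that it cannot see being called from C code. */
static SKETCH_USED int sketch_helper(int x)
{
    return x + 1;
}
```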
| |
Rationalise the BUILD,HOST,TARGET defines.
Recall that:
- build is the platform we're building on
- host is the platform we're running on
- target is the platform we're generating code for
The change is that now we take these definitions as applying from the
point of view of the particular source code being built, rather than
the point of view of the whole build tree.
For example, in RTS and library code, we were previously testing the
TARGET platform. But under the new rule, the platform on which this
code is going to run is the HOST platform. TARGET only makes sense in
the compiler sources.
In practical terms, this means that the values of BUILD, HOST & TARGET
may vary depending on which part of the build tree we are in.
Actual changes:
- new file: includes/ghcplatform.h contains platform defines for
the RTS and library code.
- new file: includes/ghcautoconf.h contains the autoconf settings
only (HAVE_BLAH). This is so that we can get hold of these
settings independently of the platform defines when necessary
(eg. in GHC).
- ghcconfig.h now #includes both ghcplatform.h and ghcautoconf.h.
- MachRegs.h, which is included into both the compiler and the RTS,
now has to cope with the fact that it might need to test either
_TARGET_ or _HOST_ depending on the context.
- the compiler's Makefile now generates
stage{1,2,3}/ghc_boot_platform.h
which contains platform defines for the compiler. These differ
depending on the stage, of course: in stage2, the HOST is the
TARGET of stage1. This was wrong before.
- The compiler doesn't get platform info from Config.hs any more.
Previously it did (sometimes), but unless we want to generate
a new Config.hs for each stage we can't do this.
- GHC now helpfully defines *_{BUILD,HOST}_{OS,ARCH} automatically
in CPP'd Haskell source.
- ghcplatform.h defines *_TARGET_* for backwards compatibility
(ghcplatform.h is included by ghcconfig.h, which is included by
config.h, so code which still #includes config.h will get the TARGET
settings as before).
- The User's Guide is updated to mention *_HOST_* rather than
*_TARGET_*.
- coding-style.html in the commentary now contains a section on
platform defines. There are further doc updates to come.
Thanks to Wolfgang Thaller for pointing me in the right direction.
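To make the rule for RTS and library code concrete, here is a hedged sketch. `linux_HOST_OS` and `i386_HOST_ARCH` are used as example instances of the *_HOST_OS and *_HOST_ARCH patterns; they are defined by hand below only so the fragment compiles on its own, whereas in a real tree they would come from ghcplatform.h. The function names are hypothetical.

```c
/* Stand-ins for what includes/ghcplatform.h would provide (assumption:
 * a Linux/x86 host is used purely as an example). */
#define linux_HOST_OS  1
#define i386_HOST_ARCH 1

/* Backwards-compatibility aliases: in RTS and library code the old
 * *_TARGET_* names simply mirror the *_HOST_* ones. */
#if defined(linux_HOST_OS)
#define linux_TARGET_OS 1
#endif
#if defined(i386_HOST_ARCH)
#define i386_TARGET_ARCH 1
#endif

/* New style: RTS code asks about the platform it will run on (HOST). */
static int sketch_host_is_linux(void)
{
#if defined(linux_HOST_OS)
    return 1;
#else
    return 0;
#endif
}

/* Old style: code that still tests *_TARGET_* keeps working. */
static int sketch_legacy_target_is_linux(void)
{
#if defined(linux_TARGET_OS)
    return 1;
#else
    return 0;
#endif
}
```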
| |
drop win32 protos, current windows.h's now provide 'em.
| |
Get rid of SUPPORTS_EMPTY_STRUCTS, and just avoid using empty struct
definitions.
| |
some cleanups
| |
Further to the RTS messaging tidyup: export the new message API and
hooks via RtsMessages.h, so that a client program can easily redirect
messages.
| |
Terminate program if execPage fails, this is more honest and
simplifies things a bit.
| |
Moved createAdjustor and freeHaskellFunctionPtr to a header visible in
*.hc code. The whole header layout is a little bit baroque, IMHO...
| |
Merge backend-hacking-branch onto HEAD. Yay!
| |
Tidy up a couple of unportable coding issues:
- conditionally use empty structs.
- use GNU attributes only if supported.
- 'long long' usage
- use of 'inline' in declarations and definitions.
Upshot of these changes is that MSVC is now capable of compiling
the non-.hc portions of the RTS.
| |
Fixed #ifdefery for GCC >= 3.x
| |
Sigh, I thought I could keep this file private to the RTS, but sadly
it's needed in order to #include RtsFlags.h, and we advertise
RtsFlags.h as a way to tweak flags through defaultsHook(). Oh well.
| |
Merge the eval-apply-branch on to the HEAD
------------------------------------------
This is a change to GHC's evaluation model in order to ultimately make
GHC more portable and to reduce complexity in some areas.
At some point we'll update the commentary to describe the new state of
the RTS. Pending that, the highlights of this change are:
- No more Su. The Su register is gone, update frames are one
word smaller.
- Slow-entry points and arg checks are gone. Unknown function calls
are handled by automatically-generated RTS entry points (AutoApply.hc,
generated by the program in utils/genapply).
- The stack layout is stricter: there are no "pending arguments" on
the stack any more, the stack is always strictly a sequence of
stack frames.
This means that there's no need for LOOKS_LIKE_GHC_INFO() or
LOOKS_LIKE_STATIC_CLOSURE() any more, and GHC doesn't need to know
how to find the boundary between the text and data segments (BIG WIN!).
- A couple of nasty hacks in the mangler caused by the need to
identify closure ptrs vs. info tables have gone away.
- Info tables are a bit more complicated. See InfoTables.h for the
details.
- As a side effect, GHCi can now deal with polymorphic seq. Some bugs
in GHCi which affected primitives and unboxed tuples are now
fixed.
- Binary sizes are reduced by about 7% on x86. Performance is roughly
similar, some programs get faster while some get slower. I've seen
GHCi perform worse on some examples, but haven't investigated
further yet (GHCi performance *should* be about the same or better
in theory).
- Internally the code generator is rather better organised. I've moved
info-table generation from the NCG into the main codeGen where it is
shared with the C back-end; info tables are now emitted as arrays
of words in both back-ends. The NCG is one step closer to being able
to support profiling.
This has all been fairly thoroughly tested, but no doubt I've messed
up the commit in some way.
| |
EXIT_{FAILURE,SUCCESS}: it's time to let the re-definition of these age-old defines go..
| |
as per simonpj request, add mingw protos to avoid -Wmissing-declarations warnings
| |
moved defn of RTS_SUPPORTS_THREADS from Rts.h to Stg.h
| |
new define, RTS_SUPPORTS_THREADS - defined in SMP and 'threaded' modes of operation
| |
extend the scope of the doNothing() macro; can now be used in Stg headers
| |
Wrap the include file entry-points in extern "C" { ... } if this is a
C++ compiler.
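The wrapping described here is the usual guard pattern, sketched below with a hypothetical entry point `sketch_entry_point`. The `__cplusplus` macro is defined only by C++ compilers, so a C compiler sees plain declarations while a C++ compiler sees them with C linkage.

```c
/* Header part: open the C-linkage block only for C++ compilers. */
#ifdef __cplusplus
extern "C" {
#endif

/* Hypothetical entry point exported by the header. */
int sketch_entry_point(int x);

#ifdef __cplusplus
}
#endif /* __cplusplus */

/* Definition, as it would appear in the corresponding .c file. */
int sketch_entry_point(int x)
{
    return x * 2;
}
```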
| |
Changed a bunch of `#endif FOO' to `#endif /* FOO */', the former is
not strictly ANSI (don't know if the latter is, but `gcc -Wall -ansi
-pedantic' is silent then).
| |
Merged GUM-4-04 branch into the main trunk. In particular merged GUM and
SMP code. Most of the GranSim code in GUM-4-04 still has to be carried over.
| |
- remove AllBlocked scheduler return code. Nobody owned up to having
created it or even knowing what it was there for.
- clean up fatal error condition handling somewhat. The process
exit code from a GHC program now indicates the kind of failure
for certain kinds of exit:
    general internal RTS error       254
    program deadlocked               253
    program interrupted (ctrl-C)     252
    heap overflow                    251
    main thread killed               250
and we leave exit codes 1-199 for the user (as is traditional at MS,
200-249 are reserved for future expansion, and may contain
undocumented extensions :-)
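A minimal sketch of how these ranges could be expressed in C follows. Only the numeric values come from the list above; the enumerator and function names are hypothetical and are not claimed to match the RTS headers.

```c
/* Hypothetical names; the values are the ones listed in the commit message. */
enum sketch_rts_exit_code {
    SKETCH_EXIT_INTERNAL_ERROR = 254,  /* general internal RTS error */
    SKETCH_EXIT_DEADLOCK       = 253,  /* program deadlocked */
    SKETCH_EXIT_INTERRUPTED    = 252,  /* program interrupted (ctrl-C) */
    SKETCH_EXIT_HEAPOVERFLOW   = 251,  /* heap overflow */
    SKETCH_EXIT_KILLED         = 250   /* main thread killed */
};

/* Codes 1-199 are left for the user program. */
static int sketch_is_user_exit_code(int code)
{
    return code >= 1 && code <= 199;
}

/* Codes 200-249 are reserved for future expansion. */
static int sketch_is_reserved_exit_code(int code)
{
    return code >= 200 && code <= 249;
}

/* Codes 250-254 are the RTS failure codes listed above. */
static int sketch_is_rts_failure_code(int code)
{
    return code >= SKETCH_EXIT_KILLED && code <= SKETCH_EXIT_INTERNAL_ERROR;
}
```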
| |
Add 'par' and sparking support to the SMP implementation.
| |
A slew of SMP-related changes.
- New locking scheme for thunks: we now check whether the thunk
being entered is in our private allocation area, and if so
we don't lock it. Well, that's the upshot. In practice it's
a lot more fiddly than that.
- I/O blocking is handled a bit more sanely now (but still not
properly, methinks)
- deadlock detection is back
- remove old pre-SMP scheduler code
- revamp the timing code. We actually get reasonable-looking
timing info for SMP programs now.
- fix a bug in the garbage collector to do with IND_OLDGENs appearing
on the mutable list of the old generation.
- move BDescr() function from rts/BlockAlloc.h to includes/Block.h.
- move struct generation and struct step into includes/StgStorage.h (sigh)
- add UPD_IND_NOLOCK for updating with an indirection where locking
the black hole is not required.
| |
This commit adds in the current state of our SMP support. Notably,
this allows the new way 's' to be built, providing support for running
multiple Haskell threads simultaneously on top of any pthreads
implementation, the idea being to take advantage of commodity SMP
boxes.
Don't expect to get much of a speedup yet; due to the excessive
locking required to synchronise access to mutable heap objects, you'll
see a slowdown in most cases, even on a UP machine. The best I've
seen is a 1.6-1.7x speedup on an example that did no locking (two
optimised nfibs in parallel).
- new RTS -N flag specifies how many pthreads to start.
- new driver -smp flag, tells the driver to use way 's'.
- new compiler -fsmp option (not for user consumption)
tells the compiler not to generate direct jumps to
thunk entry code.
- largely rewritten scheduler
- _ccall_GC is now done by handing back a "token" to the
RTS before executing the ccall; it should now be possible
to execute blocking ccalls in the current thread while
allowing the RTS to continue running Haskell threads as
normal.
- you can only call thread-safe C libraries from a way 's'
build, of course.
Pthread support is still incomplete, and weird things (including
deadlocks) are likely to happen.
| |
Support for thread{WaitRead,WaitWrite,Delay}. These should behave
identically to the 3.02 implementations.
We now have the virtual timer on during all program runs, which ticks
at 50Hz by default. This is used to implement threadDelay, so you
won't get any better granularity than the tick frequency
unfortunately. It remains to be seen whether using the virtual timer
will have a measurable impact on performance for non-threadDelaying
programs.
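The granularity remark is simple arithmetic: at 50Hz one tick lasts 1000000 / 50 = 20000 microseconds, so a delay cannot be resolved more finely than 20ms. The sketch below assumes, purely for illustration, that a requested delay is rounded up to a whole number of ticks; the commit message does not say how the rounding is actually done, and the names are hypothetical.

```c
/* Default tick rate stated in the commit message. */
#define SKETCH_TICK_HZ     50UL
/* Length of one tick in microseconds: 1000000 / 50 = 20000. */
#define SKETCH_US_PER_TICK (1000000UL / SKETCH_TICK_HZ)

/* Number of ticks needed to cover a delay of `us` microseconds,
 * rounding up (an assumption made for this sketch). */
static unsigned long sketch_delay_ticks(unsigned long us)
{
    return (us + SKETCH_US_PER_TICK - 1) / SKETCH_US_PER_TICK;
}
```

Under this assumption a request for 1 microsecond and a request for 20000 microseconds both wait one full tick.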
All operations in the I/O subsystem should now be non-blocking with
respect to other running Haskell threads. It remains to be seen
whether this will have a measurable performance impact on
non-concurrent programs (probably not).
| |
Copyright police.
| |
- Add Stable Names
- Stable pointers and stable names are now both provided by the
"Stable" module in ghc/lib/exts. Documentation is updated, and Foreign
still exports the stable pointer operations for backwards compatibility.
| |
Resurrect ticky-ticky profiling. Not quite polished yet, but it
compiles and produces some reasonable-looking stats.
| |
Added a generational garbage collector.
The collector is reliable but fairly untuned as yet. It works with an
arbitrary number of generations: use +RTS -G<gens> to change the
number of generations used (default 2).
Stats: +RTS -Sstderr is quite useful, but to really see what's going
on compile the RTS with -DDEBUG and use +RTS -D32.
ARR_PTRS removed - it wasn't used anywhere.
Sanity checking improved:
- free blocks are now spammed when sanity checking is turned on
- a check for leaking blocks is performed after each GC.
| |
Move 4.01 onto the main trunk.
|