| Commit message (Collapse) | Author | Age | Files | Lines |
|
Most of the other users of the fptools build system have migrated to
Cabal, and with the move to darcs we can now flatten the source tree
without losing history, so here goes.
The main change is that the ghc/ subdir is gone, and most of what it
contained is now at the top level. The build system now makes no
pretense at being multi-project; it is just the GHC build system.
No doubt this will break many things, and there will be a period of
instability while we fix the dependencies. A straightforward build
should work, but I haven't yet fixed binary/source distributions.
Changes to the Building Guide will follow, too.
|
Improve the GC behaviour of IORefs (see Ticket #650).
This is a small change to the way IORefs interact with the GC, which
should improve GC performance for programs with plenty of IORefs.
Previously we had a single closure type for mutable variables,
MUT_VAR. Mutable variables were *always* on the mutable list in older
generations, and always traversed on every GC.
Now, we have two closure types: MUT_VAR_CLEAN and MUT_VAR_DIRTY. The
latter is on the mutable list, but the former is not. (NB. this
differs from MUT_ARR_PTRS_CLEAN and MUT_ARR_PTRS_DIRTY, both of which
are on the mutable list). writeMutVar# now implements a write
barrier by calling dirty_MUT_VAR() in the runtime, which changes
MUT_VAR_CLEAN into MUT_VAR_DIRTY and adds the variable to the mutable
list if necessary.
This results in some pretty dramatic speedups for GHC itself. I've
just measured a 30% overall speedup compiling a 31-module program
(anna) with the default heap settings :-D
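The write barrier described in this commit can be illustrated with a minimal sketch. Everything here is a simplified stand-in, not the RTS code: the MutVar and MutList structs, the generation field, and the function names dirty_mut_var and write_mut_var are hypothetical, and the real dirty_MUT_VAR() may differ in detail.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the two closure types. */
typedef enum { MUT_VAR_CLEAN, MUT_VAR_DIRTY } MutVarType;

typedef struct MutVar {
    MutVarType type;   /* stands in for the closure's info pointer */
    void *value;       /* current contents of the variable */
    int generation;    /* 0 = youngest; an assumption of this sketch */
} MutVar;

#define MUT_LIST_MAX 64

/* Hypothetical fixed-size mutable list for old-generation objects. */
typedef struct {
    MutVar *entries[MUT_LIST_MAX];
    size_t count;
} MutList;

/* Write barrier: turn a CLEAN variable DIRTY and, if it lives in an
 * old generation, record it on the mutable list exactly once. */
static void dirty_mut_var(MutList *ml, MutVar *mv)
{
    if (mv->type == MUT_VAR_CLEAN) {
        mv->type = MUT_VAR_DIRTY;
        if (mv->generation > 0) {
            assert(ml->count < MUT_LIST_MAX);
            ml->entries[ml->count++] = mv;
        }
    }
}

/* A write runs the barrier first, then stores the new value. */
static void write_mut_var(MutList *ml, MutVar *mv, void *new_value)
{
    dirty_mut_var(ml, mv);
    mv->value = new_value;
}
```

In this sketch a second write to an already dirty variable costs only the type check, and a variable that is never written stays off the mutable list entirely.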
|
Improve the GC behaviour of IOArrays/STArrays
See Ticket #650
This is a small change to the way mutable arrays interact with the GC,
which can have a dramatic effect on performance and makes tricks with
unsafeThaw/unsafeFreeze redundant. Data.HashTable should be faster
now (I haven't measured it yet).
We now have two mutable array closure types, MUT_ARR_PTRS_CLEAN and
MUT_ARR_PTRS_DIRTY. Both are on the mutable list if the array is in
an old generation. writeArray# sets the type to MUT_ARR_PTRS_DIRTY.
The garbage collector can set the type to MUT_ARR_PTRS_CLEAN if it
finds that no element of the array points into a younger generation
(discovering this required a small addition to evacuate(), but rough
tests indicate that it doesn't measurably affect performance).
NOTE: none of this affects unboxed arrays (IOUArray/STUArray), only
boxed arrays (IOArray/STArray).
We could go further and extend the DIRTY bit to be per-block rather
than for the whole array, but for now this is an easy improvement.
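As a rough sketch of the scheme above: a write marks the whole array dirty, and the collector resets it to clean when no element points into a younger generation. The Obj and MutArr structs and the function names are hypothetical. Skipping arrays that are already clean is an assumption about where the saving comes from, and the real collector also evacuates the elements as it scans them.

```c
#include <stdbool.h>
#include <stddef.h>

typedef enum { MUT_ARR_PTRS_CLEAN, MUT_ARR_PTRS_DIRTY } MutArrType;

/* Hypothetical heap object: only its generation matters here. */
typedef struct Obj { int generation; } Obj;

/* Hypothetical boxed mutable array. */
typedef struct {
    MutArrType type;   /* stands in for the info pointer */
    int generation;    /* generation the array itself lives in */
    size_t n;          /* number of elements */
    Obj **elems;       /* the boxed elements */
} MutArr;

/* Mutator side: every write marks the whole array dirty. */
static void write_array(MutArr *arr, size_t i, Obj *v)
{
    arr->elems[i] = v;
    arr->type = MUT_ARR_PTRS_DIRTY;
}

/* Collector side: returns true if the array still points into a
 * younger generation and so has to stay dirty. */
static bool scavenge_array(MutArr *arr)
{
    bool points_young = false;

    if (arr->type == MUT_ARR_PTRS_CLEAN) {
        return false;  /* nothing written since the last scan */
    }
    for (size_t i = 0; i < arr->n; i++) {
        if (arr->elems[i] != NULL &&
            arr->elems[i]->generation < arr->generation) {
            points_young = true;
        }
    }
    arr->type = points_young ? MUT_ARR_PTRS_DIRTY : MUT_ARR_PTRS_CLEAN;
    return points_young;
}
```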
|
Fix some problems with array thawing/freezing and the GC.
|
Remove the ForeignObj# type, and all its PrimOps. The new efficient
representation of ForeignPtr doesn't use ForeignObj# underneath, and
there seems no need to keep it.
|
Update to match ClosureTypes.h
|
Support for atomic memory transactions and associated regression tests conc041-048
|
Removed the annoying "Id" CVS keywords, they're a real PITA when it
comes to merging...
|
Merge the eval-apply-branch on to the HEAD
------------------------------------------
This is a change to GHC's evaluation model in order to ultimately make
GHC more portable and to reduce complexity in some areas.
At some point we'll update the commentary to describe the new state of
the RTS. Pending that, the highlights of this change are:
- No more Su. The Su register is gone, update frames are one
word smaller.
- Slow-entry points and arg checks are gone. Unknown function calls
are handled by automatically-generated RTS entry points (AutoApply.hc,
generated by the program in utils/genapply).
- The stack layout is stricter: there are no "pending arguments" on
the stack any more, the stack is always strictly a sequence of
stack frames.
This means that there's no need for LOOKS_LIKE_GHC_INFO() or
LOOKS_LIKE_STATIC_CLOSURE() any more, and GHC doesn't need to know
how to find the boundary between the text and data segments (BIG WIN!).
- A couple of nasty hacks in the mangler caused by the need to
identify closure ptrs vs. info tables have gone away.
- Info tables are a bit more complicated. See InfoTables.h for the
details.
- As a side effect, GHCi can now deal with polymorphic seq. Some bugs
in GHCi which affected primitives and unboxed tuples are now
fixed.
- Binary sizes are reduced by about 7% on x86. Performance is roughly
similar, some programs get faster while some get slower. I've seen
GHCi perform worse on some examples, but haven't investigated
further yet (GHCi performance *should* be about the same or better
in theory).
- Internally the code generator is rather better organised. I've moved
info-table generation from the NCG into the main codeGen where it is
shared with the C back-end; info tables are now emitted as arrays
of words in both back-ends. The NCG is one step closer to being able
to support profiling.
This has all been fairly thoroughly tested, but no doubt I've messed
up the commit in some way.
|
Update this file to not use ISO C99 labelled initializers - this means
it will compile on MacOS/X.
|
Change the story about POSIX headers in C compilation.
Until now, all C code in the RTS and library cbits has by default been
compiled with settings for POSIXness enabled, that is:
#define _POSIX_SOURCE 1
#define _POSIX_C_SOURCE 199309L
#define _ISOC9X_SOURCE
If you wanted to negate this, you'd have to define NON_POSIX_SOURCE
before including headers.
This scheme has some bad effects:
* It means that ccall-unfoldings exported via interfaces from a
module compiled with -DNON_POSIX_SOURCE may not compile when
imported into a module which does not -DNON_POSIX_SOURCE.
* It overlaps with the feature tests we do with autoconf.
* It seems to have caused borkage in the Solaris builds for some
considerable period of time.
The New Way is:
* The default changes to not-being-in-Posix mode.
* If you want to force a C file into Posix mode, #include as
the **first** include the new file ghc/includes/PosixSource.h.
Most of the RTS C sources have this include now.
* NON_POSIX_SOURCE is almost totally expunged. Unfortunately
we have to retain some vestiges of it in ghc/compiler so that
modules compiled via C on Solaris using older compilers don't
break.
|
Add a compacting garbage collector.
It isn't enabled by default, as there are still a couple of problems:
there's a fallback case I haven't implemented yet which means it will
occasionally bomb out, and speed-wise it's quite a bit slower than the
copying collector (about 1.8x slower).
Until I can make it go faster, it'll only be useful when you're
actually running low on real memory.
'+RTS -c' to enable it.
Oh, and I cleaned up a few things in the RTS while I was there, and
fixed one or two possibly real bugs in the existing GC.
|
-*- outline -*-
Time-stamp: <Thu Mar 22 2001 03:50:16 Stardate: [-30]6365.79 hwloidl>
This commit covers changes in GHC to get GUM (way=mp) and GUM/GdH (way=md)
working. It is a merge of my working version of GUM, based on GHC 4.06,
with GHC 4.11. Almost all changes are in the RTS (see below).
GUM is reasonably stable, we used the 4.06 version in large-ish programs for
recent papers. Couple of things I want to change, but nothing urgent.
GUM/GdH has just been merged and needs more testing. Hope to do that in the
next weeks. It works in our working build but needs tweaking to run.
GranSim doesn't work yet (*sigh*). Most of the code should be in, but needs
more debugging.
ToDo: I still want to make the following minor modifications before the release
- Better wrapper script for parallel execution [ghc/compiler/main]
- Update parallel docu: started on it but it's minimal [ghc/docs/users_guide]
- Clean up [nofib/parallel]: it's a real mess right now (*sigh*)
- Update visualisation tools (minor things only IIRC) [ghc/utils/parallel]
- Add a Klingon-English glossary
* RTS:
Almost all changes are restricted to ghc/rts/parallel and should not
interfere with the rest. I only comment on changes outside the parallel
dir:
- Several changes in Schedule.c (scheduling loop; createThreads etc);
should only affect parallel code
- Added ghc/rts/hooks/ShutdownEachPEHook.c
- ghc/rts/Linker.[ch]: GUM doesn't know about Stable Names (ifdefs)!!
- StgMiscClosures.h: END_TSO_QUEUE etc now defined here (from StgMiscClosures.hc)
END_ECAF_LIST was missing a leading stg_
- SchedAPI.h: taskStart now defined in here; it's only a wrapper around
scheduleThread now, but might use some init, shutdown later
- RtsAPI.h: I have nuked the def of rts_evalNothing
* Compiler:
- ghc/compiler/main/DriverState.hs
added PVM-ish flags to the parallel way
added new ways for parallel ticky profiling and distributed exec
- ghc/compiler/main/DriverPipeline.hs
added a fct run_phase_MoveBinary which is called with way=mp after linking;
it moves the bin file into a PVM dir and produces a wrapper script for
parallel execution
maybe cleaner to add a MoveBinary phase in DriverPhases.hs but this way
it's less intrusive and MoveBinary makes probably only sense for mp anyway
* Nofib:
- nofib/spectral/Makefile, nofib/real/Makefile, ghc/tests/programs/Makefile:
modified to skip some tests if HWL_NOFIB_HACK is set; only tmp to record
which test prgs cause problems in my working build right now
|
Add a new closure flag, IND, to identify indirections.
|
Remove the old Hugs CAF code, install our own (minimal, somewhat
cryptic, but better commented) CAF reversion story. See
Storage.c:newCaf() for the details.
|
Merged GUM-4-04 branch into the main trunk. In particular merged GUM and
SMP code. Most of the GranSim code in GUM-4-04 still has to be carried over.
|
mark INDirections as non-sparkable.
|
A slew of SMP-related changes.
- New locking scheme for thunks: we now check whether the thunk
being entered is in our private allocation area, and if so
we don't lock it. Well, that's the upshot. In practice it's
a lot more fiddly than that.
- I/O blocking is handled a bit more sanely now (but still not
properly, methinks)
- deadlock detection is back
- remove old pre-SMP scheduler code
- revamp the timing code. We actually get reasonable-looking
timing info for SMP programs now.
- fix a bug in the garbage collector to do with IND_OLDGENs appearing
on the mutable list of the old generation.
- move BDescr() function from rts/BlockAlloc.h to includes/Block.h.
- move struct generation and struct step into includes/StgStorage.h (sigh)
- add UPD_IND_NOLOCK for updating with an indirection where locking
the black hole is not required.
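The first item in the list above (skip the lock when the thunk lies in our private allocation area) boils down to an address range check. The sketch below shows only that check. The AllocArea struct and both function names are hypothetical, and the commit itself warns that the real scheme is a lot more fiddly.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of one thread's private allocation area. */
typedef struct {
    uintptr_t start;  /* address of the first byte of the area */
    uintptr_t end;    /* address one past the last byte */
} AllocArea;

/* True if the closure's address falls inside the given area. */
static bool in_private_area(const AllocArea *area, const void *closure)
{
    uintptr_t p = (uintptr_t)closure;
    return p >= area->start && p < area->end;
}

/* Lock only thunks outside our own area. A thunk we have just
 * allocated is presumably not yet reachable by any other thread;
 * that reasoning is an assumption of this sketch. */
static bool thunk_needs_lock(const AllocArea *area, const void *thunk)
{
    return !in_private_area(area, thunk);
}
```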
|
This commit adds in the current state of our SMP support. Notably,
this allows the new way 's' to be built, providing support for running
multiple Haskell threads simultaneously on top of any pthreads
implementation, the idea being to take advantage of commodity SMP
boxes.
Don't expect to get much of a speedup yet; due to the excessive
locking required to synchronise access to mutable heap objects, you'll
see a slowdown in most cases, even on a UP machine. The best I've
seen is a 1.6-1.7 speedup on an example that did no locking (two
optimised nfibs in parallel).
- new RTS -N flag specifies how many pthreads to start.
- new driver -smp flag, tells the driver to use way 's'.
- new compiler -fsmp option (not for user consumption)
tells the compiler not to generate direct jumps to
thunk entry code.
- largely rewritten scheduler
- _ccall_GC is now done by handing back a "token" to the
RTS before executing the ccall; it should now be possible
to execute blocking ccalls in the current thread while
allowing the RTS to continue running Haskell threads as
normal.
- you can only call thread-safe C libraries from a way 's'
build, of course.
Pthread support is still incomplete, and weird things (including
deadlocks) are likely to happen.
|
(this is number 9 of 9 commits to be applied together)
Usage verification changes / ticky-ticky changes:
We want to verify that SingleEntry thunks are indeed entered at most
once. In order to do this, -ticky / -DTICKY_TICKY turns on eager
blackholing. We blackhole with new blackholes: SE_BLACKHOLE and
SE_CAF_BLACKHOLE. We will enter one of these if we attempt to enter
a SingleEntry thunk twice. Note that CAFs are dealt with by
codeGen, and ordinary thunks by the RTS.
We also want to see how many times we enter each Updatable thunk.
To this end, we have modified -ticky. When -ticky is on, we update
with a permanent indirection, and arrange that when we enter a
permanent indirection we count the entry and then convert the
indirection to a normal indirection. This gives us a means of
counting the number of thunks entered again after the first entry.
Obviously this screws up profiling, and so you can't build a ticky
and profiling compiler any more.
Also a few other changes that didn't make it into the previous 8
commits, but form a part of this set.
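The re-entry counting trick for Updatable thunks can be sketched as follows. The Ind struct, the IndKind enum, and enter_ind are hypothetical simplifications, and passing the counter explicitly is a choice made for this sketch rather than how the ticky counters are actually stored.

```c
/* Hypothetical two-state indirection: a thunk is updated with a
 * permanent indirection, which is demoted on its first entry. */
typedef enum { IND_NORMAL, IND_PERM } IndKind;

typedef struct {
    IndKind kind;
    int *target;   /* the value the thunk was updated with */
} Ind;

/* Entering a permanent indirection counts the entry once and then
 * converts it to a normal indirection, so the counter ends up
 * holding the number of thunks entered again after their first
 * entry rather than the total number of entries. */
static int *enter_ind(Ind *ind, unsigned long *reentered_thunks)
{
    if (ind->kind == IND_PERM) {
        (*reentered_thunks)++;
        ind->kind = IND_NORMAL;
    }
    return ind->target;
}
```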
|
|
Remove flags field from info tables; create a separate table of flags
indexed by the closure type in the RTS.
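A separate flags table indexed by closure type might look roughly like this. The closure kinds, flag bits, and helper names are all hypothetical. The table uses positional initialisers rather than C99 labelled ones, in keeping with the portability fix mentioned in a later commit in this log.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical subset of closure types, in table order. */
typedef enum {
    CT_CONSTR,
    CT_FUN,
    CT_THUNK,
    CT_IND,
    CT_MUT_VAR,
    CT_N_TYPES
} ClosureKind;

/* Hypothetical flag bits. */
#define FLAG_HNF     (1u << 0)  /* head normal form */
#define FLAG_MUTABLE (1u << 1)  /* contents may be overwritten */
#define FLAG_IND     (1u << 2)  /* an indirection */

/* One entry per closure type, in enum order (positional on purpose). */
static const unsigned closure_flags[CT_N_TYPES] = {
    /* CT_CONSTR  */ FLAG_HNF,
    /* CT_FUN     */ FLAG_HNF,
    /* CT_THUNK   */ 0,
    /* CT_IND     */ FLAG_IND,
    /* CT_MUT_VAR */ FLAG_HNF | FLAG_MUTABLE
};

/* Look up the flags for a closure type, with a bounds check. */
static unsigned flags_for(ClosureKind t)
{
    assert((unsigned)t < (unsigned)CT_N_TYPES);
    return closure_flags[t];
}

static bool is_indirection(ClosureKind t)
{
    return (flags_for(t) & FLAG_IND) != 0;
}

static bool is_mutable(ClosureKind t)
{
    return (flags_for(t) & FLAG_MUTABLE) != 0;
}
```

Keeping the flags in one table means an info table no longer needs its own flags field, and changing a flag for a closure type touches a single row.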
|