| Commit message | Author | Age | Files | Lines |
| |
This commit adds in the current state of our SMP support. Notably,
this allows the new way 's' to be built, providing support for running
multiple Haskell threads simultaneously on top of any pthreads
implementation, the idea being to take advantage of commodity SMP
boxes.
Don't expect to get much of a speedup yet; due to the excessive
locking required to synchronise access to mutable heap objects, you'll
see a slowdown in most cases, even on a UP machine. The best I've
seen is a 1.6-1.7 speedup on an example that did no locking (two
optimised nfibs in parallel).
- new RTS -N flag specifies how many pthreads to start.
- new driver -smp flag, tells the driver to use way 's'.
- new compiler -fsmp option (not for user consumption)
  tells the compiler not to generate direct jumps to
  thunk entry code.
- largely rewritten scheduler
- _ccall_GC is now done by handing back a "token" to the
  RTS before executing the ccall; it should now be possible
  to execute blocking ccalls in the current thread while
  allowing the RTS to continue running Haskell threads as
  normal.
- you can only call thread-safe C libraries from a way 's'
  build, of course.
Pthread support is still incomplete, and weird things (including
deadlocks) are likely to happen.
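For orientation, here is a minimal sketch of the "two nfibs in parallel" experiment as seen from the program side. It assumes a present-day GHC, where the threaded RTS is chosen with `-threaded` at link time rather than with the `-smp` flag and way 's' described above; that mapping is an assumption, not something this commit states. `nfib` and `spawnNfib` are illustrative names.

```haskell
module Main where

import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (MVar, newEmptyMVar, putMVar, takeMVar)

-- Stand-in for the nfib benchmark: counts the calls made while
-- computing Fibonacci naively. It touches no shared mutable objects.
nfib :: Int -> Int
nfib n
  | n < 2     = 1
  | otherwise = nfib (n - 1) + nfib (n - 2) + 1

-- Run nfib in its own Haskell thread. The result is forced inside the
-- worker (via $!) and handed back through an MVar.
spawnNfib :: Int -> IO (MVar Int)
spawnNfib n = do
  box <- newEmptyMVar
  _ <- forkIO (putMVar box $! nfib n)
  return box

main :: IO ()
main = do
  a <- spawnNfib 25
  b <- spawnNfib 25
  ra <- takeMVar a
  rb <- takeMVar b
  print (ra, rb)
```

With a current compiler this would be built with `ghc -threaded` and run with `+RTS -N2` so that both Haskell threads can execute at the same time.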
|
| |
A regrettably-gigantic commit that puts in place what Simon PJ
has been up to for the last month or so, on and off.
The basic idea was to restore unfoldings to *occurrences* of
variables without introducing a space leak. I wanted to make
sure things improved relative to 4.04, and that proved depressingly
hard. On the way I discovered several quite serious bugs in the
simplifier.
Here's a summary of what's gone on.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
* No commas between for-alls in RULES. This makes the for-alls have
the same syntax as in types.
* Arrange that simplConArgs works in one less pass than before.
This exposed a bug: a bogus call to completeBeta.
* Add a top-level flag in CoreUnfolding, used in callSiteInline
* Extend w/w to use etaExpandArity, so it does eta/coerce expansion
* Implement inline phases. The meaning of the inline pragmas is
described in CoreUnfold.lhs. You can say things like
{-# INLINE 2 build #-}
to mean "inline build in phase 2"
* Don't float anything out of an INLINE.
Don't float things to top level unless they also escape a value lambda.
[see comments with SetLevels.lvlMFE]
Without at least one of these changes, I found that
{-# INLINE concat #-}
concat = __inline (/\a -> foldr (++) [])
was getting floated to
concat = __inline( /\a -> lvl a )
lvl = ...inlined version of foldr...
Subsequently I found that not floating constants out of an INLINE
gave really bad code like
__inline (let x = e in \y -> ...)
so I now let things float out of INLINE
* Implement the "reverse-mapping" idea for CSE; actually it turned out to be easier
to implement it in SetLevels, and may benefit full laziness too.
* It's a good idea to inline inRange. Consider
    index (l,h) i = case inRange (l,h) i of
                      True  -> l+i
                      False -> error
inRange itself isn't strict in h, but if it's inlined then 'index'
*does* become strict in h. Interesting!
* Big change to the way unfoldings and occurrence info are propagated in the simplifier.
The plan is described in Subst.lhs with the Subst type.
Occurrence info is now in a separate IdInfo field from user pragmas.
* I found that
(coerce T (coerce S (\x.e))) y
didn't simplify in one round. First we get to
(\x.e) y
and only then do the beta. Solution: cancel the coerces in the continuation
* Amazingly, CoreUnfold wasn't counting the cost of a function application.
* Disable rules in initial simplifier run. Otherwise full laziness
doesn't get a chance to lift out a MFE before a rule (e.g. fusion)
zaps it. queens is a case in point
* Improve float-out stuff significantly. The big change is that if we have
\x -> ... /\a -> ...let p = ..a.. in let q = ...p...
where p's rhs doesn't mention x, we abstract a from p, so that we can get p past x.
(We did that before.) But we also substitute (p a) for p in q, and then
we can do the same thing for q. (We didn't do that, so q got stuck.)
This is much better. It involves doing a substitution "as we go" in SetLevels,
though.
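As a point of comparison, the sketch below shows a comma-free `forall` in a RULES pragma together with a phase-controlled inlining pragma, written in the syntax accepted by present-day GHC. Two things here are assumptions rather than part of this commit: modern GHC writes the phase in brackets (`[1]`), not bare as in `INLINE 2 build`, and its phases count down towards 0. `myMap` is a hypothetical function used only for illustration.

```haskell
module Main where

-- Hypothetical list map. NOINLINE [1] asks GHC not to inline it
-- before phase 1, which leaves calls visible to the rule below
-- during the earlier phases.
myMap :: (a -> b) -> [a] -> [b]
myMap _ []       = []
myMap f (x : xs) = f x : myMap f xs
{-# NOINLINE [1] myMap #-}

-- The rule quantifies over three variables; as in types, there are
-- no commas between them.
{-# RULES
"myMap/myMap" forall f g xs. myMap f (myMap g xs) = myMap (f . g) xs
  #-}

main :: IO ()
main = print (myMap (+ 1) (myMap (* 2) [1, 2, 3 :: Int]))
```

The program prints the same list whether or not the rule fires; the rule only changes how many intermediate lists are built when optimisation is on.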
|
| |
FFI wibble:
* disallow the use of {Mutable}ByteArrays in 'safe' foreign imports.
* ensure that ForeignObjs live across a _ccall_GC_.
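A small sketch of a 'safe' foreign import in the syntax of present-day GHC (the `foreign import ccall safe` form is an assumption about how the `_ccall_GC_` of this era maps onto the modern FFI). The argument is a `CString` obtained from `withCString`, a buffer that stays put for the duration of the call, rather than an ordinary byte array; presumably the restriction above exists because a garbage collection may run during a safe call. `cLength` is an illustrative name.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
module Main where

import Foreign.C.String (CString, withCString)
import Foreign.C.Types (CSize (..))

-- A 'safe' import of the C library's strlen.
foreign import ccall safe "string.h strlen"
  c_strlen :: CString -> IO CSize

-- Marshal a Haskell String into a temporary C string, call strlen on
-- it, and convert the result to an Int.
cLength :: String -> IO Int
cLength s = withCString s $ \p -> fromIntegral <$> c_strlen p

main :: IO ()
main = cLength "hello" >>= print
```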
|
| |
Crude allocation-counting extension to ticky-ticky profiling.
Allocations are counted against the closest lexically enclosing
function closure, so you need to map the output back to the STG code.
|
| |
A couple of fixes and cleanups to ticky-ticky profiling:
- remove UPD_EXISTING (doesn't make sense)
- add UPD_CON_IN_PLACE, now that we have in-place updates
- clean up the output a little.
|
| |
Main things:
* Add splitProductType_maybe to DataCon.lhs, with type
    splitProductType_maybe
      :: Type                  -- A product type, perhaps
      -> Maybe (TyCon,         -- The type constructor
                [Type],        -- Type args of the tycon
                DataCon,       -- The data constructor
                [Type])        -- Its *representation* arg types
Then use it in many places (e.g. worker-wrapper places) instead
of a pile of junk
* Clean up various uses of dataConArgTys, which were plain wrong because
they weren't passed the existential type arguments. Most of these calls
are eliminated by using splitProductType_maybe above. I hope I correctly
squashed the others. This fixes a bug that Meurig's programs showed up.
    module FailGHC (killSustainer) where
    import Weak
    import IOExts
    data Sustainer = forall a . Sustainer (IORef (Maybe a)) (IO ())
    killSustainer :: Sustainer -> IO ()
    killSustainer (Sustainer _ act) = act
The above program used to kill the compiler.
* A fairly concerted attack on the Dreaded Space Leak.
- Add Type.seqType, CoreSyn.seqExpr, CoreSyn.seqRules
- Add some seq'ing when building Ids and IdInfos
These reduce the space usage a lot
- Add CoreSyn.coreBindsSize, which is pretty strict in the program,
and call it when we have -dshow-passes.
- Do not put the inlining in an Id that is being plugged into
the result-expression of the simplifier. This cures
the 'wedge' in the space profile, for reasons I don't fully understand
Together, these things reduce the max space usage when compiling PrelNum from
17M to about 7Mbytes.
I think there are now *too many* seqs, and they waste work, but I don't have
time to find which ones.
Furthermore, we aren't done. For some reason, some of the stuff allocated by
the simplifier makes it all the way through code generation and I don't see why.
There's a should-be-unnecessary call to coreBindsSize in Main.main which
zaps some, but not all of this space.
-dshow-passes reduces space usage a bit, but I don't think it should really.
All the measurements were made on a compiler compiled with profiling by
GHC 3.03. I hope they carry over to other builds!
* One trivial thing: changed all variables 'label' to 'lbl', because the
former is a keyword with -fglasgow-exts in GHC 3.03 (which I was compiling with).
Something similar in StringBuffer.
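The seq'ing idea can be sketched on a toy syntax tree. Everything below is a hypothetical stand-in: `Expr` is not GHC's Core, and `seqExpr` and `exprSize` only mimic the roles of CoreSyn.seqExpr and CoreSyn.coreBindsSize as described above. The point is that walking the whole structure leaves no unevaluated thunks behind to hold on to older data.

```haskell
module Main where

-- Toy expression type, a hypothetical stand-in for Core.
data Expr
  = Var String
  | App Expr Expr
  | Lam String Expr

-- Walk the whole tree, forcing every node (and each name to weak head
-- normal form); yields () once the traversal is complete.
seqExpr :: Expr -> ()
seqExpr (Var v)   = v `seq` ()
seqExpr (App f a) = seqExpr f `seq` seqExpr a
seqExpr (Lam b e) = b `seq` seqExpr e

-- Count the nodes. Computing the result visits every constructor,
-- so demanding the size also forces the tree.
exprSize :: Expr -> Int
exprSize (Var _)   = 1
exprSize (App f a) = 1 + exprSize f + exprSize a
exprSize (Lam _ e) = 1 + exprSize e

main :: IO ()
main = do
  let e = Lam "x" (App (Var "f") (Var "x"))
  -- Force the tree first, then report its size.
  seqExpr e `seq` print (exprSize e)
```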
|
| |
Update to match CgUsages.hi-boot-5
|
| |
* Add Type.repType
* Re-express splitRepTyConApp_maybe using repType
* Use the new repType in Core2Stg
The bug was that we ended up with a binding like
let x = /\a -> 3# +# y
in ...
and this should turn into an STG case, but the big lambda
fooled the core-to-STG pass
|
| |
Jump to the join point when returning a new constructor to a bind
default. Fixes: recent panic in mkStaticAlgReturnCode.
|
| |
- Implement update-in-place in certain very specialised circumstances
- Clean up abstract C a bit
- Speed up pretty-printing absC a bit.
|
| |
Many small tuning changes
|
| |
Move some code around to reduce the linkage between CgMonad and CgBindery,
and make the .hi-boot-5 file compatible with both 4.02 and 4.03.
|
| |
Remove debugging trace that sneaked in.
|
| |
Use ClosureInfo.hi-boot instead of ClosureInfo.hi (which might not be
built yet).
|
| |
Update the comment for buildLivenessMask to match reality.
|
| |
Allow reserving of stack slots for non-pointer data (eg. cost
centres). This means the previous hacks to keep the stack bitmaps
correct in the presence of cost centres are now unnecessary, and
case-of-case expressions will be compiled properly with profiling on.
|
| |
Existential constructors NEVER WORKED! You were JUST IMAGINING IT!
|
| |
Enable rules for simplification of SeqOp
Fix a related bug in WwLib that made it look as if the binder
in a case expression was being demanded, when it wasn't.
|
| |
Several bugfixes (from SLPJ's tree).
|
| |
RULES-NOTES
|
| |
Support for "unregisterised" builds. An unregisterised build doesn't
use the assembly mangler, doesn't do tail jumping (uses the
mini-interpreter), and doesn't use global register variables.
Plenty of cleanups and bugfixes in the process.
Add way 'u' to GhcLibWays to get unregisterised libs & RTS.
[ note: not *quite* working fully yet... there's still a bug or two
lurking ]
|
| |
(this is number 7 of 9 commits to be applied together)
The code generator now incorporates the update avoidance
optimisation: a thunk of __o type is now made SingleEntry rather
than Updatable.
We want to verify that SingleEntry thunks are indeed entered at most
once. In order to do this, -ticky turns on eager blackholing.
Ordinary thunks will be dealt with by the RTS, but CAFs are
blackholed by the code generator. We blackhole with new blackholes:
SE_CAF_BLACKHOLE. We will enter one of these if we attempt to enter
a SingleEntry thunk twice.
|
| |
Fix bug in tagToEnum#: if the amode of the tag overlapped with node,
bogus code would be generated. Now load the tag into a temporary
before doing the table lookup.
|
| |
- Fix the tagToEnum# support in the code generator
- Make isDeadBinder work on case binders
- Fix compiling of
      case x `op` y of z {
        True  -> ... z ...
        False -> ... z ...
      }
- Clean up CgCase a little.
- Don't generate specialised tag2con functions for derived Enum/Ix
instances; use tagToEnum# instead.
|
| |
- New Wired-in Id: getTag# :: a -> Int#
for a data type, returns the tag of the constructor.
for a function, returns a spurious number probably.
dataToTag# is the name of the underlying primitive which
pulls out the tag (its argument is assumed to be
evaluated).
- Generate constructor tables for enumerated types, so we
can do tagToEnum#.
- Remove hacks in CoreToStg for dataToTag#.
|
| |
Support for
dataToTag# :: a -> Int# (if a is a data type)
and (partial) support for
tagToEnum# :: Int# -> a (if a is an enumerated type)
The con2tag functions generated by derived Eq,Ord and Enum instances
are now replaced by dataToTag# for data types with a large number of
constructors.
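A minimal sketch of the two primitives in use, assuming they are imported from `GHC.Exts` as in current compilers (the module this commit exposed them through is not stated here). `Colour`, `con2tag`, `tag2con` and `sameCon` are illustrative names, not the functions the deriving code generates.

```haskell
{-# LANGUAGE MagicHash #-}
module Main where

import GHC.Exts (Int (I#), dataToTag#, tagToEnum#)

-- A small enumerated type for illustration.
data Colour = Red | Green | Blue
  deriving Show

-- Constructor to tag. The argument is evaluated first, since the
-- primitive is described as assuming an evaluated argument.
con2tag :: Colour -> Int
con2tag c = c `seq` I# (dataToTag# c)

-- Tag to constructor. Only meaningful for tags 0..2 here; the
-- primitive performs no range check.
tag2con :: Int -> Colour
tag2con (I# t) = tagToEnum# t

-- Comparing constructors by tag, in the spirit of a derived Eq on an
-- enumeration.
sameCon :: Colour -> Colour -> Bool
sameCon a b = con2tag a == con2tag b

main :: IO ()
main = do
  print (map con2tag [Red, Green, Blue])
  print (tag2con 2)
  print (sameCon Green Green)
```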
|
| |
Remove hack to force setting the CCCS when we enter a function closure
defined inside a lambda. We use a more general solution now.
|
| |
Profiling fixes:
Function closures which are inside a lambda now *set* the CCCS,
instead of possibly appending to it.
In Simplify.lhs: allow inlining imported functions when profiling.
What we really want to do is allow any top-level binding to be
inlined, but there doesn't seem to be an easy way to tell whether a
binding is top-level or not.
|
| |
More profiling fixes.
|
| |
Profiling fixes.
- top-level CAF CCSs now *append* themselves to the
current CCS when called.
- remove DICT stuff.
- fixes to the auto-scc annotating in the desugarer.
|
| |
Fix cost centres on PAPs.
|
| |
Previous commit broke let-no-escape. Fix it up again.
|
| |
Fix cost centre restores for unboxed tuple alternatives.
|
| |
Save a few bytes by omitting the static link field on closures with
an empty SRT.
|
| |
Fix bug in mkRegLiveness causing bogus heap checks to be generated on the Sparc.
|
| |
Top-level non-updatable thunks get closure type FUN_STATIC, not
THUNK_STATIC. (helps the garbage collector decide where the static
link field should be).
|
| |
Add missing default case to mkRegLiveness.
|
| |
- import list tweaks
- moved the code that decides that a StgCon really shouldn't
be mapped to a static constructor but an updateable thunk
if it contains lit-lits from the codegen into the CoreToStg
translation.
Added an extra case to this code to deal with StgCon's that contain
references to values that reside in a DLL, where we also have to
opt for an updateable thunk instead of a static constructor. Only
applies when compiling on/for Win32 platforms.
|
| |
Some native codegen updates.
|
| |
Undo bogus fix to CgCase.lhs
|
| |
Finally! This commits the ongoing saga of Simon's hygiene sweep
FUNCTIONALITY
~~~~~~~~~~~~~
a) The 'unused variable' warnings from the renamer work.
b) Better error messages here and there, esp type checker
c) Fixities for Haskell 98 (maybe I'd done that before)
d) Lazy reporting of name clashes for Haskell 98 (ditto)
HYGIENE
~~~~~~~
a) type OccName has its own module. OccNames are represented
by a single FastString, not three as in the last round. This
string is held in Z-encoded form; a decoding function decodes
for printing in user error messages. There's a nice tight
encoding for (,,,,,,,,,,,,,,,,,,,,,,,,,,,,,)
b) type Module is a proper ADT, in module OccName
c) type RdrName is a proper ADT, in its own module
d) type Name has a new, somewhat tidier, representation
e) much grunting in the renamer to get Provenances right.
This makes error messages look better (no spurious qualifiers)
|
| |
- Add specialised closure types (CONSTR_p_n, THUNK_p_n, FUN_p_n)
- Add -T<n> RTS flag to specify the number of steps in younger generations.
|
| |
Fix more uses of [n..m]
|
| |
Fix two uses of [ e1 .. e2 ] in light of the new Haskell 98 semantics.
|
| |
Resurrect ticky-ticky profiling. Not quite polished yet, but it
compiles and produces some reasonable-looking stats.
|
| |
long long support: cleared up Real vs. virtual regs. confusion (I hope!)
|
| |
Haskell 98 updates.
|
| |
Assorted minor Haskell 98 changes:
* Maximal munch rule for "--" comments
* _ as lower-case letter, "_" is a reserved id. Prefixing unused
variable names in patterns with '_' causes the renamer not to
report such names as being unused.
* allow empty decls
* comprehensions are now list comprehensions, not monadic.
* use Monad.fail to signal pattern matching errors within
do expressions.
* remove record punning.
* empty contexts are now legal (go wild!)
* allow records with no fields
* allow newtypes with a labelled field
* default default is now (Integer, Double)
* turn off defaulting mechanism for args & res to a _ccall_.
* allow LHSs of the form (a -.- b) x = ...
* Main.main can now have type (IO a)
* nuked Void (and its use in the compiler sources.)
* deriving machinery for Enum now also generates 'succ' and 'pred'
method bindings.
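Several of the items above can be seen together in one small program. This is a sketch written against a present-day GHC, so one detail is an assumption: the pattern-match failure in the `do` block is routed through `MonadFail` today rather than `Monad.fail`. All names (`Unit`, `Age`, `firstOf`, `evens`, `firstElem`) are illustrative.

```haskell
module Main where

-- A record with no fields.
data Unit = Unit {}

-- A newtype with a labelled field.
newtype Age = Age { years :: Int }

-- A definition whose left-hand side has the form (a -.- b) x = ...
(-.-) :: (b -> c) -> (a -> b) -> a -> c
(f -.- g) x = f (g x)

-- The leading '_' marks a variable that is deliberately unused, so
-- the renamer does not report it.
firstOf :: (a, b) -> a
firstOf (x, _unused) = x

-- A comprehension, which is specifically a list comprehension.
evens :: [Int] -> [Int]
evens xs = [x | x <- xs, even x]

-- A failing pattern match inside do-notation calls 'fail', which for
-- Maybe produces Nothing.
firstElem :: Maybe [Int] -> Maybe Int
firstElem m = do
  (x : _) <- m
  return x

-- Main.main may have type IO a; the result value is discarded.
main :: IO Int
main = do
  print (years (Age 30))
  print ((succ -.- (* 2)) (5 :: Int))
  print (firstOf ('a', Unit))
  print (evens [1 .. 6])
  print (firstElem (Just []))
  return 0
```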
|
| |
Sort unboxed slots - part of the fix for large bitmaps.
|
| |
trim import
|