summaryrefslogtreecommitdiff
path: root/docs/comm
diff options
context:
space:
mode:
Diffstat (limited to 'docs/comm')
-rw-r--r--docs/comm/exts/ndp.html360
-rw-r--r--docs/comm/exts/th.html197
-rw-r--r--docs/comm/feedback.html34
-rw-r--r--docs/comm/genesis/genesis.html82
-rw-r--r--docs/comm/genesis/makefiles.html51
-rw-r--r--docs/comm/genesis/modules.html164
-rw-r--r--docs/comm/index.html121
-rw-r--r--docs/comm/others.html60
-rw-r--r--docs/comm/rts-libs/foreignptr.html68
-rw-r--r--docs/comm/rts-libs/multi-thread.html445
-rw-r--r--docs/comm/rts-libs/non-blocking.html133
-rw-r--r--docs/comm/rts-libs/prelfound.html57
-rw-r--r--docs/comm/rts-libs/prelude.html121
-rw-r--r--docs/comm/rts-libs/primitives.html70
-rw-r--r--docs/comm/rts-libs/stgc.html45
-rw-r--r--docs/comm/rts-libs/threaded-rts.html126
-rw-r--r--docs/comm/the-beast/alien.html56
-rw-r--r--docs/comm/the-beast/basicTypes.html132
-rw-r--r--docs/comm/the-beast/coding-style.html230
-rw-r--r--docs/comm/the-beast/data-types.html242
-rw-r--r--docs/comm/the-beast/desugar.html156
-rw-r--r--docs/comm/the-beast/driver.html179
-rw-r--r--docs/comm/the-beast/fexport.html231
-rw-r--r--docs/comm/the-beast/ghci.html407
-rw-r--r--docs/comm/the-beast/main.html35
-rw-r--r--docs/comm/the-beast/mangler.html79
-rw-r--r--docs/comm/the-beast/modules.html80
-rw-r--r--docs/comm/the-beast/names.html169
-rw-r--r--docs/comm/the-beast/ncg.html749
-rw-r--r--docs/comm/the-beast/optimistic.html65
-rw-r--r--docs/comm/the-beast/prelude.html207
-rw-r--r--docs/comm/the-beast/renamer.html249
-rw-r--r--docs/comm/the-beast/simplifier.html86
-rw-r--r--docs/comm/the-beast/stg.html164
-rw-r--r--docs/comm/the-beast/syntax.html99
-rw-r--r--docs/comm/the-beast/typecheck.html316
-rw-r--r--docs/comm/the-beast/types.html179
-rw-r--r--docs/comm/the-beast/vars.html235
38 files changed, 0 insertions, 6479 deletions
diff --git a/docs/comm/exts/ndp.html b/docs/comm/exts/ndp.html
deleted file mode 100644
index 2c79d728d5..0000000000
--- a/docs/comm/exts/ndp.html
+++ /dev/null
@@ -1,360 +0,0 @@
-<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
-<html>
- <head>
- <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
- <title>The GHC Commentary - Parallel Arrays</title>
- </head>
-
- <body BGCOLOR="FFFFFF">
- <h1>The GHC Commentary - Parallel Arrays</h1>
- <p>
- This section describes an experimental extension by high-performance
- arrays, which comprises special syntax for array types and array
- comprehensions, a set of optimising program transformations, and a set
- of special purpose libraries. The extension is currently only partially
- implemented, but the development will be tracked here.
- <p>
- Parallel arrays originally got their name from the aim to provide an
- architecture-independent programming model for a range of parallel
- computers. However, since experiments showed that the approach is also
- worthwhile for sequential array code, the emphasis has shifted to their
- parallel evaluation semantics: As soon as any element in a parallel
- array is demanded, all the other elements are evaluated, too. This
- makes parallel arrays more strict than <a
- href="http://haskell.org/onlinelibrary/array.html">standard Haskell 98
- arrays</a>, but also opens the door for a loop-based implementation
- strategy that leads to significantly more efficient code.
- <p>
- The programming model as well as the use of the <em>flattening
- transformation</em>, which is central to the approach, has its origin in
- the programming language <a
- href="http://www.cs.cmu.edu/~scandal/nesl.html">Nesl</a>.
-
- <h2>More Sugar: Special Syntax for Array Comprehensions</h2>
- <p>
- The option <code>-XParr</code>, which is a dynamic hsc option that can
- be reversed with <code>-XNoParr</code>, enables special syntax for
- parallel arrays, which is not essential to using parallel arrays, but
- makes for significantly more concise programs. The switch works by
- making the lexical analyser (located in <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/parser/Lex.lhs"><code>Lex.lhs</code></a>)
- recognise the tokens <code>[:</code> and <code>:]</code>. Given that
- the additional productions in the parser (located in <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/parser/Parser.y"><code>Parser.y</code></a>)
- cannot be triggered without the lexer generating the necessary tokens,
- there is no need to alter the behaviour of the parser.
- <p>
- The following additional syntax is accepted (the non-terminals are those
- from the <a href="http://haskell.org/onlinereport/">Haskell 98 language
- definition</a>):
- <p>
- <blockquote><pre>
-atype -> '[:' type ':] (parallel array type)
-
-aexp -> '[:' exp1 ',' ... ',' expk ':]' (explicit array, k >= 0)
- | '[:' exp1 [',' exp2] '..' exp3 ':]' (arithmetic array sequence)
- | '[:' exp '|' quals1 '|' ... '|' qualsn ':]' (array comprehension, n >= 1)
-
-quals -> qual1 ',' ... ',' qualn (qualifier list, n >= 1)
-
-apat -> '[:' pat1 ',' ... ',' patk ':]' (array pattern, k >= 0)
-</pre>
- </blockquote>
- <p>
- Moreover, the extended comprehension syntax that allows for <em>parallel
- qualifiers</em> (i.e., qualifiers separated by "<code>|</code>") is also
- supported in list comprehensions. In general, the similarity to the
- special syntax for list is obvious. The two main differences are that
- (a) arithmetic array sequences are always finite and (b)
- <code>[::]</code> is not treated as a constructor in expressions and
- patterns, but rather as a special case of the explicit array syntax.
- The former is a simple consequence of the parallel evaluation semantics
- of parallel arrays and the latter is due to arrays not being a
- constructor-based data type.
- <p>
- As a naming convention, types and functions that are concerned with
- parallel arrays usually contain the string <code>parr</code> or
- <code>PArr</code> (often as a prefix), and where corresponding types or
- functions for handling lists exist, the name is identical, except for
- containing the substring <code>parr</code> instead of <code>list</code>
- (possibly in caps).
- <p>
- The following implications are worth noting explicitly:
- <ul>
- <li>As the value and pattern <code>[::]</code> is an empty explicit
- parallel array (i.e., something of the form <code>ExplicitPArr ty
- []</code> in the AST). This is in contrast to lists, which use the
- nil-constructor instead. In the case of parallel arrays, using a
- constructor would be rather awkward, as it is not a constructor-based
- type. (This becomes rather clear in the desugarer.)
- <li>As a consequence, array patterns have the general form <code>[:p1,
- p2, ..., pn:]</code>, where <code>n</code> >= 0. Thus, two array
- patterns overlap iff they have the same length -- an important property
- for the pattern matching compiler.
- </ul>
-
- <h2>Prelude Support for Parallel Arrays</h2>
- <p>
- The Prelude module <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/lib/std/PrelPArr.lhs"><code>PrelPArr</code></a>
- defines the standard operations and their types on parallel arrays and
- provides a basic implementation based on boxed arrays. The interface of
- <code>PrelPArr</code> is oriented by H98's <code>PrelList</code>, but
- leaving out all functions that require infinite structures and adding
- frequently needed array operations, such as permutations. Parallel
- arrays are quite unqiue in that they use an entirely different
- representation as soon as the flattening transformation is activated,
- which is described further below. In particular, <code>PrelPArr</code>
- defines the type <code>[::]</code> and operations to create, process,
- and inspect parallel arrays. The type as well as the names of some of
- the operations are also hardwired in <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/prelude/TysWiredIn.lhs"><code>TysWiredIn</code></a>
- (see the definition of <code>parrTyCon</code> in this module) and <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/prelude/PrelNames.lhs"><code>PrelNames</code></a>.
- This is again very much like the case of lists, where the type is
- defined in <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/lib/std/PrelBase.lhs"><code>PrelBase</code></a>
- and similarly wired in; however, for lists the entirely
- constructor-based definition is exposed to user programs, which is not
- the case for parallel arrays.
-
- <h2>Desugaring Parallel Arrays</h2>
- <p>
- The parallel array extension requires the desugarer to replace all
- occurrences of (1) explicit parallel arrays, (2) array patterns, and (3)
- array comprehensions by a suitable combination of invocations of
- operations defined in <code>PrelPArr</code>.
-
- <h4>Explicit Parallel Arrays</h4>
- <p>
- These are by far the simplest to remove. We simply replace every
- occurrence of <code>[:<i>e<sub>1</sub></i>, ...,
- <i>e<sub>n</sub></i>:]</code> by
- <blockquote>
- <code>
- toP [<i>e<sub>1</sub></i>, ..., <i>e<sub>n</sub></i>]
- </code>
- </blockquote>
- <p>
- i.e., we build a list of the array elements, which is, then, converted
- into a parallel array.
-
- <h4>Parallel Array Patterns</h4>
- <p>
- Array patterns are much more tricky as element positions may contain
- further patterns and the <a
- href="../the-beast/desugar.html#patmat">pattern matching compiler</a>
- requires us to flatten all those out. But before we turn to the gory
- details, here first the basic idea: A flat array pattern matches exactly
- iff it's length corresponds to the length of the matched array. Hence,
- if we have a set of flat array patterns matching an array value
- <code>a</code>, it suffices to generate a Core <code>case</code>
- expression that scrutinises <code>lengthP a</code> and has one
- alternative for every length of array occuring in one of the patterns.
- Moreover, there needs to be a default case catching all other array
- lengths. In each alternative, array indexing (i.e., the functions
- <code>!:</code>) is used to bind array elements to the corresponding
- pattern variables. This sounds easy enough and is essentially what the
- parallel array equation of the function <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/deSugar/DsUtils.lhs"><code>DsUtils</code></a><code>.mkCoAlgCaseMatchResult</code>
- does.
- <p>
- Unfortunately, however, the pattern matching compiler expects that it
- can turn (almost) any pattern into variable patterns, literals, or
- constructor applications by way of the functions <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/deSugar/Match.lhs"><code>Match</code></a><code>.tidy1</code>.
- And to make matters worse, some weird machinery in the module <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/deSugar/Check.lhs"><code>Check</code></a>
- insists on being able to reverse the process (essentially to pretty
- print patterns in warnings for incomplete or overlapping patterns).
- <p>
- The solution to this is an (unlimited) set of <em>fake</em> constructors
- for parallel arrays, courtesy of <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/prelude/TysWiredIn.lhs"><code>TysWiredIn</code></a><code>.parrFakeCon</code>.
- In other words, any pattern of the form <code>[:<i>p<sub>1</sub></i>,
- ..., <i>p<sub>n</sub></i>:]</code> is transformed into
- <blockquote>
- <code>
- MkPArray<i>n</i> <i>p<sub>1</sub></i> ... <i>p<sub>n</sub></i>
- </code>
- </blockquote>
- <p>
- by <code>Match.tidy1</code>, then, run through the rest of the pattern
- matching compiler, and finally, picked up by
- <code>DsUtils.mkCoAlgCaseMatchResult</code>, which converts it into a
- <code>case</code> expression as outlined above.
- <p>
- As an example consider the source expression
- <blockquote><pre>
-case v of
- [:x1:] -> e1
- [:x2, x3, x4:] -> e2
- _ -> e3</pre>
- </blockquote>
- <p>
- <code>Match.tidy1</code> converts it into a form that is equivalent to
- <blockquote><pre>
-case v of {
- MkPArr1 x1 -> e1;
- MkPArr2 x2 x3 x4 -> e2;
- _ -> e3;
-}</pre>
- </blockquote>
- <p>
- which <code>DsUtils.mkCoAlgCaseMatchResult</code> turns into the
- following Core code:
- <blockquote><pre>
- case lengthP v of
- Int# i# ->
- case i# of l {
- DFT -> e3
- 1 -> let x1 = v!:0 in e1
- 3 -> let x2 = v!:0; x2 = v!:1; x3 = v!:2 in e2
- }</pre>
- </blockquote>
-
- <h4>Parallel Array Comprehensions</h4>
- <p>
- The most challenging construct of the three are array comprehensions.
- In principle, it would be possible to transform them in essentially the
- same way as list comprehensions, but this would lead to abysmally slow
- code as desugaring of list comprehensions generates code that is
- optimised for sequential, constructor-based structures. In contrast,
- array comprehensions need to be transformed into code that solely relies
- on collective operations and avoids the creation of many small
- intermediate arrays.
- <p>
- The transformation is implemented by the function <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/deSugar/DsListComp.lhs"><code>DsListComp</code></a><code>.dsPArrComp</code>.
- In the following, we denote this transformation function by the form
- <code>&lt;&lt;<i>e</i>&gt;&gt; pa ea</code>, where <code><i>e</i></code>
- is the comprehension to be compiled and the arguments <code>pa</code>
- and <code>ea</code> denote a pattern and the currently processed array
- expression, respectively. The invariant constraining these two
- arguments is that all elements in the array produced by <code>ea</code>
- will <em>successfully</em> match against <code>pa</code>.
- <p>
- Given a source-level comprehensions <code>[:e | qss:]</code>, we compile
- it with <code>&lt;&lt;[:e | qss:]&gt;&gt; () [:():]</code> using the
- rules
- <blockquote><pre>
-<<[:e' | :]>> pa ea = mapP (\pa -> e') ea
-<<[:e' | b , qs:]>> pa ea = <<[:e' | qs:]>> pa (filterP (\pa -> b) ea)
-<<[:e' | p <- e, qs:]>> pa ea =
- let ef = filterP (\x -> case x of {p -> True; _ -> False}) e
- in
- <<[:e' | qs:]>> (pa, p) (crossP ea ef)
-<<[:e' | let ds, qs:]>> pa ea =
- <<[:e' | qs:]>> (pa, (x_1, ..., x_n))
- (mapP (\v@pa -> (v, let ds in (x_1, ..., x_n))) ea)
-where
- {x_1, ..., x_n} = DV (ds) -- Defined Variables
-<<[:e' | qs | qss:]>> pa ea =
- <<[:e' | qss:]>> (pa, (x_1, ..., x_n))
- (zipP ea <<[:(x_1, ..., x_n) | qs:]>>)
-where
- {x_1, ..., x_n} = DV (qs)</pre>
- </blockquote>
- <p>
- We assume the denotation of <code>crossP</code> to be given by
- <blockquote><pre>
-crossP :: [:a:] -> [:b:] -> [:(a, b):]
-crossP a1 a2 = let
- len1 = lengthP a1
- len2 = lengthP a2
- x1 = concatP $ mapP (replicateP len2) a1
- x2 = concatP $ replicateP len1 a2
- in
- zipP x1 x2</pre>
- </blockquote>
- <p>
- For a more efficient implementation of <code>crossP</code>, see
- <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/lib/std/PrelPArr.lhs"><code>PrelPArr</code></a>.
- <p>
- Moreover, the following optimisations are important:
- <ul>
- <li>In the <code>p &lt;- e</code> rule, if <code>pa == ()</code>, drop
- it and simplify the <code>crossP ea e</code> to <code>e</code>.
- <li>We assume that fusion will optimise sequences of array processing
- combinators.
- <li>FIXME: Do we want to have the following function?
- <blockquote><pre>
-mapFilterP :: (a -> Maybe b) -> [:a:] -> [:b:]</pre>
- </blockquote>
- <p>
- Even with fusion <code>(mapP (\p -&gt; e) . filterP (\p -&gt;
- b))</code> may still result in redundant pattern matching
- operations. (Let's wait with this until we have seen what the
- Simplifier does to the generated code.)
- </ul>
-
- <h2>Doing Away With Nested Arrays: The Flattening Transformation</h2>
- <p>
- On the quest towards an entirely unboxed representation of parallel
- arrays, the flattening transformation is the essential ingredient. GHC
- uses a <a
- href="http://www.cse.unsw.edu.au/~chak/papers/CK00.html">substantially
- improved version</a> of the transformation whose original form was
- described by Blelloch &amp; Sabot. The flattening transformation
- replaces values of type <code>[:a:]</code> as well as functions
- operating on these values by alternative, more efficient data structures
- and functions.
- <p>
- The flattening machinery is activated by the option
- <code>-fflatten</code>, which is a static hsc option. It consists of
- two steps: function vectorisation and array specialisation. Only the
- first of those is implemented so far. If selected, the transformation
- is applied to a module in Core form immediately after the <a
- href="../the-beast/desugar.html">desugarer,</a> before the <a
- href="../the-beast/simplifier.html">Mighty Simplifier</a> gets to do its
- job. After vectorisation, the Core program can be dumped using the
- option <code>-ddump-vect</code>. The is a good reason for us to perform
- flattening immediately after the desugarer. In <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/main/HscMain.lhs"><code>HscMain</code></a><code>.hscRecomp</code>
- the so-called <em>persistent compiler state</em> is maintained, which
- contains all the information about imported interface files needed to
- lookup the details of imported names (e.g., during renaming and type
- checking). However, all this information is zapped before the
- simplifier is invoked (supposedly to reduce the space-consumption of
- GHC). As flattening has to get at all kinds of identifiers from Prelude
- modules, we need to do it before the relevant information in the
- persistent compiler state is gone.
-
- <p>
- As flattening generally requires all libraries to be compiled for
- flattening (just like profiling does), there is a <em>compiler way</em>
- <code>"ndp"</code>, which can be selected using the way option code
- <code>-ndp</code>. This option will automagically select
- <code>-XParr</code> and <code>-fflatten</code>.
-
- <h4><code>FlattenMonad</code></h4>
- <p>
- The module <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/ndpFlatten/FlattenMonad.lhs"><code>FlattenMonad</code></a>
- implements the monad <code>Flatten</code> that is used during
- vectorisation to keep track of various sets of bound variables and a
- variable substitution map; moreover, it provides a supply of new uniques
- and allows us to look up names in the persistent compiler state (i.e.,
- imported identifiers).
- <p>
- In order to be able to look up imported identifiers in the persistent
- compiler state, it is important that these identifies are included into
- the free variable lists computed by the renamer. More precisely, all
- names needed by flattening are included in the names produced by
- <code>RnEnv.getImplicitModuleFVs</code>. To avoid putting
- flattening-dependent lists of names into the renamer, the module
- <code>FlattenInfo</code> exports <code>namesNeededForFlattening</code>.
-
- [FIXME: It might be worthwhile to document in the non-Flattening part of
- the Commentary that the persistent compiler state is zapped after
- desugaring and how the free variables determined by the renamer imply
- which names are imported.]
-
- <p><small>
-<!-- hhmts start -->
-Last modified: Tue Feb 12 01:44:21 EST 2002
-<!-- hhmts end -->
- </small>
- </body>
-</html>
diff --git a/docs/comm/exts/th.html b/docs/comm/exts/th.html
deleted file mode 100644
index 539245db74..0000000000
--- a/docs/comm/exts/th.html
+++ /dev/null
@@ -1,197 +0,0 @@
-<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
-<html>
- <head>
- <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
- <title>The GHC Commentary - Template Haskell</title>
- </head>
-
- <body BGCOLOR="FFFFFF">
- <h1>The GHC Commentary - Template Haskell</h1>
- <p>
- The Template Haskell (TH) extension to GHC adds a meta-programming
- facility in which all meta-level code is executed at compile time. The
- design of this extension is detailed in "Template Meta-programming for
- Haskell", Tim Sheard and Simon Peyton Jones, <a
- href="http://portal.acm.org/toc.cfm?id=581690&type=proceeding&coll=portal&dl=ACM&part=series&WantType=proceedings&idx=unknown&title=unknown">ACM
- SIGPLAN 2002 Haskell Workshop,</a> 2002. However, some of the details
- changed after the paper was published.
- </p>
-
- <h2>Meta Sugar</h2>
- <p>
- The extra syntax of TH (quasi-quote brackets, splices, and reification)
- is handled in the module <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/deSugar/DsMeta.hs"><code>DsMeta</code></a>.
- In particular, the function <code>dsBracket</code> desugars the four
- types of quasi-quote brackets (<code>[|...|]</code>,
- <code>[p|...|]</code>, <code>[d|...|]</code>, and <code>[t|...|]</code>)
- and <code>dsReify</code> desugars the three types of reification
- operations (<code>reifyType</code>, <code>reifyDecl</code>, and
- <code>reifyFixity</code>).
- </p>
-
- <h3>Desugaring of Quasi-Quote Brackets</h3>
- <p>
- A term in quasi-quote brackets needs to be translated into Core code
- that, when executed, yields a <em>representation</em> of that term in
- the form of the abstract syntax trees defined in <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/libraries/template-haskell/Language/Haskell/TH/Syntax.hs"><code>Language.Haskell.TH.Syntax</code></a>.
- Within <code>DsMeta</code>, this is achieved by four functions
- corresponding to the four types of quasi-quote brackets:
- <code>repE</code> (for <code>[|...|]</code>), <code>repP</code> (for
- <code>[p|...|]</code>), <code>repTy</code> (for <code>[t|...|]</code>),
- and <code>repTopDs</code> (for <code>[d|...|]</code>). All four of
- these functions receive as an argument the GHC-internal Haskell AST of
- the syntactic form that they quote (i.e., arguments of type <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/hsSyn/HsExpr.lhs"><code>HsExpr</code></a><code>.HsExpr
- Name</code>, <a href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/hsSyn/HsPat.lhs"><code>HsPat</code></a><code>.HsPat Name</code>,
- <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/hsSyn/HsTypes.lhs"><code>HsType</code></a><code>.HsType
- Name</code>, and <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/hsSyn/HsDecls.lhs"><code>HsDecls</code></a><code>.HsGroup
- Name</code>, respectively).
- </p>
- <p>
- To increase the static type safety in <code>DsMeta</code>, the functions
- constructing representations do not just return plain values of type <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/coreSyn/CoreSyn.lhs"><code>CoreSyn</code></a>
- <code>.CoreExpr</code>; instead, <code>DsMeta</code> introduces a
- parametrised type <code>Core</code> whose dummy type parameter indicates
- the source-level type of the value computed by the corresponding Core
- expression. All construction of Core fragments in <code>DsMeta</code>
- is performed by smart constructors whose type signatures use the dummy
- type parameter to constrain the contexts in which they are applicable.
- For example, a function that builds a Core expression that evaluates to
- a TH type representation, which has type
- <code>Language.Haskell.TH.Syntax.Type</code>, would return a value of
- type
- </p>
- <blockquote>
- <pre>
-Core Language.Haskell.TH.Syntax.Type</pre>
- </blockquote>
-
- <h3>Desugaring of Reification Operators</h3>
- <p>
- The TH paper introduces four reification operators:
- <code>reifyType</code>, <code>reifyDecl</code>,
- <code>reifyFixity</code>, and <code>reifyLocn</code>. Of these,
- currently (= 9 Nov 2002), only the former two are implemented.
- </p>
- <p>
- The operator <code>reifyType</code> receives the name of a function or
- data constructor as its argument and yields a representation of this
- entity's type in the form of a value of type
- <code>TH.Syntax.Type</code>. Similarly, <code>reifyDecl</code> receives
- the name of a type and yields a representation of the type's declaration
- as a value of type <code>TH.Syntax.Decl</code>. The name of the reified
- entity is mapped to the GHC-internal representation of the entity by
- using the function <code>lookupOcc</code> on the name.
- </p>
-
- <h3>Representing Binding Forms</h3>
- <p>
- Care needs to be taken when constructing TH representations of Haskell
- terms that include binding forms, such as lambda abstractions or let
- bindings. To avoid name clashes, fresh names need to be generated for
- all defined identifiers. This is achieved via the routine
- <code>DsMeta.mkGenSym</code>, which, given a <code>Name</code>, produces
- a <code>Name</code> / <code>Id</code> pair (of type
- <code>GenSymBind</code>) that associates the given <code>Name</code>
- with a Core identifier that at runtime will be bound to a string that
- contains the fresh name. Notice the two-level nature of this
- arrangement. It is necessary, as the Core code that constructs the
- Haskell term representation may be executed multiple types at runtime
- and it must be ensured that different names are generated in each run.
- </p>
- <p>
- Such fresh bindings need to be entered into the meta environment (of
- type <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/deSugar/DsMonad.lhs"><code>DsMonad</code></a><code>.DsMetaEnv</code>),
- which is part of the state (of type <code>DsMonad.DsEnv</code>)
- maintained in the desugarer monad (of type <code>DsMonad.DsM</code>).
- This is done using the function <code>DsMeta.addBinds</code>, which
- extends the current environment by a list of <code>GenSymBind</code>s
- and executes a subcomputation in this extended environment. Names can
- be looked up in the meta environment by way of the functions
- <code>DsMeta.lookupOcc</code> and <code>DsMeta.lookupBinder</code>; more
- details about the difference between these two functions can be found in
- the next subsection.
- </p>
- <p>
- NB: <code>DsMeta</code> uses <code>mkGenSym</code> only when
- representing terms that may be embedded into a context where names can
- be shadowed. For example, a lambda abstraction embedded into an
- expression can potentially shadow names defined in the context it is
- being embedded into. In contrast, this can never be the case for
- top-level declarations, such as data type declarations; hence, the type
- variables that a parametric data type declaration abstracts over are not
- being gensym'ed. As a result, variables in defining positions are
- handled differently depending on the syntactic construct in which they
- appear.
- </p>
-
- <h3>Binders Versus Occurrences</h3>
- <p>
- Name lookups in the meta environment of the desugarer use two functions
- with slightly different behaviour, namely <code>DsMeta.lookupOcc</code>
- and <code>lookupBinder</code>. The module <code>DsMeta</code> contains
- the following explanation as to the difference of these functions:
- </p>
- <blockquote>
- <pre>
-When we desugar [d| data T = MkT |]
-we want to get
- Data "T" [] [Con "MkT" []] []
-and *not*
- Data "Foo:T" [] [Con "Foo:MkT" []] []
-That is, the new data decl should fit into whatever new module it is
-asked to fit in. We do *not* clone, though; no need for this:
- Data "T79" ....
-
-But if we see this:
- data T = MkT
- foo = reifyDecl T
-
-then we must desugar to
- foo = Data "Foo:T" [] [Con "Foo:MkT" []] []
-
-So in repTopDs we bring the binders into scope with mkGenSyms and addBinds,
-but in dsReify we do not. And we use lookupOcc, rather than lookupBinder
-in repTyClD and repC.</pre>
- </blockquote>
- <p>
- This implies that <code>lookupOcc</code>, when it does not find the name
- in the meta environment, uses the function <code>DsMeta.globalVar</code>
- to construct the <em>original name</em> of the entity (cf. the TH paper
- for more details regarding original names). This name uniquely
- identifies the entity in the whole program and is in scope
- <em>independent</em> of whether the user name of the same entity is in
- scope or not (i.e., it may be defined in a different module without
- being explicitly imported) and has the form &lt;module&gt;:&lt;name&gt;.
- <strong>NB:</strong> Incidentally, the current implementation of this
- mechanisms facilitates breaking any abstraction barrier.
- </p>
-
- <h3>Known-key Names for Template Haskell</h3>
- <p>
- During the construction of representations, the desugarer needs to use a
- large number of functions defined in the library
- <code>Language.Haskell.TH.Syntax</code>. The names of these functions
- need to be made available to the compiler in the way outlined <a
- href="../the-beast/prelude.html">Primitives and the Prelude.</a>
- Unfortunately, any change to <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/prelude/PrelNames.lhs"><code>PrelNames</code></a>
- triggers a significant amount of recompilation. Hence, the names needed
- for TH are defined in <code>DsMeta</code> instead (at the end of the
- module). All library functions needed by TH are contained in the name
- set <code>DsMeta.templateHaskellNames</code>.
- </p>
-
- <p><small>
-<!-- hhmts start -->
-Last modified: Wed Nov 13 18:01:48 EST 2002
-<!-- hhmts end -->
- </small>
- </body>
-</html>
diff --git a/docs/comm/feedback.html b/docs/comm/feedback.html
deleted file mode 100644
index 1da8b10f29..0000000000
--- a/docs/comm/feedback.html
+++ /dev/null
@@ -1,34 +0,0 @@
-<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
-<html>
- <head>
- <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
- <title>The GHC Commentary - Feedback</title>
- </head>
-
- <body BGCOLOR="FFFFFF">
- <h1>Feedback</h1>
- <p>
- <a href="mailto:chak@cse.unsw.edu.au">I</a> welcome any feedback on the
- material and in particular would appreciated comments on which parts of
- the document are incomprehensible or miss explanation -- e.g., due to
- the use of GHC speak that is explained nowhere (words like infotable or
- so). Moreover, I would be interested to know which areas of GHC you
- would like to see covered here.
- <p>
- For the moment is probably best if feedback is directed to
- <p>
- <blockquote>
- <a
- href="mailto:chak@cse.unsw.edu.au"><code>chak@cse.unsw.edu.au</code></a>
- </blockquote>
- <p>
- However, if there is sufficient interest, we might consider setting up a
- mailing list.
-
- <p><small>
-<!-- hhmts start -->
-Last modified: Wed Aug 8 00:11:42 EST 2001
-<!-- hhmts end -->
- </small>
- </body>
-</html>
diff --git a/docs/comm/genesis/genesis.html b/docs/comm/genesis/genesis.html
deleted file mode 100644
index 2ccdf5353a..0000000000
--- a/docs/comm/genesis/genesis.html
+++ /dev/null
@@ -1,82 +0,0 @@
-<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
-<html>
- <head>
- <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
- <title>The GHC Commentary - Outline of the Genesis</title>
- </head>
-
- <body BGCOLOR="FFFFFF">
- <h1>The GHC Commentary - Outline of the Genesis</h1>
- <p>
- Building GHC happens in two stages: First you have to prepare the tree
- with <code>make boot</code>; and second, you build the compiler and
- associated libraries with <code>make all</code>. The <code>boot</code>
- stage builds some tools used during the main build process, generates
- parsers and other pre-computed source, and finally computes dependency
- information. There is considerable detail on the build process in GHC's
- <a
- href="http://ghc.haskell.org/trac/ghc/wiki/Building">Building Guide.</a>
-
- <h4>Debugging the Beast</h4>
- <p>
- If you are hacking the compiler or like to play with unstable
- development versions, chances are that the compiler someday just crashes
- on you. Then, it is a good idea to load the <code>core</code> into
- <code>gdb</code> as usual, but unfortunately there is usually not too
- much useful information.
- <p>
- The next step, then, is somewhat tedious. You should build a compiler
- producing programs with a runtime system that has debugging turned on
- and use that to build the crashing compiler. There are many sanity
- checks in the RTS, which may detect inconsistency before they lead to a
- crash and you may include more debugging information, which helps
- <code>gdb.</code> For a RTS with debugging turned on, add the following
- to <code>build.mk</code> (see also the comment in
- <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/mk/config.mk.in"><code>config.mk.in</code></a> that you find when searching for
- <code>GhcRtsHcOpts</code>):
-<blockquote><pre>
-GhcRtsHcOpts+=-optc-DDEBUG
-GhcRtsCcOpts+=-g
-EXTRA_LD_OPTS=-lbfd -liberty</pre></blockquote>
- <p>
- Then go into <code>fptools/ghc/rts</code> and <code>make clean boot &&
- make all</code>. With the resulting runtime system, you have to re-link
- the compiler. Go into <code>fptools/ghc/compiler</code>, delete the
- file <code>hsc</code> (up to version 4.08) or
- <code>ghc-&lt;version&gt;</code>, and execute <code>make all</code>.
- <p>
- The <code>EXTRA_LD_OPTS</code> are necessary as some of the debugging
- code uses the BFD library, which in turn requires <code>liberty</code>.
- I would also recommend (in 4.11 and from 5.0 upwards) adding these linker
- options to the files <code>package.conf</code> and
- <code>package.conf.inplace</code> in the directory
- <code>fptools/ghc/driver/</code> to the <code>extra_ld_opts</code> entry
- of the package <code>RTS</code>. Otherwise, you have to supply them
- whenever you compile and link a program with a compiler that uses the
- debugging RTS for the programs it produces.
- <p>
- To run GHC up to version 4.08 in <code>gdb</code>, first invoke the
- compiler as usual, but pass it the option <code>-v</code>. This will
- show you the exact invocation of the compiler proper <code>hsc</code>.
- Run <code>hsc</code> with these options in <code>gdb</code>. The
- development version 4.11 and stable releases from 5.0 on do no longer
- use the Perl driver; so, you can run them directly with gdb.
- <p>
- <strong>Debugging a compiler during building from HC files.</strong>
- If you are boot strapping the compiler on new platform from HC files and
- it crashes somewhere during the build (e.g., when compiling the
- libraries), do as explained above, but you may have to re-configure the
- build system with <code>--enable-hc-boot</code> before re-making the
- code in <code>fptools/ghc/driver/</code>.
- If you do this with a compiler up to version 4.08, run the build process
- with <code>make EXTRA_HC_OPTS=-v</code> to get the exact arguments with
- which you have to invoke <code>hsc</code> in <code>gdb</code>.
-
- <p><small>
-<!-- hhmts start -->
-Last modified: Sun Apr 24 22:16:30 CEST 2005
-<!-- hhmts end -->
- </small>
- </body>
-</html>
diff --git a/docs/comm/genesis/makefiles.html b/docs/comm/genesis/makefiles.html
deleted file mode 100644
index 7f01fb53ac..0000000000
--- a/docs/comm/genesis/makefiles.html
+++ /dev/null
@@ -1,51 +0,0 @@
-<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
-<html>
- <head>
- <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
- <title>The GHC Commentary - Mindboggling Makefiles</title>
- </head>
-
- <body BGCOLOR="FFFFFF">
- <h1>The GHC Commentary - Mindboggling Makefiles</h1>
- <p>
- The size and structure of GHC's makefiles makes it quite easy to scream
- out loud - in pain - during the process of tracking down problems in the
- make system or when attempting to alter it. GHC's <a
- href="http://ghc.haskell.org/trac/ghc/wiki/Building">Building
- Guide</a> has valuable information on <a
- href="http://ghc.haskell.org/trac/ghc/wiki/Building/BuildSystem">the
- makefile architecture.</a>
-
- <h4>A maze of twisty little passages, all alike</h4>
- <p>
- The <code>fptools/</code> toplevel and the various project directories
- contain not only a <code>Makefile</code> each, but there are
- subdirectories of name <code>mk/</code> at various levels that contain
- rules, targets, and so on specific to a project - or, in the case of the
- toplevel, the default rules for the whole system. Each <code>mk/</code>
- directory contains a file <code>boilerplate.mk</code> that ties the
- various other makefiles together. Files called <code>target.mk</code>,
- <code>paths.mk</code>, and <code>suffix.mk</code> contain make targets,
- definitions of variables containing paths, and suffix rules,
- respectively.
- <p>
- One particularly nasty trick used in this hierarchy of makefiles is the
- way in which the variable <code>$(TOP)</code> is used. AFAIK,
- <code>$(TOP)</code> always points to a directory containing an
- <code>mk/</code> subdirectory; however, it not necessarily points to the
- toplevel <code>fptools/</code> directory. For example, within the GHC
- subtree, <code>$(TOP)</code> points to <code>fptools/ghc/</code>.
- However, some of the makefiles in <code>fptools/ghc/mk/</code> will then
- <em>temporarily</em> redefine <code>$(TOP)</code> to point a level
- higher (i.e., to <code>fptools/</code>) while they are including the
- toplevel boilerplate. After that <code>$(TOP)</code> is redefined to
- whatever value it had before including makefiles from higher up in the
- hierarchy.
-
- <p><small>
-<!-- hhmts start -->
-Last modified: Wed Aug 22 16:46:33 GMT Daylight Time 2001
-<!-- hhmts end -->
- </small>
- </body>
-</html>
diff --git a/docs/comm/genesis/modules.html b/docs/comm/genesis/modules.html
deleted file mode 100644
index 10cd7a8490..0000000000
--- a/docs/comm/genesis/modules.html
+++ /dev/null
@@ -1,164 +0,0 @@
-<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
-<html>
- <head>
- <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
- <title>The GHC Commentary - The Marvellous Module Structure of GHC </title>
- </head>
-
- <body BGCOLOR="FFFFFF">
- <h1>The GHC Commentary - The Marvellous Module Structure of GHC </h1>
- <p>
-
-GHC is built out of about 245 Haskell modules. It can be quite tricky
-to figure out what the module dependency graph looks like. It can be
-important, too, because loops in the module dependency graph need to
-be broken carefully using <tt>.hi-boot</tt> interface files.
-<p>
-This section of the commentary documents the subtlest part of
-the module dependency graph, namely the part near the bottom.
-<ul>
-<li> The list is given in compilation order: that is,
-module near the top are more primitive, and are compiled earlier.
-<li> Each module is listed together with its most critical
-dependencies in parentheses; that is, the dependencies that prevent it being
-compiled earlier.
-<li> Modules in the same bullet don't depend on each other.
-<li> Loops are documented by a dependency such as "<tt>loop Type.Type</tt>".
-This means tha the module imports <tt>Type.Type</tt>, but module <tt>Type</tt>
-has not yet been compiled, so the import comes from <tt>Type.hi-boot</tt>.
-</ul>
-
-Compilation order is as follows:
-<ul>
-<li>
-<strong>First comes a layer of modules that have few interdependencies,
-and which implement very basic data types</strong>:
-<tt> <ul>
-<li> Util
-<li> OccName
-<li> Pretty
-<li> Outputable
-<li> StringBuffer
-<li> ListSetOps
-<li> Maybes
-<li> etc
-</ul> </tt>
-
-<p>
-<li> <strong> Now comes the main subtle layer, involving types, classes, type constructors
-identifiers, expressions, rules, and their operations.</strong>
-
-<tt>
-<ul>
-<li> Name <br> PrimRep
-<p><li>
- PrelNames
-<p><li>
- Var (Name, loop IdInfo.IdInfo,
- loop Type.Type, loop Type.Kind)
-<p><li>
- VarEnv, VarSet, ThinAir
-<p><li>
- Class (loop TyCon.TyCon, loop Type.Type)
-<p><li>
- TyCon (loop Type.Type, loop Type.Kind, loop DataCon.DataCon, loop Generics.GenInfo)
-<p><li>
- TypeRep (loop DataCon.DataCon, loop Subst.substTyWith)
-<p><li>
- Type (loop PprType.pprType, loop Subst.substTyWith)
-<p><li>
- FieldLabel(Type) <br>
- TysPrim(Type) <br>
-<p><li>
- Literal (TysPrim, PprType) <br>
- DataCon (loop PprType, loop Subst.substTyWith, FieldLabel.FieldLabel)
-<p><li>
- TysWiredIn (loop MkId.mkDataConIds)
-<p><li>
- TcType( lots of TysWiredIn stuff)
-<p><li>
- PprType( lots of TcType stuff )
-<p><li>
- PrimOp (PprType, TysWiredIn)
-<p><li>
- CoreSyn [does not import Id]
-<p><li>
- IdInfo (CoreSyn.Unfolding, CoreSyn.CoreRules)
-<p><li>
- Id (lots from IdInfo)
-<p><li>
- CoreFVs <br>
- PprCore
-<p><li>
- CoreUtils (PprCore.pprCoreExpr, CoreFVs.exprFreeVars,
- CoreSyn.isEvaldUnfolding CoreSyn.maybeUnfoldingTemplate)
-<p><li>
- CoreLint( CoreUtils ) <br>
- OccurAnal (CoreUtils.exprIsTrivial) <br>
- CoreTidy (CoreUtils.exprArity ) <br>
-<p><li>
- CoreUnfold (OccurAnal.occurAnalyseGlobalExpr)
-<p><li>
- Subst (CoreUnfold.Unfolding, CoreFVs) <br>
- Generics (CoreUnfold.mkTopUnfolding) <br>
- Rules (CoreUnfold.Unfolding, PprCore.pprTidyIdRules)
-<p><li>
- MkId (CoreUnfold.mkUnfolding, Subst, Rules.addRule)
-<p><li>
- PrelInfo (MkId) <br>
- HscTypes( Rules.RuleBase )
-</ul></tt>
-
-<p><li> <strong>That is the end of the infrastructure. Now we get the
- main layer of modules that perform useful work.</strong>
-
-<tt><ul>
-<p><li>
- CoreTidy (HscTypes.PersistentCompilerState)
-</ul></tt>
-</ul>
-
-HsSyn stuff
-<ul>
-<li> HsPat.hs-boot
-<li> HsExpr.hs-boot (loop HsPat.LPat)
-<li> HsTypes (loop HsExpr.HsSplice)
-<li> HsBinds (HsTypes.LHsType, loop HsPat.LPat, HsExpr.pprFunBind and others)
- HsLit (HsTypes.SyntaxName)
-<li> HsPat (HsBinds, HsLit)
- HsDecls (HsBinds)
-<li> HsExpr (HsDecls, HsPat)
-</ul>
-
-
-
-<h2>Library stuff: base package</h2>
-
-<ul>
-<li> GHC.Base
-<li> Data.Tuple (GHC.Base), GHC.Ptr (GHC.Base)
-<li> GHC.Enum (Data.Tuple)
-<li> GHC.Show (GHC.Enum)
-<li> GHC.Num (GHC.Show)
-<li> GHC.ST (GHC.Num), GHC.Real (GHC.Num)
-<li> GHC.Arr (GHC.ST) GHC.STRef (GHC.ST)
-<li> GHC.IOBase (GHC.Arr)
-<li> Data.Bits (GHC.Real)
-<li> Data.HashTable (Data.Bits, Control.Monad)
-<li> Data.Typeable (GHC.IOBase, Data.HashTable)
-<li> GHC.Weak (Data.Typeable, GHC.IOBase)
-</ul>
-
-
- <p><small>
-<!-- hhmts start -->
-Last modified: Wed Aug 22 16:46:33 GMT Daylight Time 2001
-<!-- hhmts end -->
- </small>
- </body>
-</html>
-
-
-
-
-
diff --git a/docs/comm/index.html b/docs/comm/index.html
deleted file mode 100644
index 64b9d81ff1..0000000000
--- a/docs/comm/index.html
+++ /dev/null
@@ -1,121 +0,0 @@
-<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
-<html>
- <head>
- <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
- <title>The GHC Commentary - The Beast Explained</title>
- </head>
-
- <body BGCOLOR="FFFFFF">
- <h1>The Glasgow Haskell Compiler (GHC) Commentary [v0.17]</h1>
- <p>
- <!-- Contributors: Whoever makes substantial additions or changes to the
- document, please add your name and keep the order alphabetic. Moreover,
- please bump the version number for any substantial modification that you
- check into CVS.
- -->
- <strong>Manuel M. T. Chakravarty</strong><br>
- <strong>Sigbjorn Finne</strong><br>
- <strong>Simon Marlow</strong><br>
- <strong>Simon Peyton Jones</strong><br>
- <strong>Julian Seward</strong><br>
- <strong>Reuben Thomas</strong><br>
- &nbsp;<br>
- <p>
- This document started as a collection of notes describing what <a
- href="mailto:chak@cse.unsw.edu.au">I</a> learnt when poking around in
- the <a href="http://haskell.org/ghc/">GHC</a> sources. During the
- <i>Haskell Implementers Workshop</i> in January 2001, it was decided to
- put the commentary into
- <a href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/">GHC's CVS
- repository</a>
- to allow the whole developer community to add their wizardly insight to
- the document.
- <p>
- <strong>The document is still far from being complete - help it
- grow!</strong>
-
- <h2>Before the Show Begins</h2>
- <p>
- <ul>
- <li><a href="feedback.html">Feedback</a>
- <li><a href="others.html">Other Sources of Wisdom</a>
- </ul>
-
- <h2>Genesis</h2>
- <p>
- <ul>
- <li><a href="genesis/genesis.html">Outline of the Genesis</a>
- <li><a href="genesis/makefiles.html">Mindboggling Makefiles</a>
- <li><a href="genesis/modules.html">GHC's Marvellous Module Structure</a>
- </ul>
-
- <h2>The Beast Dissected</h2>
- <p>
- <ul>
- <li><a href="the-beast/coding-style.html">Coding style used in
- the compiler</a>
- <li><a href="the-beast/driver.html">The Glorious Driver</a>
- <li><a href="the-beast/prelude.html">Primitives and the Prelude</a>
- <li><a href="the-beast/syntax.html">Just Syntax</a>
- <li><a href="the-beast/basicTypes.html">The Basics</a>
- <li><a href="the-beast/modules.html">Modules, ModuleNames and
- Packages</a>
- <li><a href="the-beast/names.html">The truth about names: Names and OccNames</a>
- <li><a href="the-beast/vars.html">The Real Story about Variables, Ids,
- TyVars, and the like</a>
- <li><a href="the-beast/data-types.html">Data types and constructors</a>
- <li><a href="the-beast/renamer.html">The Glorious Renamer</a>
- <li><a href="the-beast/types.html">Hybrid Types</a>
- <li><a href="the-beast/typecheck.html">Checking Types</a>
- <li><a href="the-beast/desugar.html">Sugar Free: From Haskell To Core</a>
- <li><a href="the-beast/simplifier.html">The Mighty Simplifier</a>
- <li><a href="the-beast/mangler.html">The Evil Mangler</a>
- <li><a href="the-beast/alien.html">Alien Functions</a>
- <li><a href="the-beast/stg.html">You Got Control: The STG-language</a>
- <li><a href="the-beast/ncg.html">The Native Code Generator</a>
- <li><a href="the-beast/ghci.html">GHCi</a>
- <li><a href="the-beast/fexport.html">Implementation of
- <code>foreign export</code></a>
- <li><a href="the-beast/main.html">Compiling and running the Main module</code></a>
- </ul>
-
- <h2>RTS &amp; Libraries</h2>
- <p>
- <ul>
- <li><a href="http://ghc.haskell.org/trac/ghc/wiki/Commentary/Rts/Conventions">Coding Style Guidelines</a>
- <li><a href="rts-libs/stgc.html">Spineless Tagless C</a>
- <li><a href="rts-libs/primitives.html">Primitives</a>
- <li><a href="rts-libs/prelfound.html">Prelude Foundations</a>
- <li><a href="rts-libs/prelude.html">Cunning Prelude Code</a>
- <li><a href="rts-libs/foreignptr.html">On why we have <tt>ForeignPtr</tt></a>
- <li><a href="rts-libs/non-blocking.html">Non-blocking I/O for Win32</a>
- <li><a href="rts-libs/multi-thread.html">Supporting multi-threaded interoperation</a>
- </ul>
-
- <h2>Extensions, or Making a Complicated System More Complicated</h2>
- <p>
- <ul>
- <li><a href="exts/th.html">Template Haskell</a>
- <li><a href="exts/ndp.html">Parallel Arrays</a>
- </ul>
-
- <h2>The Source</h2>
- <p>
- The online master copy of the Commentary is at
- <blockquote>
- <a href="http://www.cse.unsw.edu.au/~chak/haskell/ghc/comm/">http://www.cse.unsw.edu.au/~chak/haskell/ghc/comm/</a>
- </blockquote>
- <p>
- This online version is updated
- <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/docs/comm/">from
- CVS</a>
- daily.
-
- <p><small>
-<!-- hhmts start -->
-Last modified: Thu May 12 19:03:42 EST 2005
-<!-- hhmts end -->
- </small>
- </body>
-</html>
diff --git a/docs/comm/others.html b/docs/comm/others.html
deleted file mode 100644
index 52d87e9419..0000000000
--- a/docs/comm/others.html
+++ /dev/null
@@ -1,60 +0,0 @@
-<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
-<html>
- <head>
- <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
- <title>The GHC Commentary - Other Sources of Wisdom</title>
- </head>
-
- <body BGCOLOR="FFFFFF">
- <h1>Other Sources of Wisdom</h1>
- <p>
- Believe it or not, but there are other people besides you who are
- masochistic enough to study the innards of the beast. Some of the have
- been kind (or cruel?) enough to share their insights with us. Here is a
- probably incomplete list:
- <p>
- <ul>
-
- <li>The <a
- href="http://www.cee.hw.ac.uk/~dsg/gph/docs/StgSurvival.ps.gz">STG
- Survival Sheet</a> has -- according to its header -- been written by
- `a poor wee soul',<sup><a href="#footnote1">1</a></sup> which
- probably has been pushed into the torments of madness by the very
- act of contemplating the inner workings of the STG runtime system.
- This document discusses GHC's runtime system with a focus on
- support for parallel processing (aka GUM).
-
- <li>Instructions on <a
- href="http://www-users.cs.york.ac.uk/~olaf/PUBLICATIONS/extendGHC.html">Adding
- an Optimisation Pass to the Glasgow Haskell Compiler</a>
- have been compiled by <a
- href="http://www-users.cs.york.ac.uk/~olaf/">Olaf Chitil</a>.
- Unfortunately, this document is already a little aged.
-
- <li><a href="http://www.cs.pdx.edu/~apt/">Andrew Tolmach</a> has defined
- <a href="http://www.haskell.org/ghc/docs/papers/core.ps.gz">an external
- representation of
- GHC's <em>Core</em> language</a> and also implemented a GHC pass
- that emits the intermediate form into <code>.hcr</code> files. The
- option <code>-fext-core</code> triggers GHC to emit Core code after
- optimisation; in addition, <code>-fno-code</code> is often used to
- stop compilation after Core has been emitted.
-
- <!-- Add references to other background texts listed on the GHC docu
- page
- -->
-
- </ul>
-
- <p><hr><p>
- <sup><a name="footnote1">1</a></sup>Usually reliable sources have it that
- the poor soul in question is no one less than GUM hardcore hacker <a
- href="http://www.cee.hw.ac.uk/~hwloidl/">Hans-Wolfgang Loidl</a>.
-
- <p><small>
-<!-- hhmts start -->
-Last modified: Tue Nov 13 10:56:57 EST 2001
-<!-- hhmts end -->
- </small>
- </body>
-</html>
diff --git a/docs/comm/rts-libs/foreignptr.html b/docs/comm/rts-libs/foreignptr.html
deleted file mode 100644
index febe9fe422..0000000000
--- a/docs/comm/rts-libs/foreignptr.html
+++ /dev/null
@@ -1,68 +0,0 @@
-<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
-<html>
- <head>
- <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
- <title>The GHC Commentary - why we have <tt>ForeignPtr</tt></title>
- </head>
-
- <body BGCOLOR="FFFFFF">
-
- <h1>On why we have <tt>ForeignPtr</tt></h1>
-
- <p>Unfortunately it isn't possible to add a finalizer to a normal
- <tt>Ptr a</tt>. We already have a generic finalization mechanism:
- see the Weak module in package lang. But the only reliable way to
- use finalizers is to attach one to an atomic heap object - that
- way the compiler's optimiser can't interfere with the lifetime of
- the object.
-
- <p>The <tt>Ptr</tt> type is really just a boxed address - it's
- defined like
-
- <pre>
-data Ptr a = Ptr Addr#
-</pre>
-
- <p>where <tt>Addr#</tt> is an unboxed native address (just a 32-
- or 64- bit word). Putting a finalizer on a <tt>Ptr</tt> is
- dangerous, because the compiler's optimiser might remove the box
- altogether.
-
- <p><tt>ForeignPtr</tt> is defined like this
-
- <pre>
-data ForeignPtr a = ForeignPtr ForeignObj#
-</pre>
-
- <p>where <tt>ForeignObj#</tt> is a *boxed* address, it corresponds
- to a real heap object. The heap object is primitive from the
- point of view of the compiler - it can't be optimised away. So it
- works to attach a finalizer to the <tt>ForeignObj#</tt> (but not
- to the <tt>ForeignPtr</tt>!).
-
- <p>There are several primitive objects to which we can attach
- finalizers: <tt>MVar#</tt>, <tt>MutVar#</tt>, <tt>ByteArray#</tt>,
- etc. We have special functions for some of these: eg.
- <tt>MVar.addMVarFinalizer</tt>.
-
- <p>So a nicer interface might be something like
-
-<pre>
-class Finalizable a where
- addFinalizer :: a -> IO () -> IO ()
-
-instance Finalizable (ForeignPtr a) where ...
-instance Finalizable (MVar a) where ...
-</pre>
-
- <p>So you might ask why we don't just get rid of <tt>Ptr</tt> and
- rename <tt>ForeignPtr</tt> to <tt>Ptr</tt>. The reason for that
- is just efficiency, I think.
-
- <p><small>
-<!-- hhmts start -->
-Last modified: Wed Sep 26 09:49:37 BST 2001
-<!-- hhmts end -->
- </small>
- </body>
-</html>
diff --git a/docs/comm/rts-libs/multi-thread.html b/docs/comm/rts-libs/multi-thread.html
deleted file mode 100644
index 67a544be85..0000000000
--- a/docs/comm/rts-libs/multi-thread.html
+++ /dev/null
@@ -1,445 +0,0 @@
-<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
-<html>
-<head>
- <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
-<title>The GHC Commentary - Supporting multi-threaded interoperation</title>
-</head>
-<body>
-<h1>The GHC Commentary - Supporting multi-threaded interoperation</h1>
-<em>
-<p>
-Authors: sof@galois.com, simonmar@microsoft.com<br>
-Date: April 2002
-</p>
-</em>
-<p>
-This document presents the implementation of an extension to
-Concurrent Haskell that provides two enhancements:
-</p>
-<ul>
-<li>A Concurrent Haskell thread may call an external (e.g., C)
-function in a manner that's transparent to the execution/evaluation of
-other Haskell threads. Section <a href="#callout">Calling out"</a> covers this.
-</li>
-<li>
-OS threads may safely call Haskell functions concurrently. Section
-<a href="#callin">"Calling in"</a> covers this.
-</li>
-</ul>
-
-<!---- *************************************** ----->
-<h2 id="callout">The problem: foreign calls that block</h2>
-<p>
-When a Concurrent Haskell(CH) thread calls a 'foreign import'ed
-function, the runtime system(RTS) has to handle this in a manner
-transparent to other CH threads. That is, they shouldn't be blocked
-from making progress while the CH thread executes the external
-call. Presently, all threads will block.
-</p>
-<p>
-Clearly, we have to rely on OS-level threads in order to support this
-kind of concurrency. The implementation described here defines the
-(abstract) OS threads interface that the RTS assumes. The implementation
-currently provides two instances of this interface, one for POSIX
-threads (pthreads) and one for the Win32 threads.
-</p>
-
-<!---- *************************************** ----->
-<h3>Multi-threading the RTS</h3>
-
-<p>
-A simple and efficient way to implement non-blocking foreign calls is like this:
-<ul>
-<li> Invariant: only one OS thread is allowed to
-execute code inside of the GHC runtime system. [There are alternate
-designs, but I won't go into details on their pros and cons here.]
-We'll call the OS thread that is currently running Haskell threads
-the <em>Current Haskell Worker Thread</em>.
-<p>
-The Current Haskell Worker Thread repeatedly grabs a Haskell thread, executes it until its
-time-slice expires or it blocks on an MVar, then grabs another, and executes
-that, and so on.
-</p>
-<li>
-<p>
-When the Current Haskell Worker comes to execute a potentially blocking 'foreign
-import', it leaves the RTS and ceases being the Current Haskell Worker, but before doing so it makes certain that
-another OS worker thread is available to become the Current Haskell Worker.
-Consequently, even if the external call blocks, the new Current Haskell Worker
-continues execution of the other Concurrent Haskell threads.
-When the external call eventually completes, the Concurrent Haskell
-thread that made the call is passed the result and made runnable
-again.
-</p>
-<p>
-<li>
-A pool of OS threads are constantly trying to become the Current Haskell Worker.
-Only one succeeds at any moment. If the pool becomes empty, the RTS creates more workers.
-<p><li>
-The OS worker threads are regarded as interchangeable. A given Haskell thread
-may, during its lifetime, be executed entirely by one OS worker thread, or by more than one.
-There's just no way to tell.
-
-<p><li>If a foreign program wants to call a Haskell function, there is always a thread switch involved.
-The foreign program uses thread-safe mechanisms to create a Haskell thread and make it runnable; and
-the current Haskell Worker Thread exectutes it. See Section <a href="#callin">Calling in</a>.
-</ul>
-<p>
-The rest of this section describes the mechanics of implementing all
-this. There's two parts to it, one that describes how a native (OS) thread
-leaves the RTS to service the external call, the other how the same
-thread handles returning the result of the external call back to the
-Haskell thread.
-</p>
-
-<!---- *************************************** ----->
-<h3>Making the external call</h3>
-
-<p>
-Presently, GHC handles 'safe' C calls by effectively emitting the
-following code sequence:
-</p>
-
-<pre>
- ...save thread state...
- t = suspendThread();
- r = foo(arg1,...,argn);
- resumeThread(t);
- ...restore thread state...
- return r;
-</pre>
-
-<p>
-After having squirreled away the state of a Haskell thread,
-<tt>Schedule.c:suspendThread()</tt> is called which puts the current
-thread on a list [<tt>Schedule.c:suspended_ccalling_threads</tt>]
-containing threads that are currently blocked waiting for external calls
-to complete (this is done for the purposes of finding roots when
-garbage collecting).
-</p>
-
-<p>
-In addition to putting the Haskell thread on
-<tt>suspended_ccalling_threads</tt>, <tt>suspendThread()</tt> now also
-does the following:
-</p>
-<ul>
-<li>Instructs the <em>Task Manager</em> to make sure that there's a
-another native thread waiting in the wings to take over the execution
-of Haskell threads. This might entail creating a new
-<em>worker thread</em> or re-using one that's currently waiting for
-more work to do. The <a href="#taskman">Task Manager</a> section
-presents the functionality provided by this subsystem.
-</li>
-
-<li>Releases its capability to execute within the RTS. By doing
-so, another worker thread will become unblocked and start executing
-code within the RTS. See the <a href="#capability">Capability</a>
-section for details.
-</li>
-
-<li><tt>suspendThread()</tt> returns a token which is used to
-identify the Haskell thread that was added to
-<tt>suspended_ccalling_threads</tt>. This is done so that once the
-external call has completed, we know what Haskell thread to pull off
-the <tt>suspended_ccalling_threads</tt> list.
-</li>
-</ul>
-
-<p>
-Upon return from <tt>suspendThread()</tt>, the OS thread is free of
-its RTS executing responsibility, and can now invoke the external
-call. Meanwhile, the other worker thread that have now gained access
-to the RTS will continue executing Concurrent Haskell code. Concurrent
-'stuff' is happening!
-</p>
-
-<!---- *************************************** ----->
-<h3>Returning the external result</h3>
-
-<p>
-When the native thread eventually returns from the external call,
-the result needs to be communicated back to the Haskell thread that
-issued the external call. The following steps takes care of this:
-</p>
-
-<ul>
-<li>The returning OS thread calls <tt>Schedule.c:resumeThread()</tt>,
-passing along the token referring to the Haskell thread that made the
-call we're returning from.
-</li>
-
-<li>
-The OS thread then tries to grab hold of a <em>returning worker
-capability</em>, via <tt>Capability.c:grabReturnCapability()</tt>.
-Until granted, the thread blocks waiting for RTS permissions. Clearly we
-don't want the thread to be blocked longer than it has to, so whenever
-a thread that is executing within the RTS enters the Scheduler (which
-is quite often, e.g., when a Haskell thread context switch is made),
-it checks to see whether it can give up its RTS capability to a
-returning worker, which is done by calling
-<tt>Capability.c:yieldToReturningWorker()</tt>.
-</li>
-
-<li>
-If a returning worker is waiting (the code in <tt>Capability.c</tt>
-keeps a counter of the number of returning workers that are currently
-blocked waiting), it is woken up and the given the RTS execution
-priviledges/capabilities of the worker thread that gave up its.
-</li>
-
-<li>
-The thread that gave up its capability then tries to re-acquire
-the capability to execute RTS code; this is done by calling
-<tt>Capability.c:waitForWorkCapability()</tt>.
-</li>
-
-<li>
-The returning worker that was woken up will continue execution in
-<tt>resumeThread()</tt>, removing its associated Haskell thread
-from the <tt>suspended_ccalling_threads</tt> list and start evaluating
-that thread, passing it the result of the external call.
-</li>
-</ul>
-
-<!---- *************************************** ----->
-<h3 id="rts-exec">RTS execution</h3>
-
-<p>
-If a worker thread inside the RTS runs out of runnable Haskell
-threads, it goes to sleep waiting for the external calls to complete.
-It does this by calling <tt>waitForWorkCapability</tt>
-</p>
-
-<p>
-The availability of new runnable Haskell threads is signalled when:
-</p>
-
-<ul>
-<li>When an external call is set up in <tt>suspendThread()</tt>.</li>
-<li>When a new Haskell thread is created (e.g., whenever
-<tt>Concurrent.forkIO</tt> is called from within Haskell); this is
-signalled in <tt>Schedule.c:scheduleThread_()</tt>.
-</li>
-<li>Whenever a Haskell thread is removed from a 'blocking queue'
-attached to an MVar (only?).
-</li>
-</ul>
-
-<!---- *************************************** ----->
-<h2 id="callin">Calling in</h2>
-
-Providing robust support for having multiple OS threads calling into
-Haskell is not as involved as its dual.
-
-<ul>
-<li>The OS thread issues the call to a Haskell function by going via
-the <em>Rts API</em> (as specificed in <tt>RtsAPI.h</tt>).
-<li>Making the function application requires the construction of a
-closure on the heap. This is done in a thread-safe manner by having
-the OS thread lock a designated block of memory (the 'Rts API' block,
-which is part of the GC's root set) for the short period of time it
-takes to construct the application.
-<li>The OS thread then creates a new Haskell thread to execute the
-function application, which (eventually) boils down to calling
-<tt>Schedule.c:createThread()</tt>
-<li>
-Evaluation is kicked off by calling <tt>Schedule.c:scheduleExtThread()</tt>,
-which asks the Task Manager to possibly create a new worker (OS)
-thread to execute the Haskell thread.
-<li>
-After the OS thread has done this, it blocks waiting for the
-Haskell thread to complete the evaluation of the Haskell function.
-<p>
-The reason why a separate worker thread is made to evaluate the Haskell
-function and not the OS thread that made the call-in via the
-Rts API, is that we want that OS thread to return as soon as possible.
-We wouldn't be able to guarantee that if the OS thread entered the
-RTS to (initially) just execute its function application, as the
-Scheduler may side-track it and also ask it to evaluate other Haskell threads.
-</li>
-</ul>
-
-<p>
-<strong>Note:</strong> As of 20020413, the implementation of the RTS API
-only serializes access to the allocator between multiple OS threads wanting
-to call into Haskell (via the RTS API.) It does not coordinate this access
-to the allocator with that of the OS worker thread that's currently executing
-within the RTS. This weakness/bug is scheduled to be tackled as part of an
-overhaul/reworking of the RTS API itself.
-
-
-<!---- *************************************** ----->
-<h2>Subsystems introduced/modified</h2>
-
-<p>
-These threads extensions affect the Scheduler portions of the runtime
-system. To make it more manageable to work with, the changes
-introduced a couple of new RTS 'sub-systems'. This section presents
-the functionality and API of these sub-systems.
-</p>
-
-<!---- *************************************** ----->
-<h3 id="#capability">Capabilities</h3>
-
-<p>
-A Capability represent the token required to execute STG code,
-and all the state an OS thread/task needs to run Haskell code:
-its STG registers, a pointer to its TSO, a nursery etc. During
-STG execution, a pointer to the capabilitity is kept in a
-register (BaseReg).
-</p>
-<p>
-Only in an SMP build will there be multiple capabilities, for
-the threaded RTS and other non-threaded builds, there is only
-one global capability, namely <tt>MainCapability</tt>.
-
-<p>
-The Capability API is as follows:
-<pre>
-/* Capability.h */
-extern void initCapabilities(void);
-
-extern void grabReturnCapability(Mutex* pMutex, Capability** pCap);
-extern void waitForWorkCapability(Mutex* pMutex, Capability** pCap, rtsBool runnable);
-extern void releaseCapability(Capability* cap);
-
-extern void yieldToReturningWorker(Mutex* pMutex, Capability* cap);
-
-extern void grabCapability(Capability** cap);
-</pre>
-
-<ul>
-<li><tt>initCapabilities()</tt> initialises the subsystem.
-
-<li><tt>grabReturnCapability()</tt> is called by worker threads
-returning from an external call. It blocks them waiting to gain
-permissions to do so.
-
-<li><tt>waitForWorkCapability()</tt> is called by worker threads
-already inside the RTS, but without any work to do. It blocks them
-waiting for there to new work to become available.
-
-<li><tt>releaseCapability()</tt> hands back a capability. If a
-'returning worker' is waiting, it is signalled that a capability
-has become available. If not, <tt>releaseCapability()</tt> tries
-to signal worker threads that are blocked waiting inside
-<tt>waitForWorkCapability()</tt> that new work might now be
-available.
-
-<li><tt>yieldToReturningWorker()</tt> is called by the worker thread
-that's currently inside the Scheduler. It checks whether there are other
-worker threads waiting to return from making an external call. If so,
-they're given preference and a capability is transferred between worker
-threads. One of the waiting 'returning worker' threads is signalled and made
-runnable, with the other, yielding, worker blocking to re-acquire
-a capability.
-</ul>
-
-<p>
-The condition variables used to implement the synchronisation between
-worker consumers and providers are local to the Capability
-implementation. See source for details and comments.
-</p>
-
-<!---- *************************************** ----->
-<h3 id="taskman">The Task Manager</h3>
-
-<p>
-The Task Manager API is responsible for managing the creation of
-OS worker RTS threads. When a Haskell thread wants to make an
-external call, the Task Manager is asked to possibly create a
-new worker thread to take over the RTS-executing capability of
-the worker thread that's exiting the RTS to execute the external call.
-
-<p>
-The Capability subsystem keeps track of idle worker threads, so
-making an informed decision about whether or not to create a new OS
-worker thread is easy work for the task manager. The Task manager
-provides the following API:
-</p>
-
-<pre>
-/* Task.h */
-extern void startTaskManager ( nat maxTasks, void (*taskStart)(void) );
-extern void stopTaskManager ( void );
-
-extern void startTask ( void (*taskStart)(void) );
-</pre>
-
-<ul>
-<li><tt>startTaskManager()</tt> and <tt>stopTaskManager()</tt> starts
-up and shuts down the subsystem. When starting up, you have the option
-to limit the overall number of worker threads that can be
-created. An unbounded (modulo OS thread constraints) number of threads
-is created if you pass '0'.
-<li><tt>startTask()</tt> is called when a worker thread calls
-<tt>suspendThread()</tt> to service an external call, asking another
-worker thread to take over its RTS-executing capability. It is also
-called when an external OS thread invokes a Haskell function via the
-<em>Rts API</em>.
-</ul>
-
-<!---- *************************************** ----->
-<h3>Native threads API</h3>
-
-To hide OS details, the following API is used by the task manager and
-the scheduler to interact with an OS' threads API:
-
-<pre>
-/* OSThreads.h */
-typedef <em>..OS specific..</em> Mutex;
-extern void initMutex ( Mutex* pMut );
-extern void grabMutex ( Mutex* pMut );
-extern void releaseMutex ( Mutex* pMut );
-
-typedef <em>..OS specific..</em> Condition;
-extern void initCondition ( Condition* pCond );
-extern void closeCondition ( Condition* pCond );
-extern rtsBool broadcastCondition ( Condition* pCond );
-extern rtsBool signalCondition ( Condition* pCond );
-extern rtsBool waitCondition ( Condition* pCond,
- Mutex* pMut );
-
-extern OSThreadId osThreadId ( void );
-extern void shutdownThread ( void );
-extern void yieldThread ( void );
-extern int createOSThread ( OSThreadId* tid,
- void (*startProc)(void) );
-</pre>
-
-
-
-<!---- *************************************** ----->
-<h2>User-level interface</h2>
-
-To signal that you want an external call to be serviced by a separate
-OS thread, you have to add the attribute <tt>threadsafe</tt> to
-a foreign import declaration, i.e.,
-
-<pre>
-foreign import "bigComp" threadsafe largeComputation :: Int -> IO ()
-</pre>
-
-<p>
-The distinction between 'safe' and thread-safe C calls is made
-so that we may call external functions that aren't re-entrant but may
-cause a GC to occur.
-<p>
-The <tt>threadsafe</tt> attribute subsumes <tt>safe</tt>.
-</p>
-
-<!---- *************************************** ----->
-<h2>Building the GHC RTS</h2>
-
-The multi-threaded extension isn't currently enabled by default. To
-have it built, you need to run the <tt>fptools</tt> configure script
-with the extra option <tt>--enable-threaded-rts</tt> turned on, and
-then proceed to build the compiler as per normal.
-
-<hr>
-<small>
-<!-- hhmts start --> Last modified: Wed Apr 10 14:21:57 Pacific Daylight Time 2002 <!-- hhmts end -->
-</small>
-</body> </html>
-
diff --git a/docs/comm/rts-libs/non-blocking.html b/docs/comm/rts-libs/non-blocking.html
deleted file mode 100644
index 627bde8d88..0000000000
--- a/docs/comm/rts-libs/non-blocking.html
+++ /dev/null
@@ -1,133 +0,0 @@
-<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
-<html>
- <head>
- <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
- <title>The GHC Commentary - Non-blocking I/O on Win32</title>
- </head>
-
- <body BGCOLOR="FFFFFF">
- <h1>The GHC Commentary - Non-blocking I/O on Win32</h1>
- <p>
-
-This note discusses the implementation of non-blocking I/O on
-Win32 platforms. It is not implemented yet (Apr 2002), but it seems worth
-capturing the ideas. Thanks to Sigbjorn for writing them.
-
-<h2> Background</h2>
-
-GHC has provided non-blocking I/O support for Concurrent Haskell
-threads on platforms that provide 'UNIX-style' non-blocking I/O for
-quite a while. That is, platforms that let you alter the property of a
-file descriptor to instead of having a thread block performing an I/O
-operation that cannot be immediately satisfied, the operation returns
-back a special error code (EWOULDBLOCK.) When that happens, the CH
-thread that made the blocking I/O request is put into a blocked-on-IO
-state (see Foreign.C.Error.throwErrnoIfRetryMayBlock). The RTS will
-in a timely fashion check to see whether I/O is again possible
-(via a call to select()), and if it is, unblock the thread & have it
-re-try the I/O operation. The result is that other Concurrent Haskell
-threads won't be affected, but can continue operating while a thread
-is blocked on I/O.
-<p>
-Non-blocking I/O hasn't been supported by GHC on Win32 platforms, for
-the simple reason that it doesn't provide the OS facilities described
-above.
-
-<h2>Win32 non-blocking I/O, attempt 1</h2>
-
-Win32 does provide something select()-like, namely the
-WaitForMultipleObjects() API. It takes an array of kernel object
-handles plus a timeout interval, and waits for either one (or all) of
-them to become 'signalled'. A handle representing an open file (for
-reading) becomes signalled once there is input available.
-<p>
-So, it is possible to observe that I/O is possible using this
-function, but not whether there's "enough" to satisfy the I/O request.
-So, if we were to mimic select() usage with WaitForMultipleObjects(),
-we'd correctly avoid blocking initially, but a thread may very well
-block waiting for their I/O requests to be satisified once the file
-handle has become signalled. [There is a fix for this -- only read
-and write one byte at a the time -- but I'm not advocating that.]
-
-
-<h2>Win32 non-blocking I/O, attempt 2</h2>
-
-Asynchronous I/O on Win32 is supported via 'overlapped I/O'; that is,
-asynchronous read and write requests can be made via the ReadFile() /
-WriteFile () APIs, specifying position and length of the operation.
-If the I/O requests cannot be handled right away, the APIs won't
-block, but return immediately (and report ERROR_IO_PENDING as their
-status code.)
-<p>
-The completion of the request can be reported in a number of ways:
-<ul>
- <li> synchronously, by blocking inside Read/WriteFile(). (this is the
- non-overlapped case, really.)
-<p>
-
- <li> as part of the overlapped I/O request, pass a HANDLE to an event
- object. The I/O system will signal this event once the request
- completed, which a waiting thread will then be able to see.
-<p>
-
- <li> by supplying a pointer to a completion routine, which will be
- called as an Asynchronous Procedure Call (APC) whenever a thread
- calls a select bunch of 'alertable' APIs.
-<p>
-
- <li> by associating the file handle with an I/O completion port. Once
- the request completes, the thread servicing the I/O completion
- port will be notified.
-</ul>
-The use of I/O completion port looks the most interesting to GHC,
-as it provides a central point where all I/O requests are reported.
-<p>
-Note: asynchronous I/O is only fully supported by OSes based on
-the NT codebase, i.e., Win9x don't permit async I/O on files and
-pipes. However, Win9x does support async socket operations, and
-I'm currently guessing here, console I/O. In my view, it would
-be acceptable to provide non-blocking I/O support for NT-based
-OSes only.
-<p>
-Here's the design I currently have in mind:
-<ul>
-<li> Upon startup, an RTS helper thread whose only purpose is to service
- an I/O completion port, is created.
-<p>
-<li> All files are opened in 'overlapping' mode, and associated
- with an I/O completion port.
-<p>
-<li> Overlapped I/O requests are used to implement read() and write().
-<p>
-<li> If the request cannot be satisified without blocking, the Haskell
- thread is put on the blocked-on-I/O thread list & a re-schedule
- is made.
-<p>
-<li> When the completion of a request is signalled via the I/O completion
- port, the RTS helper thread will move the associated Haskell thread
- from the blocked list onto the runnable list. (Clearly, care
- is required here to have another OS thread mutate internal Scheduler
- data structures.)
-
-<p>
-<li> In the event all Concurrent Haskell threads are blocked waiting on
- I/O, the main RTS thread blocks waiting on an event synchronisation
- object, which the helper thread will signal whenever it makes
- a Haskell thread runnable.
-
-</ul>
-
-I might do the communication between the RTS helper thread and the
-main RTS thread differently though: rather than have the RTS helper
-thread manipluate thread queues itself, thus requiring careful
-locking, just have it change a bit on the relevant TSO, which the main
-RTS thread can check at regular intervals (in some analog of
-awaitEvent(), for example).
-
- <p><small>
-<!-- hhmts start -->
-Last modified: Wed Aug 8 19:30:18 EST 2001
-<!-- hhmts end -->
- </small>
- </body>
-</html>
diff --git a/docs/comm/rts-libs/prelfound.html b/docs/comm/rts-libs/prelfound.html
deleted file mode 100644
index 25407eed43..0000000000
--- a/docs/comm/rts-libs/prelfound.html
+++ /dev/null
@@ -1,57 +0,0 @@
-<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
-<html>
- <head>
- <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
- <title>The GHC Commentary - Prelude Foundations</title>
- </head>
-
- <body BGCOLOR="FFFFFF">
- <h1>The GHC Commentary - Prelude Foundations</h1>
- <p>
- The standard Haskell Prelude as well as GHC's Prelude extensions are
- constructed from GHC's <a href="primitives.html">primitives</a> in a
- couple of layers.
-
- <h4><code>PrelBase.lhs</code></h4>
- <p>
- Some most elementary Prelude definitions are collected in <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/lib/std/PrelBase.lhs"><code>PrelBase.lhs</code></a>.
- In particular, it defines the boxed versions of Haskell primitive types
- - for example, <code>Int</code> is defined as
- <blockquote><pre>
-data Int = I# Int#</pre>
- </blockquote>
- <p>
- Saying that a boxed integer <code>Int</code> is formed by applying the
- data constructor <code>I#</code> to an <em>unboxed</em> integer of type
- <code>Int#</code>. Unboxed types are hardcoded in the compiler and
- exported together with the <a href="primitives.html">primitive
- operations</a> understood by GHC.
- <p>
- <code>PrelBase.lhs</code> similarly defines basic types, such as,
- boolean values
- <blockquote><pre>
-data Bool = False | True deriving (Eq, Ord)</pre>
- </blockquote>
- <p>
- the unit type
- <blockquote><pre>
-data () = ()</pre>
- </blockquote>
- <p>
- and lists
- <blockquote><pre>
-data [] a = [] | a : [a]</pre>
- </blockquote>
- <p>
- It also contains instance delarations for these types. In addition,
- <code>PrelBase.lhs</code> contains some <a href="prelude.html">tricky
- machinery</a> for efficient list handling.
-
- <p><small>
-<!-- hhmts start -->
-Last modified: Wed Aug 8 19:30:18 EST 2001
-<!-- hhmts end -->
- </small>
- </body>
-</html>
diff --git a/docs/comm/rts-libs/prelude.html b/docs/comm/rts-libs/prelude.html
deleted file mode 100644
index c93e90dddc..0000000000
--- a/docs/comm/rts-libs/prelude.html
+++ /dev/null
@@ -1,121 +0,0 @@
-<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
-<html>
- <head>
- <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
- <title>The GHC Commentary - Cunning Prelude Code</title>
- </head>
-
- <body BGCOLOR="FFFFFF">
- <h1>The GHC Commentary - Cunning Prelude Code</h1>
- <p>
- GHC's uses a many optimisations and GHC specific techniques (unboxed
- values, RULES pragmas, and so on) to make the heavily used Prelude code
- as fast as possible.
-
- <hr>
- <h4>Par, seq, and lazy</h4>
-
- In GHC.Conc you will dinf
-<blockquote><pre>
- pseq a b = a `seq` lazy b
-</pre></blockquote>
- What's this "lazy" thing. Well, <tt>pseq</tt> is a <tt>seq</tt> for a parallel setting.
- We really mean "evaluate a, then b". But if the strictness analyser sees that pseq is strict
- in b, then b might be evaluated <em>before</em> a, which is all wrong.
-<p>
-Solution: wrap the 'b' in a call to <tt>GHC.Base.lazy</tt>. This function is just the identity function,
-except that it's put into the built-in environment in MkId.lhs. That is, the MkId.lhs defn over-rides the
-inlining and strictness information that comes in from GHC.Base.hi. And that makes <tt>lazy</tt> look
-lazy, and have no inlining. So the strictness analyser gets no traction.
-<p>
-In the worker/wrapper phase, after strictness analysis, <tt>lazy</tt> is "manually" inlined (see WorkWrap.lhs),
-so we get all the efficiency back.
-<p>
-This supersedes an earlier scheme involving an even grosser hack in which par# and seq# returned an
-Int#. Now there is no seq# operator at all.
-
-
- <hr>
- <h4>fold/build</h4>
- <p>
- There is a lot of magic in <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/lib/std/PrelBase.lhs"><code>PrelBase.lhs</code></a> -
- among other things, the <a
- href="http://haskell.cs.yale.edu/ghc/docs/latest/set/rewrite-rules.html">RULES
- pragmas</a> implementing the <a
- href="http://research.microsoft.com/Users/simonpj/Papers/deforestation-short-cut.ps.Z">fold/build</a>
- optimisation. The code for <code>map</code> is
- a good example for how it all works. In the prelude code for version
- 5.03 it reads as follows:
- <blockquote><pre>
-map :: (a -> b) -> [a] -> [b]
-map _ [] = []
-map f (x:xs) = f x : map f xs
-
--- Note eta expanded
-mapFB :: (elt -> lst -> lst) -> (a -> elt) -> a -> lst -> lst
-{-# INLINE [0] mapFB #-}
-mapFB c f x ys = c (f x) ys
-
-{-# RULES
-"map" [~1] forall f xs. map f xs = build (\c n -> foldr (mapFB c f) n xs)
-"mapList" [1] forall f. foldr (mapFB (:) f) [] = map f
-"mapFB" forall c f g. mapFB (mapFB c f) g = mapFB c (f.g)
- #-}</pre>
- </blockquote>
- <p>
- Up to (but not including) phase 1, we use the <code>"map"</code> rule to
- rewrite all saturated applications of <code>map</code> with its
- build/fold form, hoping for fusion to happen. In phase 1 and 0, we
- switch off that rule, inline build, and switch on the
- <code>"mapList"</code> rule, which rewrites the foldr/mapFB thing back
- into plain map.
- <p>
- It's important that these two rules aren't both active at once
- (along with build's unfolding) else we'd get an infinite loop
- in the rules. Hence the activation control using explicit phase numbers.
- <p>
- The "mapFB" rule optimises compositions of map.
- <p>
- The mechanism as described above is new in 5.03 since January 2002,
- where the <code>[~</code><i>N</i><code>]</code> syntax for phase number
- annotations at rules was introduced. Before that the whole arrangement
- was more complicated, as the corresponding prelude code for version
- 4.08.1 shows:
- <blockquote><pre>
-map :: (a -> b) -> [a] -> [b]
-map = mapList
-
--- Note eta expanded
-mapFB :: (elt -> lst -> lst) -> (a -> elt) -> a -> lst -> lst
-mapFB c f x ys = c (f x) ys
-
-mapList :: (a -> b) -> [a] -> [b]
-mapList _ [] = []
-mapList f (x:xs) = f x : mapList f xs
-
-{-# RULES
-"map" forall f xs. map f xs = build (\c n -> foldr (mapFB c f) n xs)
-"mapFB" forall c f g. mapFB (mapFB c f) g = mapFB c (f.g)
-"mapList" forall f. foldr (mapFB (:) f) [] = mapList f
- #-}</pre>
- </blockquote>
- <p>
- This code is structured as it is, because the "map" rule first
- <em>breaks</em> the map <em>open,</em> which exposes it to the various
- foldr/build rules, and if no foldr/build rule matches, the "mapList"
- rule <em>closes</em> it again in a later phase of optimisation - after
- build was inlined. As a consequence, the whole thing depends a bit on
- the timing of the various optimisations (the map might be closed again
- before any of the foldr/build rules fires). To make the timing
- deterministic, <code>build</code> gets a <code>{-# INLINE 2 build
- #-}</code> pragma, which delays <code>build</code>'s inlining, and thus,
- the closing of the map. [NB: Phase numbering was forward at that time.]
-
- <p><small>
-<!-- hhmts start -->
-Last modified: Mon Feb 11 20:00:49 EST 2002
-<!-- hhmts end -->
- </small>
- </body>
-</html>
diff --git a/docs/comm/rts-libs/primitives.html b/docs/comm/rts-libs/primitives.html
deleted file mode 100644
index 28abc79426..0000000000
--- a/docs/comm/rts-libs/primitives.html
+++ /dev/null
@@ -1,70 +0,0 @@
-<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
-<html>
- <head>
- <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
- <title>The GHC Commentary - Primitives</title>
- </head>
-
- <body BGCOLOR="FFFFFF">
- <h1>The GHC Commentary - Primitives</h1>
- <p>
- Most user-level Haskell types and functions provided by GHC (in
- particular those from the Prelude and GHC's Prelude extensions) are
- internally constructed from even more elementary types and functions.
- Most notably, GHC understands a notion of <em>unboxed types,</em> which
- are the Haskell representation of primitive bit-level integer, float,
- etc. types (as opposed to their boxed, heap allocated counterparts) -
- cf. <a
- href="http://research.microsoft.com/Users/simonpj/Papers/unboxed-values.ps.Z">"Unboxed
- Values as First Class Citizens."</a>
-
- <h4>The Ultimate Source of Primitives</h4>
- <p>
- The hardwired types of GHC are brought into scope by the module
- <code>PrelGHC</code>. This modules only exists in the form of a
- handwritten interface file <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/lib/std/PrelGHC.hi-boot"><code>PrelGHC.hi-boot</code>,</a>
- which lists the type and function names, as well as instance
- declarations. The actually types of these names as well as their
- implementation is hardwired into GHC. Note that the names in this file
- are z-encoded, and in particular, identifiers ending on <code>zh</code>
- denote user-level identifiers ending in a hash mark (<code>#</code>),
- which is used to flag unboxed values or functions operating on unboxed
- values. For example, we have <code>Char#</code>, <code>ord#</code>, and
- so on.
-
- <h4>The New Primitive Definition Scheme</h4>
- <p>
- As of (about) the development version 4.11, the types and various
- properties of primitive operations are defined in the file <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/prelude/primops.txt.pp"><code>primops.txt.pp</code></a>.
- (Personally, I don't think that the <code>.txt</code> suffix is really
- appropriate, as the file is used for automatic code generation; the
- recent addition of <code>.pp</code> means that the file is now mangled
- by cpp.)
- <p>
- The utility <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/utils/genprimopcode/"><code>genprimopcode</code></a>
- generates a series of Haskell files from <code>primops.txt</code>, which
- encode the types and various properties of the primitive operations as
- compiler internal data structures. These Haskell files are not complete
- modules, but program fragments, which are included into compiler modules
- during the GHC build process. The generated include files can be found
- in the directory <code>fptools/ghc/compiler/</code> and carry names
- matching the pattern <code>primop-*.hs-incl</code>. They are generate
- during the execution of the <code>boot</code> target in the
- <code>fptools/ghc/</code> directory. This scheme significantly
- simplifies the maintenance of primitive operations.
- <p>
- As of development version 5.02, the <code>primops.txt</code> file also allows the
- recording of documentation about intended semantics of the primitives. This can
- be extracted into a latex document (or rather, into latex document fragments)
- via an appropriate switch to <code>genprimopcode</code>. In particular, see <code>primops.txt</code>
- for full details of how GHC is configured to cope with different machine word sizes.
- <p><small>
-<!-- hhmts start -->
-Last modified: Mon Nov 26 18:03:16 EST 2001
-<!-- hhmts end -->
- </small>
- </body>
-</html>
diff --git a/docs/comm/rts-libs/stgc.html b/docs/comm/rts-libs/stgc.html
deleted file mode 100644
index 196ec9150d..0000000000
--- a/docs/comm/rts-libs/stgc.html
+++ /dev/null
@@ -1,45 +0,0 @@
-<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
-<html>
- <head>
- <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
- <title>The GHC Commentary - Spineless Tagless C</title>
- </head>
-
- <body BGCOLOR="FFFFFF">
- <h1>The GHC Commentary - Spineless Tagless C</h1>
- <p>
- The C code generated by GHC doesn't use higher-level features of C to be
- able to control as precisely as possible what code is generated.
- Moreover, it uses special features of gcc (such as, first class labels)
- to produce more efficient code.
- <p>
- STG C makes ample use of C's macro language to define idioms, which also
- reduces the size of the generated C code (thus, reducing I/O times).
- These macros are defined in the C headers located in GHC's <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/includes/"><code>includes</code></a>
- directory.
-
- <h4><code>TailCalls.h</code></h4>
- <p>
- <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/includes/TailCalls.h"><code>TailCalls.h</code></a>
- defines how tail calls are implemented - and in particular - optimised
- in GHC generated code. The default case, for an architecture for which
- GHC is not optimised, is to use the mini interpreter described in the <a
- href="http://research.microsoft.com/copyright/accept.asp?path=/users/simonpj/papers/spineless-tagless-gmachine.ps.gz&pub=34">STG paper.</a>
- <p>
- For supported architectures, various tricks are used to generate
- assembler implementing proper tail calls. On i386, gcc's first class
- labels are used to directly jump to a function pointer. Furthermore,
- markers of the form <code>--- BEGIN ---</code> and <code>--- END
- ---</code> are added to the assembly right after the function prologue
- and before the epilogue. These markers are used by <a
- href="../the-beast/mangler.html">the Evil Mangler.</a>
-
- <p><small>
-<!-- hhmts start -->
-Last modified: Wed Aug 8 19:28:29 EST 2001
-<!-- hhmts end -->
- </small>
- </body>
-</html>
diff --git a/docs/comm/rts-libs/threaded-rts.html b/docs/comm/rts-libs/threaded-rts.html
deleted file mode 100644
index 739dc8d58a..0000000000
--- a/docs/comm/rts-libs/threaded-rts.html
+++ /dev/null
@@ -1,126 +0,0 @@
-<html>
- <head>
- <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
- <title>The GHC Commentary - The Multi-threaded runtime, and multiprocessor execution</title>
- </head>
-
- <body>
- <h1>The GHC Commentary - The Multi-threaded runtime, and multiprocessor execution</h1>
-
- <p>This section of the commentary explains the structure of the runtime system
- when used in threaded or SMP mode.</p>
-
- <p>The <em>threaded</em> version of the runtime supports
- bound threads and non-blocking foreign calls, and an overview of its
- design can be found in the paper <a
- href="http://www.haskell.org/~simonmar/papers/conc-ffi.pdf">Extending
- the Haskell Foreign Function Interface with Concurrency</a>. To
- compile the runtime with threaded support, add the line
-
-<pre>GhcRTSWays += thr</pre>
-
- to <tt>mk/build.mk</tt>. When building C code in the runtime for the threaded way,
- the symbol <tt>THREADED_RTS</tt> is defined (this is arranged by the
- build system when building for way <tt>thr</tt>, see
- <tt>mk/config.mk</tt>). To build a Haskell program
- with the threaded runtime, pass the flag <tt>-threaded</tt> to GHC (this
- can be used in conjunction with <tt>-prof</tt>, and possibly
- <tt>-debug</tt> and others depending on which versions of the RTS have
- been built.</p>
-
- <p>The <em>SMP</em> version runtime supports the same facilities as the
- threaded version, and in addition supports execution of Haskell code by
- multiple simultaneous OS threads. For SMP support, both the runtime and
- the libraries must be built a special way: add the lines
-
- <pre>
-GhcRTSWays += thr
-GhcLibWays += s</pre>
-
- to <tt>mk/build.mk</tt>. To build Haskell code for
- SMP execution, use the flag <tt>-smp</tt> to GHC (this can be used in
- conjunction with <tt>-debug</tt>, but no other way-flags at this time).
- When building C code in the runtime for SMP
- support, the symbol <tt>SMP</tt> is defined (this is arranged by the
- compiler when the <tt>-smp</tt> flag is given, see
- <tt>ghc/compiler/main/StaticFlags.hs</tt>).</p>
-
- <p>When building the runtime in either the threaded or SMP ways, the symbol
- <tt>RTS_SUPPORTS_THREADS</tt> will be defined (see <tt>Rts.h</tt>).</p>
-
- <h2>Overall design</h2>
-
- <p>The system is based around the notion of a <tt>Capability</tt>. A
- <tt>Capability</tt> is an object that represents both the permission to
- execute some Haskell code, and the state required to do so. In order
- to execute some Haskell code, a thread must therefore hold a
- <tt>Capability</tt>. The available pool of capabilities is managed by
- the <tt>Capability</tt> API, described below.</p>
-
- <p>In the threaded runtime, there is only a single <tt>Capability</tt> in the
- system, indicating that only a single thread can be executing Haskell
- code at any one time. In the SMP runtime, there can be an arbitrary
- number of capabilities selectable at runtime with the <tt>+RTS -N<em>n</em></tt>
- flag; in practice the number is best chosen to be the same as the number of
- processors on the host machine.</p>
-
- <p>There are a number of OS threads running code in the runtime. We call
- these <em>tasks</em> to avoid confusion with Haskell <em>threads</em>.
- Tasks are managed by the <tt>Task</tt> subsystem, which is mainly
- concerned with keeping track of statistics such as how much time each
- task spends executing Haskell code, and also keeping track of how many
- tasks are around when we want to shut down the runtime.</p>
-
- <p>Some tasks are created by the runtime itself, and some may be here
- as a result of a call to Haskell from foreign code (we
- call this an in-call). The
- runtime can support any number of concurrent foreign in-calls, but the
- number of these calls that will actually run Haskell code in parallel is
- determined by the number of available capabilities. Each in-call creates
- a <em>bound thread</em>, as described in the FFI/Concurrency paper (cited
- above).</p>
-
- <p>In the future we may want to bind a <tt>Capability</tt> to a particular
- processor, so that we can support a notion of affinity - avoiding
- accidental migration of work from one CPU to another, so that we can make
- best use of a CPU's local cache. For now, the design ignores this
- issue.</p>
-
- <h2>The <tt>OSThreads</tt> interface</h2>
-
- <p>This interface is merely an abstraction layer over the OS-specific APIs
- for managing threads. It has two main implementations: Win32 and
- POSIX.</p>
-
- <p>This is the entirety of the interface:</p>
-
-<pre>
-/* Various abstract types */
-typedef Mutex;
-typedef Condition;
-typedef OSThreadId;
-
-extern OSThreadId osThreadId ( void );
-extern void shutdownThread ( void );
-extern void yieldThread ( void );
-extern int createOSThread ( OSThreadId* tid,
- void (*startProc)(void) );
-
-extern void initCondition ( Condition* pCond );
-extern void closeCondition ( Condition* pCond );
-extern rtsBool broadcastCondition ( Condition* pCond );
-extern rtsBool signalCondition ( Condition* pCond );
-extern rtsBool waitCondition ( Condition* pCond,
- Mutex* pMut );
-
-extern void initMutex ( Mutex* pMut );
- </pre>
-
- <h2>The Task interface</h2>
-
- <h2>The Capability interface</h2>
-
- <h2>Multiprocessor Haskell Execution</h2>
-
- </body>
-</html>
diff --git a/docs/comm/the-beast/alien.html b/docs/comm/the-beast/alien.html
deleted file mode 100644
index 3d4776ebc9..0000000000
--- a/docs/comm/the-beast/alien.html
+++ /dev/null
@@ -1,56 +0,0 @@
-<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
-<html>
- <head>
- <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
- <title>The GHC Commentary - Alien Functions</title>
- </head>
-
- <body BGCOLOR="FFFFFF">
- <h1>The GHC Commentary - Alien Functions</h1>
- <p>
- GHC implements experimental (by now it is actually quite well tested)
- support for access to foreign functions and generally the interaction
- between Haskell code and code written in other languages. Code
- generation in this context can get quite tricky. This section attempts
- to cast some light on this aspect of the compiler.
-
- <h4>FFI Stub Files</h4>
- <p>
- For each Haskell module that contains a <code>foreign export
- dynamic</code> declaration, GHC generates a <code>_stub.c</code> file
- that needs to be linked with any program that imports the Haskell
- module. When asked about it <a
- href="mailto:simonmar@microsoft.com">Simon Marlow</a> justified the
- existence of these files as follows:
- <blockquote>
- The stub files contain the helper function which invokes the Haskell
- code when called from C.
- <p>
- Each time the foreign export dynamic is invoked to create a new
- callback function, a small piece of code has to be dynamically
- generated (by code in <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/rts/Adjustor.c"><code>Adjustor.c</code></a>). It is the address of this dynamically generated bit of
- code that is returned as the <code>Addr</code> (or <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/hslibs/lang/Ptr.lhs"><code>Ptr</code></a>).
- When called from C, the dynamically generated code must somehow invoke
- the Haskell function which was originally passed to the
- f.e.d. function -- it does this by invoking the helper function,
- passing it a <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/hslibs/lang/StablePtr.lhs"><code>StablePtr</code></a>
- to the Haskell function. It's split this way for two reasons: the
- same helper function can be used each time the f.e.d. function is
- called, and to keep the amount of dynamically generated code to a
- minimum.
- </blockquote>
- <p>
- The stub code is generated by <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/deSugar/DsForeign.lhs"><code>DSForeign</code></a><code>.fexportEntry</code>.
-
-
- <p><small>
-<!-- hhmts start -->
-Last modified: Fri Aug 10 11:47:41 EST 2001
-<!-- hhmts end -->
- </small>
- </body>
-</html>
diff --git a/docs/comm/the-beast/basicTypes.html b/docs/comm/the-beast/basicTypes.html
deleted file mode 100644
index b411e4c5a9..0000000000
--- a/docs/comm/the-beast/basicTypes.html
+++ /dev/null
@@ -1,132 +0,0 @@
-<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
-<html>
- <head>
- <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
- <title>The GHC Commentary - The Basics</title>
- </head>
-
- <body BGCOLOR="FFFFFF">
- <h1>The GHC Commentary - The Basics</h1>
- <p>
- The directory <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/basicTypes/"><code>fptools/ghc/compiler/basicTypes/</code></a>
- contains modules that define some of the essential types definition for
- the compiler - such as, identifiers, variables, modules, and unique
- names. Some of those are discussed in the following. See elsewhere for more
- detailed information on:
- <ul>
- <li> <a href="vars.html"><code>Var</code>s, <code>Id</code>s, and <code>TyVar</code>s</a>
- <li> <a href="renamer.html"><code>OccName</code>s, <code>RdrName</code>s, and <code>Names</code>s</a>
- </ul>
-
- <h2>Elementary Types</h2>
-
- <h4><code>Id</code>s</h4>
- <p>
- An <code>Id</code> (defined in <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/basicTypes/Id.lhs"><code>Id.lhs</code></a>
- essentially records information about value and data constructor
- identifiers -- to be precise, in the case of data constructors, two
- <code>Id</code>s are used to represent the worker and wrapper functions
- for the data constructor, respectively. The information maintained in
- the <code>Id</code> abstraction includes among other items strictness,
- occurrence, specialisation, and unfolding information.
- <p>
- Due to the way <code>Id</code>s are used for data constructors,
- all <code>Id</code>s are represented as variables, which contain a
- <code>varInfo</code> field of abstract type <code><a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/basicTypes/IdInfo.lhs">IdInfo</a>.IdInfo</code>.
- This is where the information about <code>Id</code>s is really stored.
- The following is a (currently, partial) list of the various items in an
- <code>IdInfo</code>:
- <p>
- <dl>
- <dt><a name="occInfo">Occurrence information</a>
- <dd>The <code>OccInfo</code> data type is defined in the module <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/basicTypes/BasicTypes.lhs"><code>BasicTypes.lhs</code></a>.
- Apart from the trivial <code>NoOccInfo</code>, it distinguishes
- between variables that do not occur at all (<code>IAmDead</code>),
- occur just once (<code>OneOcc</code>), or a <a
- href="simplifier.html#loopBreaker">loop breakers</a>
- (<code>IAmALoopBreaker</code>).
- </dl>
-
- <h2>Sets, Finite Maps, and Environments</h2>
- <p>
- Sets of variables, or more generally names, which are needed throughout
- the compiler, are provided by the modules <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/basicTypes/VarSet.lhs"><code>VarSet.lhs</code></a>
- and <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/basicTypes/NameSet.lhs"><code>NameSet.lhs</code></a>,
- respectively. Moreover, frequently maps from variables (or names) to
- other data is needed. For example, a substitution is represented by a
- finite map from variable names to expressions. Jobs like this are
- solved by means of variable and name environments implemented by the
- modules <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/basicTypes/VarEnv.lhs"><code>VarEnv.lhs</code></a>
- and <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/basicTypes/NameEnv.lhs"><code>NameEnv.lhs</code></a>.
-
- <h4>The Module <code>VarSet</code></h4>
- <p>
- The Module <code>VarSet</code> provides the types <code>VarSet</code>,
- <code>IdSet</code>, and <code>TyVarSet</code>, which are synonyms in the
- current implementation, as <code>Var</code>, <code>Id</code>, and
- <code>TyVar</code> are synonyms. The module provides all the operations
- that one would expect including the creating of sets from individual
- variables and lists of variables, union and intersection operations,
- element checks, deletion, filter, fold, and map functions.
- <p>
- The implementation is based on <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/utils/UniqSet.lhs"><code>UniqSet</code></a>s,
- which in turn are simply <a href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/utils/UniqFM.lhs"><code>UniqFM</code></a>s
- (i.e., finite maps with uniques as keys) that map each unique to the
- variable that it represents.
-
- <h4>The Module <code>NameSet</code></h4>
- <p>
- The Module <code>NameSet</code> provides the same functionality as
- <code>VarSet</code> only for <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/basicTypes/Name.lhs"><code>Name</code></a>s.
- As for the difference between <code>Name</code>s and <code>Var</code>s,
- a <code>Var</code> is built from a <code>Name</code> plus additional
- information (mostly importantly type information).
-
- <h4>The Module <code>VarEnv</code></h4>
- <p>
- The module <code>VarEnv</code> provides the types <code>VarEnv</code>,
- <code>IdEnv</code>, and <code>TyVarEnv</code>, which are again
- synonyms. The provided base functionality is similar to
- <code>VarSet</code> with the main difference that a type <code>VarEnv
- T</code> associates a value of type <code>T</code> with each variable in
- the environment, thus effectively implementing a finite map from
- variables to values of type <code>T</code>.
- <p>
- The implementation of <code>VarEnv</code> is also by <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/utils/UniqFM.lhs"><code>UniqFM</code></a>,
- which entails the slightly surprising implication that it is
- <em>not</em> possible to retrieve the domain of a variable environment.
- In other words, there is no function corresponding to
- <code>VarSet.varSetElems :: VarSet -> [Var]</code> in
- <code>VarEnv</code>. This is because the <code>UniqFM</code> used to
- implement <code>VarEnv</code> stores only the unique corresponding to a
- variable in the environment, but not the entire variable (and there is
- no mapping from uniques to variables).
- <p>
- In addition to plain variable environments, the module also contains
- special substitution environments - the type <code>SubstEnv</code> -
- that associates variables with a special purpose type
- <code>SubstResult</code>.
-
- <h4>The Module <code>NameEnv</code></h4>
- <p>
- The type <code>NameEnv.NameEnv</code> is like <code>VarEnv</code> only
- for <code>Name</code>s.
-
- <p><hr><small>
-<!-- hhmts start -->
-Last modified: Tue Jan 8 18:29:52 EST 2002
-<!-- hhmts end -->
- </small>
- </body>
-</html>
diff --git a/docs/comm/the-beast/coding-style.html b/docs/comm/the-beast/coding-style.html
deleted file mode 100644
index 41347c6902..0000000000
--- a/docs/comm/the-beast/coding-style.html
+++ /dev/null
@@ -1,230 +0,0 @@
-<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
-<html>
- <head>
- <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
- <title>The GHC Commentary - Coding Style Guidelines</title>
- </head>
-
- <body BGCOLOR="FFFFFF">
- <h1>The GHC Commentary - Coding Style Guidelines</h1>
-
- <p>This is a rough description of some of the coding practices and
- style that we use for Haskell code inside <tt>ghc/compiler</tt>.
-
- <p>The general rule is to stick to the same coding style as is
- already used in the file you're editing. If you must make
- stylistic changes, commit them separately from functional changes,
- so that someone looking back through the change logs can easily
- distinguish them.
-
- <h2>To literate or not to literate?</h2>
-
- <p>In GHC we use a mixture of literate (<tt>.lhs</tt>) and
- non-literate (<tt>.hs</tt>) source. I (Simon M.) prefer to use
- non-literate style, because I think the
- <tt>\begin{code}..\end{code}</tt> clutter up the source too much,
- and I like to use Haddock-style comments (we haven't tried
- processing the whole of GHC with Haddock yet, though).
-
- <h2>To CPP or not to CPP?</h2>
-
- <p>We pass all the compiler sources through CPP. The
- <tt>-cpp</tt> flag is always added by the build system.
-
- <p>The following CPP symbols are used throughout the compiler:
-
- <dl>
- <dt><tt>DEBUG</tt></dt>
-
- <dd>Used to enables extra checks and debugging output in the
- compiler. The <tt>ASSERT</tt> macro (see <tt>HsVersions.h</tt>)
- provides assertions which disappear when <tt>DEBUG</tt> is not
- defined.
-
- <p>All debugging output should be placed inside <tt>#ifdef
- DEBUG</tt>; we generally use this to provide warnings about
- strange cases and things that might warrant investigation. When
- <tt>DEBUG</tt> is off, the compiler should normally be silent
- unless something goes wrong (exception when the verbosity level
- is greater than zero).
-
- <p>A good rule of thumb is that <tt>DEBUG</tt> shouldn't add
- more than about 10-20% to the compilation time. This is the case
- at the moment. If it gets too expensive, we won't use it. For
- more expensive runtime checks, consider adding a flag - see for
- example <tt>-dcore-lint</tt>.
- </dd>
-
- <dt><tt>GHCI</tt></dt>
-
- <dd>Enables GHCi support, including the byte code generator and
- interactive user interface. This isn't the default, because the
- compiler needs to be bootstrapped with itself in order for GHCi
- to work properly. The reason is that the byte-code compiler and
- linker are quite closely tied to the runtime system, so it is
- essential that GHCi is linked with the most up-to-date RTS.
- Another reason is that the representation of certain datatypes
- must be consistent between GHCi and its libraries, and if these
- were inconsistent then disaster could follow.
- </dd>
-
- </dl>
-
- <h2>Platform tests</h2>
-
- <p>There are three platforms of interest to GHC:
-
- <ul>
- <li>The <b>Build</b> platform. This is the platform on which we
- are building GHC.</li>
- <li>The <b>Host</b> platform. This is the platform on which we
- are going to run this GHC binary, and associated tools.</li>
- <li>The <b>Target</b> platform. This is the platform for which
- this GHC binary will generate code.</li>
- </ul>
-
- <p>At the moment, there is very limited support for having
- different values for buil, host, and target. In particular:</p>
-
- <ul>
- <li>The build platform is currently always the same as the host
- platform. The build process needs to use some of the tools in
- the source tree, for example <tt>ghc-pkg</tt> and
- <tt>hsc2hs</tt>.</li>
-
- <li>If the target platform differs from the host platform, then
- this is generally for the purpose of building <tt>.hc</tt> files
- from Haskell source for porting GHC to the target platform.
- Full cross-compilation isn't supported (yet).</li>
- </ul>
-
- <p>In the compiler's source code, you may make use of the
- following CPP symbols:</p>
-
- <ul>
- <li><em>xxx</em><tt>_TARGET_ARCH</tt></li>
- <li><em>xxx</em><tt>_TARGET_VENDOR</tt></li>
- <li><em>xxx</em><tt>_TARGET_OS</tt></li>
- <li><em>xxx</em><tt>_HOST_ARCH</tt></li>
- <li><em>xxx</em><tt>_HOST_VENDOR</tt></li>
- <li><em>xxx</em><tt>_HOST_OS</tt></li>
- </ul>
-
- <p>where <em>xxx</em> is the appropriate value:
- eg. <tt>i386_TARGET_ARCH</tt>.
-
- <h2>Compiler versions</h2>
-
- <p>GHC must be compilable by every major version of GHC from 5.02
- onwards, and itself. It isn't necessary for it to be compilable
- by every intermediate development version (that includes last
- week's CVS sources).
-
- <p>To maintain compatibility, use <tt>HsVersions.h</tt> (see
- below) where possible, and try to avoid using <tt>#ifdef</tt> in
- the source itself.
-
- <h2>The source file</h2>
-
- <p>We now describe a typical source file, annotating stylistic
- choices as we go.
-
-<pre>
-{-# OPTIONS ... #-}
-</pre>
-
- <p>An <tt>OPTIONS</tt> pragma is optional, but if present it
- should go right at the top of the file. Things you might want to
- put in <tt>OPTIONS</tt> include:
-
- <ul>
- <li><tt>-#include</tt> options to bring into scope prototypes
- for FFI declarations</li>
- <li><tt>-fvia-C</tt> if you know that
- this module won't compile with the native code generator.
- </ul>
-
- <p>Don't bother putting <tt>-cpp</tt> or <tt>-fglasgow-exts</tt>
- in the <tt>OPTIONS</tt> pragma; these are already added to the
- command line by the build system.
-
-
-<pre>
-module Foo (
- T(..),
- foo, -- :: T -> T
- ) where
-</pre>
-
- <p>We usually (99% of the time) include an export list. The only
- exceptions are perhaps where the export list would list absolutely
- everything in the module, and even then sometimes we do it anyway.
-
- <p>It's helpful to give type signatures inside comments in the
- export list, but hard to keep them consistent, so we don't always
- do that.
-
-<pre>
-#include "HsVersions.h"
-</pre>
-
- <p><tt>HsVersions.h</tt> is a CPP header file containing a number
- of macros that help smooth out the differences between compiler
- versions. It defines, for example, macros for library module
- names which have moved between versions. Take a look.
-
-<pre>
--- friends
-import SimplMonad
-
--- GHC
-import CoreSyn
-import Id ( idName, idType )
-import BasicTypes
-
--- libraries
-import DATA_IOREF ( newIORef, readIORef )
-
--- std
-import List ( partition )
-import Maybe ( fromJust )
-</pre>
-
- <p>List imports in the following order:
-
- <ul>
- <li>Local to this subsystem (or directory) first</li>
-
- <li>Compiler imports, generally ordered from specific to generic
- (ie. modules from <tt>utils/</tt> and <tt>basicTypes/</tt>
- usually come last)</li>
-
- <li>Library imports</li>
-
- <li>Standard Haskell 98 imports last</li>
- </ul>
-
- <p>Import library modules from the <tt>base</tt> and
- <tt>haskell98</tt> packages only. Use <tt>#defines</tt> in
- <tt>HsVersions.h</tt> when the modules names differ between
- versions of GHC (eg. <tt>DATA_IOREF</tt> in the example above).
- For code inside <tt>#ifdef GHCI</tt>, don't need to worry about GHC
- versioning (because we are bootstrapped).
-
- <p>We usually use import specs to give an explicit list of the
- entities imported from a module. The main reason for doing this is
- so that you can search the file for an entity and find which module
- it comes from. However, huge import lists can be a pain to
- maintain, so we often omit the import specs when they start to get
- long (actually I start omitting them when they don't fit on one
- line --Simon M.). Tip: use GHC's <tt>-fwarn-unused-imports</tt>
- flag so that you get notified when an import isn't being used any
- more.
-
- <p>If the module can be compiled multiple ways (eg. GHCI
- vs. non-GHCI), make sure the imports are properly <tt>#ifdefed</tt>
- too, so as to avoid spurious unused import warnings.
-
- <p><em>ToDo: finish this</em>
- </body>
-</html>
diff --git a/docs/comm/the-beast/data-types.html b/docs/comm/the-beast/data-types.html
deleted file mode 100644
index 4ec220c937..0000000000
--- a/docs/comm/the-beast/data-types.html
+++ /dev/null
@@ -1,242 +0,0 @@
-<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
-<html>
- <head>
- <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
- <title>The GHC Commentary - Data types and data constructors</title>
- </head>
-
- <body BGCOLOR="FFFFFF">
- <h1>The GHC Commentary - Data types and data constructors</h1>
- <p>
-
-This chapter was thoroughly changed Feb 2003.
-
-<h2>Data types</h2>
-
-Consider the following data type declaration:
-
-<pre>
- data T a = MkT !(a,a) !(T a) | Nil
-
- f x = case x of
- MkT p q -> MkT p (q+1)
- Nil -> Nil
-</pre>
-The user's source program mentions only the constructors <tt>MkT</tt>
-and <tt>Nil</tt>. However, these constructors actually <em>do</em> something
-in addition to building a data value. For a start, <tt>MkT</tt> evaluates
-its arguments. Secondly, with the flag <tt>-funbox-strict-fields</tt> GHC
-will flatten (or unbox) the strict fields. So we may imagine that there's the
-<em>source</em> constructor <tt>MkT</tt> and the <em>representation</em> constructor
-<tt>MkT</tt>, and things start to get pretty confusing.
-<p>
-GHC now generates three unique <tt>Name</tt>s for each data constructor:
-<pre>
- ---- OccName ------
- String Name space Used for
- ---------------------------------------------------------------------------
- The "source data con" MkT DataName The DataCon itself
- The "worker data con" MkT VarName Its worker Id
- aka "representation data con"
- The "wrapper data con" $WMkT VarName Its wrapper Id (optional)
-</pre>
-Recall that each occurrence name (OccName) is a pair of a string and a
-name space (see <a href="names.html">The truth about names</a>), and
-two OccNames are considered the same only if both components match.
-That is what distinguishes the name of the name of the DataCon from
-the name of its worker Id. To keep things unambiguous, in what
-follows we'll write "MkT{d}" for the source data con, and "MkT{v}" for
-the worker Id. (Indeed, when you dump stuff with "-ddumpXXX", if you
-also add "-dppr-debug" you'll get stuff like "Foo {- d rMv -}". The
-"d" part is the name space; the "rMv" is the unique key.)
-<p>
-Each of these three names gets a distinct unique key in GHC's name cache.
-
-<h2>The life cycle of a data type</h2>
-
-Suppose the Haskell source looks like this:
-<pre>
- data T a = MkT !(a,a) !Int | Nil
-
- f x = case x of
- Nil -> Nil
- MkT p q -> MkT p (q+1)
-</pre>
-When the parser reads it in, it decides which name space each lexeme comes
-from, thus:
-<pre>
- data T a = MkT{d} !(a,a) !Int | Nil{d}
-
- f x = case x of
- Nil{d} -> Nil{d}
- MkT{d} p q -> MkT{d} p (q+1)
-</pre>
-Notice that in the Haskell source <em>all data contructors are named via the "source data con" MkT{d}</em>,
-whether in pattern matching or in expressions.
-<p>
-In the translated source produced by the type checker (-ddump-tc), the program looks like this:
-<pre>
- f x = case x of
- Nil{d} -> Nil{v}
- MkT{d} p q -> $WMkT p (q+1)
-
-</pre>
-Notice that the type checker replaces the occurrence of MkT by the <em>wrapper</em>, but
-the occurrence of Nil by the <em>worker</em>. Reason: Nil doesn't have a wrapper because there is
-nothing to do in the wrapper (this is the vastly common case).
-<p>
-Though they are not printed out by "-ddump-tc", behind the scenes, there are
-also the following: the data type declaration and the wrapper function for MkT.
-<pre>
- data T a = MkT{d} a a Int# | Nil{d}
-
- $WMkT :: (a,a) -> T a -> T a
- $WMkT p t = case p of
- (a,b) -> seq t (MkT{v} a b t)
-</pre>
-Here, the <em>wrapper</em> <tt>$WMkT</tt> evaluates and takes apart the argument <tt>p</tt>,
-evaluates the argument <tt>t</tt>, and builds a three-field data value
-with the <em>worker</em> constructor <tt>MkT{v}</tt>. (There are more notes below
-about the unboxing of strict fields.) The worker $WMkT is called an <em>implicit binding</em>,
-because it's introduced implicitly by the data type declaration (record selectors
-are also implicit bindings, for example). Implicit bindings are injected into the code
-just before emitting code or External Core.
-<p>
-After desugaring into Core (-ddump-ds), the definition of <tt>f</tt> looks like this:
-<pre>
- f x = case x of
- Nil{d} -> Nil{v}
- MkT{d} a b r -> let { p = (a,b); q = I# r } in
- $WMkT p (q+1)
-</pre>
-Notice the way that pattern matching has been desugared to take account of the fact
-that the "real" data constructor MkT has three fields.
-<p>
-By the time the simplifier has had a go at it, <tt>f</tt> will be transformed to:
-<pre>
- f x = case x of
- Nil{d} -> Nil{v}
- MkT{d} a b r -> MkT{v} a b (r +# 1#)
-</pre>
-Which is highly cool.
-
-
-<h2> The constructor wrapper functions </h2>
-
-The wrapper functions are automatically generated by GHC, and are
-really emitted into the result code (albeit only after CorePre; see
-<tt>CorePrep.mkImplicitBinds</tt>).
-The wrapper functions are inlined very
-vigorously, so you will not see many occurrences of the wrapper
-functions in an optimised program, but you may see some. For example,
-if your Haskell source has
-<pre>
- map MkT xs
-</pre>
-then <tt>$WMkT</tt> will not be inlined (because it is not applied to anything).
-That is why we generate real top-level bindings for the wrapper functions,
-and generate code for them.
-
-
-<h2> The constructor worker functions </h2>
-
-Saturated applications of the constructor worker function MkT{v} are
-treated specially by the code generator; they really do allocation.
-However, we do want a single, shared, top-level definition for
-top-level nullary constructors (like True and False). Furthermore,
-what if the code generator encounters a non-saturated application of a
-worker? E.g. <tt>(map Just xs)</tt>. We could declare that to be an
-error (CorePrep should saturate them). But instead we currently
-generate a top-level definition for each constructor worker, whether
-nullary or not. It takes the form:
-<pre>
- MkT{v} = \ p q r -> MkT{v} p q r
-</pre>
-This is a real hack. The occurrence on the RHS is saturated, so the code generator (both the
-one that generates abstract C and the byte-code generator) treats it as a special case and
-allocates a MkT; it does not make a recursive call! So now there's a top-level curried
-version of the worker which is available to anyone who wants it.
-<p>
-This strange definition is not emitted into External Core. Indeed, you might argue that
-we should instead pass the list of <tt>TyCon</tt>s to the code generator and have it
-generate magic bindings directly. As it stands, it's a real hack: see the code in
-CorePrep.mkImplicitBinds.
-
-
-<h2> External Core </h2>
-
-When emitting External Core, we should see this for our running example:
-
-<pre>
- data T a = MkT a a Int# | Nil{d}
-
- $WMkT :: (a,a) -> T a -> T a
- $WMkT p t = case p of
- (a,b) -> seq t (MkT a b t)
-
- f x = case x of
- Nil -> Nil
- MkT a b r -> MkT a b (r +# 1#)
-</pre>
-Notice that it makes perfect sense as a program all by itself. Constructors
-look like constructors (albeit not identical to the original Haskell ones).
-<p>
-When reading in External Core, the parser is careful to read it back in just
-as it was before it was spat out, namely:
-<pre>
- data T a = MkT{d} a a Int# | Nil{d}
-
- $WMkT :: (a,a) -> T a -> T a
- $WMkT p t = case p of
- (a,b) -> seq t (MkT{v} a b t)
-
- f x = case x of
- Nil{d} -> Nil{v}
- MkT{d} a b r -> MkT{v} a b (r +# 1#)
-</pre>
-
-
-<h2> Unboxing strict fields </h2>
-
-If GHC unboxes strict fields (as in the first argument of <tt>MkT</tt> above),
-it also transforms
-source-language case expressions. Suppose you write this in your Haskell source:
-<pre>
- case e of
- MkT p t -> ..p..t..
-</pre>
-GHC will desugar this to the following Core code:
-<pre>
- case e of
- MkT a b t -> let p = (a,b) in ..p..t..
-</pre>
-The local let-binding reboxes the pair because it may be mentioned in
-the case alternative. This may well be a bad idea, which is why
-<tt>-funbox-strict-fields</tt> is an experimental feature.
-<p>
-It's essential that when importing a type <tt>T</tt> defined in some
-external module <tt>M</tt>, GHC knows what representation was used for
-that type, and that in turn depends on whether module <tt>M</tt> was
-compiled with <tt>-funbox-strict-fields</tt>. So when writing an
-interface file, GHC therefore records with each data type whether its
-strict fields (if any) should be unboxed.
-
-<h2> Labels and info tables </h2>
-
-<em>Quick rough notes: SLPJ March 2003</em>.
-<p>
-Every data constructor <tt>C</tt>has two info tables:
-<ul>
-<li> The static info table (label <tt>C_static_info</tt>), used for statically-allocated constructors.
-
-<li> The dynamic info table (label <tt>C_con_info</tt>), used for dynamically-allocated constructors.
-</ul>
-Statically-allocated constructors are not moved by the garbage collector, and therefore have a different closure
-type from dynamically-allocated constructors; hence they need
-a distinct info table.
-Both info tables share the same entry code, but since the entry code is phyiscally juxtaposed with the
-info table, it must be duplicated (<tt>C_static_entry</tt> and <tt>C_con_entry</tt> respectively).
-
- </body>
-</html>
-
diff --git a/docs/comm/the-beast/desugar.html b/docs/comm/the-beast/desugar.html
deleted file mode 100644
index a66740259b..0000000000
--- a/docs/comm/the-beast/desugar.html
+++ /dev/null
@@ -1,156 +0,0 @@
-<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
-<html>
- <head>
- <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
- <title>The GHC Commentary - Sugar Free: From Haskell To Core</title>
- </head>
-
- <body BGCOLOR="FFFFFF">
- <h1>The GHC Commentary - Sugar Free: From Haskell To Core</h1>
- <p>
- Up until after type checking, GHC keeps the source program in an
- abstract representation of Haskell source without removing any of the
- syntactic sugar (such as, list comprehensions) that could easily be
- represented by more primitive Haskell. This complicates part of the
- front-end considerably as the abstract syntax of Haskell (as exported by
- the module <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/hsSyn/HsSyn.lhs"><code>HsSyn</code></a>)
- is much more complex than a simplified representation close to, say, the
- <a href="http://haskell.org/onlinereport/intro.html#sect1.2">Haskell
- Kernel</a> would be. However, having a representation that is as close
- as possible to the surface syntax simplifies the generation of clear
- error messages. As GHC (quite in contrast to "conventional" compilers)
- prints code fragments as part of error messages, the choice of
- representation is especially important.
- <p>
- Nonetheless, as soon as the input has passed all static checks, it is
- transformed into GHC's principal intermediate language that goes by the
- name of <em>Core</em> and whose representation is exported by the
- module <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/coreSyn/CoreSyn.lhs"><code>CoreSyn</code></a>.
- All following compiler phases, except code generation operate on Core.
- Due to Andrew Tolmach's effort, there is also an <a
- href="http://www.haskell.org/ghc/docs/papers/core.ps.gz">external
- representation for Core.</a>
- <p>
- The conversion of the compiled module from <code>HsSyn</code> into that
- of <code>CoreSyn</code> is performed by a phase called the
- <em>desugarer</em>, which is located in
- <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/deSugar/"><code>fptools/ghc/compiler/deSugar/</code></a>.
- It's operation is detailed in the following.
- </p>
-
- <h2>Auxilliary Functions</h2>
- <p>
- The modules <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/deSugar/DsMonad.lhs"><code>DsMonad</code></a>
- defines the desugarer monad (of type <code>DsM</code>) which maintains
- the environment needed for desugaring. In particular, it encapsulates a
- unique supply for generating new variables, a map to lookup standard
- names (such as functions from the prelude), a source location for error
- messages, and a pool to collect warning messages generated during
- desugaring. Initialisation of the environment happens in the function <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/deSugar/Desugar.lhs"><code>Desugar</code></a><code>.desugar</code>,
- which is also the main entry point into the desugarer.
- <p>
- The generation of Core code often involves the use of standard functions
- for which proper identifiers (i.e., values of type <code>Id</code> that
- actually refer to the definition in the right Prelude) need to be
- obtained. This is supported by the function
- <code>DsMonad.dsLookupGlobalValue :: Name -> DsM Id</code>.
-
- <h2><a name="patmat">Pattern Matching</a></h2>
- <p>
- Nested pattern matching with guards and everything is translated into
- the simple, flat case expressions of Core by the following modules:
- <dl>
- <dt><a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/deSugar/Match.lhs"><code>Match</code></a>:
- <dd>This modules contains the main pattern-matching compiler in the form
- of a function called <code>match</code>. There is some documentation
- as to how <code>match</code> works contained in the module itself.
- Generally, the implemented algorithm is similar to the one described
- in Phil Wadler's Chapter ? of Simon Peyton Jones' <em>The
- Implementation of Functional Programming Languages</em>.
- <code>Match</code> exports a couple of functions with not really
- intuitive names. In particular, it exports <code>match</code>,
- <code>matchWrapper</code>, <code>matchExport</code>, and
- <code>matchSimply</code>. The function <code>match</code>, which is
- the main work horse, is only used by the other matching modules. The
- function <code>matchExport</code> - despite it's name - is merely used
- internally in <code>Match</code> and handles warning messages (see
- below for more details). The actual interface to the outside is
- <code>matchWrapper</code>, which converts the output of the type
- checker into the form needed by the pattern matching compiler (i.e., a
- list of <code>EquationInfo</code>). Similar in function to
- <code>matchWrapper</code> is <code>matchSimply</code>, which provides
- an interface for the case where a single expression is to be matched
- against a single pattern (as, for example, is the case in bindings in
- a <code>do</code> expression).
- <dt><a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/deSugar/MatchCon.lhs"><code>MatchCon</code></a>:
- <dd>This module generates code for a set of alternative constructor
- patterns that belong to a single type by means of the routine
- <code>matchConFamily</code>. More precisely, the routine gets a set
- of equations where the left-most pattern of each equation is a
- constructor pattern with a head symbol from the same type as that of
- all the other equations. A Core case expression is generated that
- distinguihes between all these constructors. The routine is clever
- enough to generate a sparse case expression and to add a catch-all
- default case only when needed (i.e., if the case expression isn't
- exhaustive already). There is also an explanation at the start of the
- modules.
- <dt><a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/deSugar/MatchLit.lhs"><code>MatchLit</code></a>:
- <dd>Generates code for a set of alternative literal patterns by means of
- the routine <code>matchLiterals</code>. The principle is similar to
- that of <code>matchConFamily</code>, but all left-most patterns are
- literals of the same type.
- <dt><a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/deSugar/DsUtils.lhs"><code>DsUtils</code></a>:
- <dd>This module provides a set of auxilliary definitions as well as the
- data types <code>EquationInfo</code> and <code>MatchResult</code> that
- form the input and output, respectively, of the pattern matching
- compiler.
- <dt><a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/deSugar/Check.lhs"><code>Check</code></a>:
- <dd>This module does not really contribute the compiling pattern
- matching, but it inspects sets of equations to find whether there are
- any overlapping patterns or non-exhaustive pattern sets. This task is
- implemented by the function <code>check</code>, which returns a list of
- patterns that are part of a non-exhaustive case distinction as well as a
- set of equation labels that can be reached during execution of the code;
- thus, the remaining equations are shadowed due to overlapping patterns.
- The function <code>check</code> is invoked and its result converted into
- suitable warning messages by the function <code>Match.matchExport</code>
- (which is a wrapper for <code>Match.match</code>).
- </dl>
- <p>
- The central function <code>match</code>, given a set of equations,
- proceeds in a number of steps:
- <ol>
- <li>It starts by desugaring the left-most pattern of each equation using
- the function <code>tidy1</code> (indirectly via
- <code>tidyEqnInfo</code>). During this process, non-elementary
- pattern (e.g., those using explicit list syntax <code>[x, y, ...,
- z]</code>) are converted to a standard constructor pattern and also
- irrefutable pattern are removed.
- <li>Then, a process called <em>unmixing</em> clusters the equations into
- blocks (without re-ordering them), such that the left-most pattern of
- all equations in a block are either all variables, all literals, or
- all constructors.
- <li>Each block is, then, compiled by <code>matchUnmixedEqns</code>,
- which forwards the handling of literal pattern blocks to
- <code>MatchLit.matchLiterals</code>, of constructor pattern blocks to
- <code>MatchCon.matchConFamily</code>, and hands variable pattern
- blocks back to <code>match</code>.
- </ol>
-
- <p><hr><small>
-<!-- hhmts start -->
-Last modified: Mon Feb 11 22:35:25 EST 2002
-<!-- hhmts end -->
- </small>
- </body>
-</html>
diff --git a/docs/comm/the-beast/driver.html b/docs/comm/the-beast/driver.html
deleted file mode 100644
index fbf65e33e7..0000000000
--- a/docs/comm/the-beast/driver.html
+++ /dev/null
@@ -1,179 +0,0 @@
-<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
-<html>
- <head>
- <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
- <title>The GHC Commentary - The Glorious Driver</title>
- </head>
-
- <body BGCOLOR="FFFFFF">
- <h1>The GHC Commentary - The Glorious Driver</h1>
- <p>
- The Glorious Driver (GD) is the part of GHC that orchestrates the
- interaction of all the other pieces that make up GHC. It supersedes the
- <em>Evil Driver (ED),</em> which was a Perl script that served the same
- purpose and was in use until version 4.08.1 of GHC. Simon Marlow
- eventually slayed the ED and instated the GD. The GD is usually called
- the <em>Compilation Manager</em> these days.
- </p>
- <p>
- The GD has been substantially extended for GHCi, i.e., the interactive
- variant of GHC that integrates the compiler with a (meta-circular)
- interpreter since version 5.00. Most of the driver is located in the
- directory
- <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/main/"><code>fptools/ghc/compiler/main/</code></a>.
- </p>
-
- <h2>Command Line Options</h2>
- <p>
- GHC's many flavours of command line options make the code interpreting
- them rather involved. The following provides a brief overview of the
- processing of these options. Since the addition of the interactive
- front-end to GHC, there are two kinds of options: <em>static
- options</em> and <em>dynamic options.</em> The former can only be set
- when the system is invoked, whereas the latter can be altered in the
- course of an interactive session. A brief explanation on the difference
- between these options and related matters is at the start of the module
- <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/main/CmdLineOpts.lhs"><code>CmdLineOpts</code></a>.
- The same module defines the enumeration <code>DynFlag</code>, which
- contains all dynamic flags. Moreover, there is the labelled record
- <code>DynFlags</code> that collects all the flag-related information
- that is passed by the compilation manager to the compiler proper,
- <code>hsc</code>, whenever a compilation is triggered. If you like to
- find out whether an option is static, use the predicate
- <code>isStaticHscFlag</code> in the same module.
- <p>
- The second module that contains a lot of code related to the management
- of flags is <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/main/DriverFlags.hs"><code>DriverFlags.hs</code></a>.
- In particular, the module contains two association lists that map the
- textual representation of the various flags to a data structure that
- tells the driver how to parse the flag (e.g., whether it has any
- arguments) and provides its internal representation. All static flags
- are contained in <code>static_flags</code>. A whole range of
- <code>-f</code> flags can be negated by adding a <code>-f-no-</code>
- prefix. These flags are contained in the association list
- <code>fFlags</code>.
- <p>
- The driver uses a nasty hack based on <code>IORef</code>s that permits
- the rest of the compiler to access static flags as CAFs; i.e., there is
- a family of toplevel variable definitions in
- <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/main/CmdLineOpts.lhs"><code>CmdLineOpts</code></a>,
- below the literate section heading <i>Static options</i>, each of which
- contains the value of one static option. This is essentially realised
- via global variables (in the sense of C-style, updatable, global
- variables) defined via an evil pre-processor macro named
- <code>GLOBAL_VAR</code>, which is defined in a particularly ugly corner
- of GHC, namely the C header file
- <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/HsVersions.h"><code>HsVersions.h</code></a>.
-
- <h2>What Happens When</h2>
- <p>
- Inside the Haskell compiler proper (<code>hsc</code>), a whole series of
- stages (``passes'') are executed in order to transform your Haskell program
- into C or native code. This process is orchestrated by
- <code>main/HscMain.hscMain</code> and its relative
- <code>hscReComp</code>. The latter directly invokes, in order,
- the parser, the renamer, the typechecker, the desugarer, the
- simplifier (Core2Core), the CoreTidy pass, the CorePrep pass,
- conversion to STG (CoreToStg), the interface generator
- (MkFinalIface), the code generator, and code output. The
- simplifier is the most complex of these, and is made up of many
- sub-passes. These are controlled by <code>buildCoreToDo</code>,
- as described below.
-
- <h2>Scheduling Optimisations Phases</h2>
- <p>
- GHC has a large variety of optimisations at its disposal, many of which
- have subtle interdependencies. The overall plan for program
- optimisation is fixed in <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/main/DriverState.hs"><code>DriverState.hs</code></a>.
- First of all, there is the variable <code>hsc_minusNoO_flags</code> that
- determines the <code>-f</code> options that you get without
- <code>-O</code> (aka optimisation level 0) as well as
- <code>hsc_minusO_flags</code> and <code>hsc_minusO2_flags</code> for
- <code>-O</code> and <code>-O2</code>.
- <p>
- However, most of the strategic decisions about optimisations on the
- intermediate language Core are encoded in the value produced by
- <code>buildCoreToDo</code>, which is a list with elements of type
- <code>CoreToDo</code>. Each element of this list specifies one step in
- the sequence of core optimisations executed by the <a
- href="simplifier.html">Mighty Simplifier</a>. The type
- <code>CoreToDo</code> is defined in <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/main/CmdLineOpts.lhs"><code>CmdLineOpts.lhs</code></a>.
- The actual execution of the optimisation plan produced by
- <code>buildCoreToDo</code> is performed by <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/simplCore/SimplCore.lhs"><code>SimpleCore</code></a><code>.doCorePasses</code>.
- Core optimisation plans consist of a number of simplification phases
- (currently, three for optimisation levels of 1 or higher) with
- decreasing phase numbers (the lowest, corresponding to the last phase,
- namely 0). Before and after these phases, optimisations such as
- specialisation, let floating, worker/wrapper, and so on are executed.
- The sequence of phases is such that the synergistic effect of the phases
- is maximised -- however, this is a fairly fragile arrangement.
- <p>
- There is a similar construction for optimisations on STG level stored in
- the variable <code>buildStgToDo :: [StgToDo]</code>. However, this is a
- lot less complex than the arrangement for Core optimisations.
-
- <h2>Linking the <code>RTS</code> and <code>libHSstd</code></h2>
- <p>
- Since the RTS and HSstd refer to each other, there is a Cunning
- Hack to avoid putting them each on the command-line twice or
- thrice (aside: try asking for `plaice and chips thrice' in a
- fish and chip shop; bet you only get two lots). The hack involves
- adding
- the symbols that the RTS needs from libHSstd, such as
- <code>PrelWeak_runFinalizzerBatch_closure</code> and
- <code>__stginit_Prelude</code>, to the link line with the
- <code>-u</code> flag. The standard library appears before the
- RTS on the link line, and these options cause the corresponding
- symbols to be picked up even so the linked might not have seen them
- being used as the RTS appears later on the link line. As a result,
- when the RTS is also scanned, these symbols are already resolved. This
- avoids the linker having to read the standard library and RTS
- multiple times.
- </p>
- <p>
- This does, however, leads to a complication. Normal Haskell
- programs do not have a <code>main()</code> function, so this is
- supplied by the RTS (in the file
- <a href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/rts/Main.c"><code>Main.c</code></a>).
- It calls <code>startupHaskell</code>, which
- itself calls <code>__stginit_PrelMain</code>, which is therefore,
- since it occurs in the standard library, one of the symbols
- passed to the linker using the <code>-u</code> option. This is fine
- for standalone Haskell programs, but as soon as the Haskell code is only
- used as part of a program implemented in a foreign language, the
- <code>main()</code> function of that foreign language should be used
- instead of that of the Haskell runtime. In this case, the previously
- described arrangement unfortunately fails as
- <code>__stginit_PrelMain</code> had better not be linked in,
- because it tries to call <code>__stginit_Main</code>, which won't
- exist. In other words, the RTS's <code>main()</code> refers to
- <code>__stginit_PrelMain</code> which in turn refers to
- <code>__stginit_Main</code>. Although the RTS's <code>main()</code>
- might not be linked in if the program provides its own, the driver
- will normally force <code>__stginit_PrelMain</code> to be linked in anyway,
- using <code>-u</code>, because it's a back-reference from the
- RTS to HSstd. This case is coped with by the <code>-no-hs-main</code>
- flag, which suppresses passing the corresonding <code>-u</code> option
- to the linker -- although in some versions of the compiler (e.g., 5.00.2)
- it didn't work. In addition, the driver generally places the C program
- providing the <code>main()</code> that we want to use before the RTS
- on the link line. Therefore, the RTS's main is never used and
- without the <code>-u</code> the label <code>__stginit_PrelMain</code>
- will not be linked.
- </p>
-
- <p><small>
-<!-- hhmts start -->
-Last modified: Tue Feb 19 11:09:00 UTC 2002
-<!-- hhmts end -->
- </small>
- </body>
-</html>
diff --git a/docs/comm/the-beast/fexport.html b/docs/comm/the-beast/fexport.html
deleted file mode 100644
index 956043bafb..0000000000
--- a/docs/comm/the-beast/fexport.html
+++ /dev/null
@@ -1,231 +0,0 @@
-<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
-<html>
- <head>
- <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
- <title>The GHC Commentary - foreign export</title>
- </head>
-
- <body BGCOLOR="FFFFFF">
- <h1>The GHC Commentary - foreign export</h1>
-
- The implementation scheme for foreign export, as of 27 Feb 02, is
- as follows. There are four cases, of which the first two are easy.
- <p>
- <b>(1) static export of an IO-typed function from some module <code>MMM</code></b>
- <p>
- <code>foreign export foo :: Int -> Int -> IO Int</code>
- <p>
- For this we generate no Haskell code. However, a C stub is
- generated, and it looks like this:
- <p>
- <pre>
-extern StgClosure* MMM_foo_closure;
-
-HsInt foo (HsInt a1, HsInt a2)
-{
- SchedulerStatus rc;
- HaskellObj ret;
- rc = rts_evalIO(
- rts_apply(rts_apply(MMM_foo_closure,rts_mkInt(a1)),
- rts_mkInt(a2)
- ),
- &ret
- );
- rts_checkSchedStatus("foo",rc);
- return(rts_getInt(ret));
-}
-</pre>
- <p>
- This does the obvious thing: builds in the heap the expression
- <code>(foo a1 a2)</code>, calls <code>rts_evalIO</code> to run it,
- and uses <code>rts_getInt</code> to fish out the result.
-
- <p>
- <b>(2) static export of a non-IO-typed function from some module <code>MMM</code></b>
- <p>
- <code>foreign export foo :: Int -> Int -> Int</code>
- <p>
- This is identical to case (1), with the sole difference that the
- stub calls <code>rts_eval</code> rather than
- <code>rts_evalIO</code>.
- <p>
-
- <b>(3) dynamic export of an IO-typed function from some module <code>MMM</code></b>
- <p>
- <code>foreign export mkCallback :: (Int -> Int -> IO Int) -> IO (FunPtr a)</code>
- <p>
- Dynamic exports are a whole lot more complicated than their static
- counterparts.
- <p>
- First of all, we get some Haskell code, which, when given a
- function <code>callMe :: (Int -> Int -> IO Int)</code> to be made
- C-callable, IO-returns a <code>FunPtr a</code>, which is the
- address of the resulting C-callable code. This address can now be
- handed out to the C-world, and callers to it will get routed
- through to <code>callMe</code>.
- <p>
- The generated Haskell function looks like this:
- <p>
-<pre>
-mkCallback f
- = do sp <- mkStablePtr f
- r <- ccall "createAdjustorThunk" sp (&"run_mkCallback")
- return r
-</pre>
- <p>
- <code>createAdjustorThunk</code> is a gruesome,
- architecture-specific function in the RTS. It takes a stable
- pointer to the Haskell function to be run, and the address of the
- associated C wrapper, and returns a piece of machine code,
- which, when called from the outside (C) world, eventually calls
- through to <code>f</code>.
- <p>
- This machine code fragment is called the "Adjustor Thunk" (don't
- ask me why). What it does is simply to call onwards to the C
- helper
- function <code>run_mkCallback</code>, passing all the args given
- to it but also conveying <code>sp</code>, which is a stable
- pointer
- to the Haskell function to run. So:
- <p>
-<pre>
-createAdjustorThunk ( StablePtr sp, CCodeAddress addr_of_helper_C_fn )
-{
- create malloc'd piece of machine code "mc", behaving thusly:
-
- mc ( args_to_mc )
- {
- jump to addr_of_helper_C_fn, passing sp as an additional
- argument
- }
-</pre>
- <p>
- This is a horrible hack, because there is no portable way, even at
- the machine code level, to function which adds one argument and
- then transfers onwards to another C function. On x86s args are
- pushed R to L onto the stack, so we can just push <code>sp</code>,
- fiddle around with return addresses, and jump onwards to the
- helper C function. However, on architectures which use register
- windows and/or pass args extensively in registers (Sparc, Alpha,
- MIPS, IA64), this scheme borders on the unviable. GHC has a
- limited <code>createAdjustorThunk</code> implementation for Sparc
- and Alpha, which handles only the cases where all args, including
- the extra one, fit in registers.
- <p>
- Anyway: the other lump of code generated as a result of a
- f-x-dynamic declaration is the C helper stub. This is basically
- the same as in the static case, except that it only ever gets
- called from the adjustor thunk, and therefore must accept
- as an extra argument, a stable pointer to the Haskell function
- to run, naturally enough, as this is not known until run-time.
- It then dereferences the stable pointer and does the call in
- the same way as the f-x-static case:
-<pre>
-HsInt Main_d1kv ( StgStablePtr the_stableptr,
- void* original_return_addr,
- HsInt a1, HsInt a2 )
-{
- SchedulerStatus rc;
- HaskellObj ret;
- rc = rts_evalIO(
- rts_apply(rts_apply((StgClosure*)deRefStablePtr(the_stableptr),
- rts_mkInt(a1)
- ),
- rts_mkInt(a2)
- ),
- &ret
- );
- rts_checkSchedStatus("Main_d1kv",rc);
- return(rts_getInt(ret));
-}
-</pre>
- <p>
- Note how this function has a purely made-up name
- <code>Main_d1kv</code>, since unlike the f-x-static case, this
- function is never called from user code, only from the adjustor
- thunk.
- <p>
- Note also how the function takes a bogus parameter
- <code>original_return_addr</code>, which is part of this extra-arg
- hack. The usual scheme is to leave the original caller's return
- address in place and merely push the stable pointer above that,
- hence the spare parameter.
- <p>
- Finally, there is some extra trickery, detailed in
- <code>ghc/rts/Adjustor.c</code>, to get round the following
- problem: the adjustor thunk lives in mallocville. It is
- quite possible that the Haskell code will actually
- call <code>free()</code> on the adjustor thunk used to get to it
- -- because otherwise there is no way to reclaim the space used
- by the adjustor thunk. That's all very well, but it means that
- the C helper cannot return to the adjustor thunk in the obvious
- way, since we've already given it back using <code>free()</code>.
- So we leave, on the C stack, the address of whoever called the
- adjustor thunk, and before calling the helper, mess with the stack
- such that when the helper returns, it returns directly to the
- adjustor thunk's caller.
- <p>
- That's how the <code>stdcall</code> convention works. If the
- adjustor thunk has been called using the <code>ccall</code>
- convention, we return indirectly, via a statically-allocated
- yet-another-magic-piece-of-code, which takes care of removing the
- extra argument that the adjustor thunk pushed onto the stack.
- This is needed because in <code>ccall</code>-world, it is the
- caller who removes args after the call, and the original caller of
- the adjustor thunk has no way to know about the extra arg pushed
- by the adjustor thunk.
- <p>
- You didn't really want to know all this stuff, did you?
- <p>
-
-
-
- <b>(4) dynamic export of an non-IO-typed function from some module <code>MMM</code></b>
- <p>
- <code>foreign export mkCallback :: (Int -> Int -> Int) -> IO (FunPtr a)</code>
- <p>
- (4) relates to (3) as (2) relates to (1), that is, it's identical,
- except the C stub uses <code>rts_eval</code> instead of
- <code>rts_evalIO</code>.
- <p>
-
-
- <h2>Some perspective on f-x-dynamic</h2>
-
- The only really horrible problem with f-x-dynamic is how the
- adjustor thunk should pass to the C helper the stable pointer to
- use. Ideally we would like this to be conveyed via some invisible
- side channel, since then the adjustor thunk could simply jump
- directly to the C helper, with no non-portable stack fiddling.
- <p>
- Unfortunately there is no obvious candidate for the invisible
- side-channel. We've chosen to pass it on the stack, with the
- bad consequences detailed above. Another possibility would be to
- park it in a global variable, but this is non-reentrant and
- non-(OS-)thread-safe. A third idea is to put it into a callee-saves
- register, but that has problems too: the C helper may not use that
- register and therefore we will have trashed any value placed there
- by the caller; and there is no C-level portable way to read from
- the register inside the C helper.
- <p>
- In short, we can't think of a really satisfactory solution. I'd
- vote for introducing some kind of OS-thread-local-state and passing
- it in there, but that introduces complications of its own.
- <p>
- <b>OS-thread-safety</b> is of concern in the C stubs, whilst
- building up the expressions to run. These need to have exclusive
- access to the heap whilst allocating in it. Also, there needs to
- be some guarantee that no GC will happen in between the
- <code>deRefStablePtr</code> call and when <code>rts_eval[IO]</code>
- starts running. At the moment there are no guarantees for
- either property. This needs to be sorted out before the
- implementation can be regarded as fully safe to use.
-
-<p><small>
-
-<!-- hhmts start -->
-Last modified: Weds 27 Feb 02
-<!-- hhmts end -->
- </small>
- </body>
-</html>
diff --git a/docs/comm/the-beast/ghci.html b/docs/comm/the-beast/ghci.html
deleted file mode 100644
index b893acdeb4..0000000000
--- a/docs/comm/the-beast/ghci.html
+++ /dev/null
@@ -1,407 +0,0 @@
-<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
-<html>
- <head>
- <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
- <title>The GHC Commentary - GHCi</title>
- </head>
-
- <body BGCOLOR="FFFFFF">
- <h1>The GHC Commentary - GHCi</h1>
-
- This isn't a coherent description of how GHCi works, sorry. What
- it is (currently) is a dumping ground for various bits of info
- pertaining to GHCi, which ought to be recorded somewhere.
-
- <h2>Debugging the interpreter</h2>
-
- The usual symptom is that some expression / program crashes when
- running on the interpreter (commonly), or gets wierd results
- (rarely). Unfortunately, finding out what the problem really is
- has proven to be extremely difficult. In retrospect it may be
- argued a design flaw that GHC's implementation of the STG
- execution mechanism provides only the weakest of support for
- automated internal consistency checks. This makes it hard to
- debug.
- <p>
- Execution failures in the interactive system can be due to
- problems with the bytecode interpreter, problems with the bytecode
- generator, or problems elsewhere. From the bugs seen so far,
- the bytecode generator is often the culprit, with the interpreter
- usually being correct.
- <p>
- Here are some tips for tracking down interactive nonsense:
- <ul>
- <li>Find the smallest source fragment which causes the problem.
- <p>
- <li>Using an RTS compiled with <code>-DDEBUG</code> (nb, that
- means the RTS from the previous stage!), run with <code>+RTS
- -D2</code> to get a listing in great detail from the
- interpreter. Note that the listing is so voluminous that
- this is impractical unless you have been diligent in
- the previous step.
- <p>
- <li>At least in principle, using the trace and a bit of GDB
- poking around at the time of death, you can figure out what
- the problem is. In practice you quickly get depressed at
- the hopelessness of ever making sense of the mass of
- details. Well, I do, anyway.
- <p>
- <li><code>+RTS -D2</code> tries hard to print useful
- descriptions of what's on the stack, and often succeeds.
- However, it has no way to map addresses to names in
- code/data loaded by our runtime linker. So the C function
- <code>ghci_enquire</code> is provided. Given an address, it
- searches the loaded symbol tables for symbols close to that
- address. You can run it from inside GDB:
- <pre>
- (gdb) p ghci_enquire ( 0x50a406f0 )
- 0x50a406f0 + -48 == `PrelBase_Czh_con_info'
- 0x50a406f0 + -12 == `PrelBase_Izh_static_info'
- 0x50a406f0 + -48 == `PrelBase_Czh_con_entry'
- 0x50a406f0 + -24 == `PrelBase_Izh_con_info'
- 0x50a406f0 + 16 == `PrelBase_ZC_con_entry'
- 0x50a406f0 + 0 == `PrelBase_ZMZN_static_entry'
- 0x50a406f0 + -36 == `PrelBase_Czh_static_entry'
- 0x50a406f0 + -24 == `PrelBase_Izh_con_entry'
- 0x50a406f0 + 64 == `PrelBase_EQ_static_info'
- 0x50a406f0 + 0 == `PrelBase_ZMZN_static_info'
- 0x50a406f0 + 48 == `PrelBase_LT_static_entry'
- $1 = void
- </pre>
- In this case the enquired-about address is
- <code>PrelBase_ZMZN_static_entry</code>. If no symbols are
- close to the given addr, nothing is printed. Not a great
- mechanism, but better than nothing.
- <p>
- <li>We have had various problems in the past due to the bytecode
- generator (<code>compiler/ghci/ByteCodeGen.lhs</code>) being
- confused about the true set of free variables of an
- expression. The compilation scheme for <code>let</code>s
- applies the BCO for the RHS of the let to its free
- variables, so if the free-var annotation is wrong or
- misleading, you end up with code which has wrong stack
- offsets, which is usually fatal.
- <p>
- <li>The baseline behaviour of the interpreter is to interpret
- BCOs, and hand all other closures back to the scheduler for
- evaluation. However, this causes a huge number of expensive
- context switches, so the interpreter knows how to enter the
- most common non-BCO closure types by itself.
- <p>
- These optimisations complicate the interpreter.
- If you think you have an interpreter problem, re-enable the
- define <code>REFERENCE_INTERPRETER</code> in
- <code>ghc/rts/Interpreter.c</code>. All optimisations are
- thereby disabled, giving the baseline
- I-only-know-how-to-enter-BCOs behaviour.
- <p>
- <li>Following the traces is often problematic because execution
- hops back and forth between the interpreter, which is
- traced, and compiled code, which you can't see.
- Particularly annoying is when the stack looks OK in the
- interpreter, then compiled code runs for a while, and later
- we arrive back in the interpreter, with the stack corrupted,
- and usually in a completely different place from where we
- left off.
- <p>
- If this is biting you baaaad, it may be worth copying
- sources for the compiled functions causing the problem, into
- your interpreted module, in the hope that you stay in the
- interpreter more of the time. Of course this doesn't work
- very well if you've defined
- <code>REFERENCE_INTERPRETER</code> in
- <code>ghc/rts/Interpreter.c</code>.
- <p>
- <li>There are various commented-out pieces of code in
- <code>Interpreter.c</code> which can be used to get the
- stack sanity-checked after every entry, and even after after
- every bytecode instruction executed. Note that some
- bytecodes (<code>PUSH_UBX</code>) leave the stack in
- an unwalkable state, so the <code>do_print_stack</code>
- local variable is used to suppress the stack walk after
- them.
- </ul>
-
-
- <h2>Useful stuff to know about the interpreter</h2>
-
- The code generation scheme is straightforward (naive, in fact).
- <code>-ddump-bcos</code> prints each BCO along with the Core it
- was generated from, which is very handy.
- <ul>
- <li>Simple lets are compiled in-line. For the general case, let
- v = E in ..., E is compiled into a new BCO which takes as
- args its free variables, and v is bound to AP(the new BCO,
- free vars of E).
- <p>
- <li><code>case</code>s as usual, become: push the return
- continuation, enter the scrutinee. There is some magic to
- make all combinations of compiled/interpreted calls and
- returns work, described below. In the interpreted case, all
- case alts are compiled into a single big return BCO, which
- commences with instructions implementing a switch tree.
- </ul>
- <p>
- <b>ARGCHECK magic</b>
- <p>
- You may find ARGCHECK instructions at the start of BCOs which
- don't appear to need them; case continuations in particular.
- These play an important role: they force objects which should
- evaluated to BCOs to actually be BCOs.
- <p>
- Typically, there may be an application node somewhere in the heap.
- This is a thunk which when leant on turns into a BCO for a return
- continuation. The thunk may get entered with an update frame on
- top of the stack. This is legitimate since from one viewpoint
- this is an AP which simply reduces to a data object, so does not
- have functional type. However, once the AP turns itself into a
- BCO (so to speak) we cannot simply enter the BCO, because that
- expects to see args on top of the stack, not an update frame.
- Therefore any BCO which expects something on the stack above an
- update frame, even non-function BCOs, start with an ARGCHECK. In
- this case it fails, the update is done, the update frame is
- removed, and the BCO re-entered. Subsequent entries of the BCO of
- course go unhindered.
- <p>
- The optimised (<code>#undef REFERENCE_INTERPRETER</code>) handles
- this case specially, so that a trip through the scheduler is
- avoided. When reading traces from <code>+RTS -D2 -RTS</code>, you
- may see BCOs which appear to execute their initial ARGCHECK insn
- twice. The first time it fails; the interpreter does the update
- immediately and re-enters with no further comment.
- <p>
- This is all a bit ugly, and, as SimonM correctly points out, it
- would have been cleaner to make BCOs unpointed (unthunkable)
- objects, so that a pointer to something <code>:: BCO#</code>
- really points directly at a BCO.
- <p>
- <b>Stack management</b>
- <p>
- There isn't any attempt to stub the stack, minimise its growth, or
- generally remove unused pointers ahead of time. This is really
- due to lazyness on my part, although it does have the minor
- advantage that doing something cleverer would almost certainly
- increase the number of bytecodes that would have to be executed.
- Of course we SLIDE out redundant stuff, to get the stack back to
- the sequel depth, before returning a HNF, but that's all. As
- usual this is probably a cause of major space leaks.
- <p>
- <b>Building constructors</b>
- <p>
- Constructors are built on the stack and then dumped into the heap
- with a single PACK instruction, which simply copies the top N
- words of the stack verbatim into the heap, adds an info table, and zaps N
- words from the stack. The constructor args are pushed onto the
- stack one at a time. One upshot of this is that unboxed values
- get pushed untaggedly onto the stack (via PUSH_UBX), because that's how they
- will be in the heap. That in turn means that the stack is not
- always walkable at arbitrary points in BCO execution, although
- naturally it is whenever GC might occur.
- <p>
- Function closures created by the interpreter use the AP-node
- (tagged) format, so although their fields are similarly
- constructed on the stack, there is never a stack walkability
- problem.
- <p>
- <b>Unpacking constructors</b>
- <p>
- At the start of a case continuation, the returned constructor is
- unpacked onto the stack, which means that unboxed fields have to
- be tagged. Rather than burdening all such continuations with a
- complex, general mechanism, I split it into two. The
- allegedly-common all-pointers case uses a single UNPACK insn
- to fish out all fields with no further ado. The slow case uses a
- sequence of more complex UPK_TAG insns, one for each field (I
- think). This seemed like a good compromise to me.
- <p>
- <b>Perspective</b>
- <p>
- I designed the bytecode mechanism with the experience of both STG
- hugs and Classic Hugs in mind. The latter has an small
- set of bytecodes, a small interpreter loop, and runs amazingly
- fast considering the cruddy code it has to interpret. The former
- had a large interpretative loop with many different opcodes,
- including multiple minor variants of the same thing, which
- made it difficult to optimise and maintain, yet it performed more
- or less comparably with Classic Hugs.
- <p>
- My design aims were therefore to minimise the interpreter's
- complexity whilst maximising performance. This means reducing the
- number of opcodes implemented, whilst reducing the number of insns
- despatched. In particular there are only two opcodes, PUSH_UBX
- and UPK_TAG, which deal with tags. STG Hugs had dozens of opcodes
- for dealing with tagged data. In cases where the common
- all-pointers case is significantly simpler (UNPACK) I deal with it
- specially. Finally, the number of insns executed is reduced a
- little by merging multiple pushes, giving PUSH_LL and PUSH_LLL.
- These opcode pairings were determined by using the opcode-pair
- frequency profiling stuff which is ifdef-d out in
- <code>Interpreter.c</code>. These significantly improve
- performance without having much effect on the uglyness or
- complexity of the interpreter.
- <p>
- Overall, the interpreter design is something which turned out
- well, and I was pleased with it. Unfortunately I cannot say the
- same of the bytecode generator.
-
- <h2><code>case</code> returns between interpreted and compiled code</h2>
-
- Variants of the following scheme have been drifting around in GHC
- RTS documentation for several years. Since what follows is
- actually what is implemented, I guess it supersedes all other
- documentation. Beware; the following may make your brain melt.
- In all the pictures below, the stack grows downwards.
- <p>
- <b>Returning to interpreted code</b>.
- <p>
- Interpreted returns employ a set of polymorphic return infotables.
- Each element in the set corresponds to one of the possible return
- registers (R1, D1, F1) that compiled code will place the returned
- value in. In fact this is a bit misleading, since R1 can be used
- to return either a pointer or an int, and we need to distinguish
- these cases. So, supposing the set of return registers is {R1p,
- R1n, D1, F1}, there would be four corresponding infotables,
- <code>stg_ctoi_ret_R1p_info</code>, etc. In the pictures below we
- call them <code>stg_ctoi_ret_REP_info</code>.
- <p>
- These return itbls are polymorphic, meaning that all 8 vectored
- return codes and the direct return code are identical.
- <p>
- Before the scrutinee is entered, the stack is arranged like this:
- <pre>
- | |
- +--------+
- | BCO | -------> the return contination BCO
- +--------+
- | itbl * | -------> stg_ctoi_ret_REP_info, with all 9 codes as follows:
- +--------+
- BCO* bco = Sp[1];
- push R1/F1/D1 depending on REP
- push bco
- yield to sched
- </pre>
- On entry, the interpreted contination BCO expects the stack to look
- like this:
- <pre>
- | |
- +--------+
- | BCO | -------> the return contination BCO
- +--------+
- | itbl * | -------> ret_REP_ctoi_info, with all 9 codes as follows:
- +--------+
- : VALUE : (the returned value, shown with : since it may occupy
- +--------+ multiple stack words)
- </pre>
- A machine code return will park the returned value in R1/F1/D1,
- and enter the itbl on the top of the stack. Since it's our magic
- itbl, this pushes the returned value onto the stack, which is
- where the interpreter expects to find it. It then pushes the BCO
- (again) and yields. The scheduler removes the BCO from the top,
- and enters it, so that the continuation is interpreted with the
- stack as shown above.
- <p>
- An interpreted return will create the value to return at the top
- of the stack. It then examines the return itbl, which must be
- immediately underneath the return value, to see if it is one of
- the magic <code>stg_ctoi_ret_REP_info</code> set. Since this is so,
- it knows it is returning to an interpreted contination. It
- therefore simply enters the BCO which it assumes it immediately
- underneath the itbl on the stack.
-
- <p>
- <b>Returning to compiled code</b>.
- <p>
- Before the scrutinee is entered, the stack is arranged like this:
- <pre>
- ptr to vec code 8 ------> return vector code 8
- | | ....
- +--------+ ptr to vec code 1 ------> return vector code 1
- | itbl * | -- Itbl end
- +--------+ \ ....
- \ Itbl start
- ----> direct return code
- </pre>
- The scrutinee value is then entered.
- The case continuation(s) expect the stack to look the same, with
- the returned HNF in a suitable return register, R1, D1, F1 etc.
- <p>
- A machine code return knows whether it is doing a vectored or
- direct return, and, if the former, which vector element it is.
- So, for a direct return we jump to <code>Sp[0]</code>, and for a
- vectored return, jump to <code>((CodePtr*)(Sp[0]))[ - ITBL_LENGTH
- - vector number ]</code>. This is (of course) the scheme that
- compiled code has been using all along.
- <p>
- An interpreted return will, as described just above, have examined
- the itbl immediately beneath the return value it has just pushed,
- and found it not to be one of the <code>ret_REP_ctoi_info</code> set,
- so it knows this must be a return to machine code. It needs to
- pop the return value, currently on the stack, into R1/F1/D1, and
- jump through the info table. Unfortunately the first part cannot
- be accomplished directly since we are not in Haskellised-C world.
- <p>
- We therefore employ a second family of magic infotables, indexed,
- like the first, on the return representation, and therefore with
- names of the form <code>stg_itoc_ret_REP_info</code>. (Note:
- <code>itoc</code>; the previous bunch were <code>ctoi</code>).
- This is pushed onto the stack (note, tagged values have their tag
- zapped), giving:
- <pre>
- | |
- +--------+
- | itbl * | -------> arbitrary machine code return itbl
- +--------+
- : VALUE : (the returned value, possibly multiple words)
- +--------+
- | itbl * | -------> stg_itoc_ret_REP_info, with code:
- +--------+
- pop myself (stg_itoc_ret_REP_info) off the stack
- pop return value into R1/D1/F1
- do standard machine code return to itbl at t.o.s.
- </pre>
- We then return to the scheduler, asking it to enter the itbl at
- t.o.s. When entered, <code>stg_itoc_ret_REP_info</code> removes
- itself from the stack, pops the return value into the relevant
- return register, and returns to the itbl to which we were trying
- to return in the first place.
- <p>
- Amazingly enough, this stuff all actually works! Well, mostly ...
- <p>
- <b>Unboxed tuples: a Right Royal Spanner In The Works</b>
- <p>
- The above scheme depends crucially on having magic infotables
- <code>stg_{itoc,ctoi}_ret_REP_info</code> for each return
- representation <code>REP</code>. It unfortunately fails miserably
- in the face of unboxed tuple returns, because the set of required
- tables would be infinite; this despite the fact that for any given
- unboxed tuple return type, the scheme could be made to work fine.
- <p>
- This is a serious problem, because it prevents interpreted
- code from doing <code>IO</code>-typed returns, since <code>IO
- t</code> is implemented as <code>(# t, RealWorld# #)</code> or
- thereabouts. This restriction in turn rules out FFI stuff in the
- interpreter. Not good.
- <p>
- Although we have no way to make general unboxed tuples work, we
- can at least make <code>IO</code>-types work using the following
- ultra-kludgey observation: <code>RealWorld#</code> doesn't really
- exist and so has zero size, in compiled code. In turn this means
- that a type of the form <code>(# t, RealWorld# #)</code> has the
- same representation as plain <code>t</code> does. So the bytecode
- generator, whilst rejecting code with general unboxed tuple
- returns, recognises and accepts this special case. Which means
- that <code>IO</code>-typed stuff works in the interpreter. Just.
- <p>
- If anyone asks, I will claim I was out of radio contact, on a
- 6-month walking holiday to the south pole, at the time this was
- ... er ... dreamt up.
-
-
-<p><small>
-
-<!-- hhmts start -->
-Last modified: Thursday February 7 15:33:49 GMT 2002
-<!-- hhmts end -->
- </small>
- </body>
-</html>
diff --git a/docs/comm/the-beast/main.html b/docs/comm/the-beast/main.html
deleted file mode 100644
index 332ffaa501..0000000000
--- a/docs/comm/the-beast/main.html
+++ /dev/null
@@ -1,35 +0,0 @@
-<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
-<html>
- <head>
- <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
- <title>The GHC Commentary - Compiling and running the Main module</title>
- </head>
-
- <body BGCOLOR="FFFFFF">
- <h1>Compiling and running the Main module</h1>
-
-GHC allows you to determine which module contains the "main" function, and
-what that function is called, via the <code>-fmain-is</code> flag. The trouble is
-that the runtime system is fixed, so what symbol should it link to?
-<p>
-The current solution is this. Suppose the main function is <code>Foo.run</code>.
-<ul>
-<li>
-Then, when compiling module <code>Foo</code>, GHC adds an extra definition:
-<pre>
- :Main.main = runIO Foo.run
-</pre>
-Now the RTS can invoke <code>:Main.main</code> to start the program. (This extra
-definition is inserted in TcRnDriver.checkMain.)
-<p><li>
-Before starting the program, though, the RTS also initialises the module tree
-by calling <code>init_:Main</code>, so when compiling the main module (Foo in this case),
-as well as generating <code>init_Foo</code> as usual, GHC also generates
-<pre>
- init_zcMain() { init_Foo; }
-</pre>
-This extra initialisation code is generated in CodeGen.mkModuleInit.
-</ul>
-
- </body>
-</html>
diff --git a/docs/comm/the-beast/mangler.html b/docs/comm/the-beast/mangler.html
deleted file mode 100644
index 1ad80f0d5c..0000000000
--- a/docs/comm/the-beast/mangler.html
+++ /dev/null
@@ -1,79 +0,0 @@
-<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
-<html>
- <head>
- <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
- <title>The GHC Commentary - The Evil Mangler</title>
- </head>
-
- <body BGCOLOR="FFFFFF">
- <h1>The GHC Commentary - The Evil Mangler</h1>
- <p>
- The Evil Mangler (EM) is a Perl script invoked by the <a
- href="driver.html">Glorious Driver</a> after the C compiler (gcc) has
- translated the GHC-produced C code into assembly. Consequently, it is
- only of interest if <code>-fvia-C</code> is in effect (either explicitly
- or implicitly).
-
- <h4>Its purpose</h4>
- <p>
- The EM reads the assembly produced by gcc and re-arranges code blocks as
- well as nukes instructions that it considers <em>non-essential.</em> It
- derives it evilness from its utterly ad hoc, machine, compiler, and
- whatnot dependent design and implementation. More precisely, the EM
- performs the following tasks:
- <ul>
- <li>The code executed when a closure is entered is moved adjacent to
- that closure's infotable. Moreover, the order of the info table
- entries is reversed. Also, SRT pointers are removed from closures that
- don't need them (non-FUN, RET and THUNK ones).
- <li>Function prologue and epilogue code is removed. (GHC generated code
- manages its own stack and uses the system stack only for return
- addresses and during calls to C code.)
- <li>Certain code patterns are replaced by simpler code (eg, loads of
- fast entry points followed by indirect jumps are replaced by direct
- jumps to the fast entry point).
- </ul>
-
- <h4>Implementation</h4>
- <p>
- The EM is located in the Perl script <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/driver/mangler/ghc-asm.lprl"><code>ghc-asm.lprl</code></a>.
- The script reads the <code>.s</code> file and chops it up into
- <em>chunks</em> (that's how they are actually called in the script) that
- roughly correspond to basic blocks. Each chunk is annotated with an
- educated guess about what kind of code it contains (e.g., infotable,
- fast entry point, slow entry point, etc.). The annotations also contain
- the symbol introducing the chunk of assembly and whether that chunk has
- already been processed or not.
- <p>
- The parsing of the input into chunks as well as recognising assembly
- instructions that are to be removed or altered is based on a large
- number of Perl regular expressions sprinkled over the whole code. These
- expressions are rather fragile as they heavily rely on the structure of
- the generated code - in fact, they even rely on the right amount of
- white space and thus on the formatting of the assembly.
- <p>
- Afterwards, the chunks are reordered, some of them purged, and some
- stripped of some useless instructions. Moreover, some instructions are
- manipulated (eg, loads of fast entry points followed by indirect jumps
- are replaced by direct jumps to the fast entry point).
- <p>
- The EM knows which part of the code belongs to function prologues and
- epilogues as <a href="../rts-libs/stgc.html">STG C</a> adds tags of the
- form <code>--- BEGIN ---</code> and <code>--- END ---</code> the
- assembler just before and after the code proper of a function starts.
- It adds these tags using gcc's <code>__asm__</code> feature.
- <p>
- <strong>Update:</strong> Gcc 2.96 upwards performs more aggressive basic
- block re-ordering and dead code elimination. This seems to make the
- whole <code>--- END ---</code> tag business redundant -- in fact, if
- proper code is generated, no <code>--- END ---</code> tags survive gcc
- optimiser.
-
- <p><small>
-<!-- hhmts start -->
-Last modified: Sun Feb 17 17:55:47 EST 2002
-<!-- hhmts end -->
- </small>
- </body>
-</html>
diff --git a/docs/comm/the-beast/modules.html b/docs/comm/the-beast/modules.html
deleted file mode 100644
index a6655a68a7..0000000000
--- a/docs/comm/the-beast/modules.html
+++ /dev/null
@@ -1,80 +0,0 @@
-<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
-<html>
- <head>
- <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
- <title>The GHC Commentary - Modules, ModuleNames and Packages</title>
- </head>
-
- <body BGCOLOR="FFFFFF">
- <h1>Modules, ModuleNames and Packages</h1>
-
- <p>This section describes the datatypes <code>ModuleName</code>
- <code>Module</code> and <code>PackageName</code> all available
- from the module <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/basicTypes/Module.lhs"><code>Module</code></a>.<p>
-
- <h2>Packages</h2>
-
- <p>A package is a collection of (zero or more) Haskell modules,
- together with some information about external libraries, extra C
- compiler options, and other things that this collection of modules
- requires. When using DLLs on windows (or shared libraries on a
- Unix system; currently unsupported), a package can consist of only
- a single shared library of Haskell code; the reason for this is
- described below.
-
- <p>Packages are further described in the User's Guide <a
- href="http://www.haskell.org/ghc/docs/latest/packages.html">here</a>.
-
- <h2>The ModuleName type</h2>
-
- <p>At the bottom of the hierarchy is a <code>ModuleName</code>,
- which, as its name suggests, is simply the name of a module. It
- is represented as a Z-encoded FastString, and is an instance of
- <code>Uniquable</code> so we can build <code>FiniteMap</code>s
- with <code>ModuleName</code>s as the keys.
-
- <p>A <code>ModuleName</code> can be built from a
- <code>String</code>, using the <code>mkModuleName</code> function.
-
- <h2>The Module type</h2>
-
- <p>For a given module, the compiler also needs to know whether the
- module is in the <em>home package</em>, or in another package.
- This distinction is important for two reasons:
-
- <ul>
- <li><p>When generating code to call a function in another package,
- the compiler might have to generate a cross-DLL call, which is
- different from an intra-DLL call (hence the restriction that the
- code in a package can only reside in a single DLL).
-
- <li><p>We avoid putting version information in an interface file
- for entities defined in another package, on the grounds that other
- packages are generally "stable". This also helps keep the size of
- interface files down.
- </ul>
-
- <p>The <code>Module</code> type contains a <code>ModuleName</code>
- and a <code>PackageInfo</code> field. The
- <code>PackageInfo</code> indicates whether the given
- <code>Module</code> comes from the current package or from another
- package.
-
- <p>To get the actual package in which a given module resides, you
- have to read the interface file for that module, which contains
- the package name (actually the value of the
- <code>-package-name</code> flag when that module was built). This
- information is currently unused inside the compiler, but we might
- make use of it in the future, especially with the advent of
- hierarchical modules, to allow the compiler to automatically
- figure out which packages a program should be linked with, and
- thus avoid the need to specify <code>-package</code> options on
- the command line.
-
- <p><code>Module</code>s are also instances of
- <code>Uniquable</code>, and indeed the unique of a
- <code>Module</code> is the same as the unique of the underlying
- <code>ModuleName</code>.
- </body>
-</html>
diff --git a/docs/comm/the-beast/names.html b/docs/comm/the-beast/names.html
deleted file mode 100644
index 061fae3ebf..0000000000
--- a/docs/comm/the-beast/names.html
+++ /dev/null
@@ -1,169 +0,0 @@
-<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
-<html>
- <head>
- <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
- <title>The GHC Commentary - The truth about names: OccNames, and Names</title>
- </head>
-
- <body BGCOLOR="FFFFFF">
- <h1>The GHC Commentary - The truth about names: OccNames, and Names</h1>
- <p>
- Every entity (type constructor, class, identifier, type variable) has a
- <code>Name</code>. The <code>Name</code> type is pervasive in GHC, and
- is defined in <code>basicTypes/Name.lhs</code>. Here is what a Name
- looks like, though it is private to the Name module.
- </p>
- <blockquote>
- <pre>
-data Name = Name {
- n_sort :: NameSort, -- What sort of name it is
- n_occ :: !OccName, -- Its occurrence name
- n_uniq :: Unique, -- Its identity
- n_loc :: !SrcLoc -- Definition site
- }</pre>
- </blockquote>
- <ul>
- <li> The <code>n_sort</code> field says what sort of name this is: see
- <a href="#sort">NameSort below</a>.
- <li> The <code>n_occ</code> field gives the "occurrence name" of the
- Name; see
- <a href="#occname">OccName below</a>.
- <li> The <code>n_uniq</code> field allows fast tests for equality of
- Names.
- <li> The <code>n_loc</code> field gives some indication of where the
- name was bound.
- </ul>
-
- <h2><a name="sort">The <code>NameSort</code> of a <code>Name</code></a></h2>
- <p>
- There are four flavours of <code>Name</code>:
- </p>
- <blockquote>
- <pre>
-data NameSort
- = External Module (Maybe Name)
- -- (Just parent) => this Name is a subordinate name of 'parent'
- -- e.g. data constructor of a data type, method of a class
- -- Nothing => not a subordinate
-
- | WiredIn Module (Maybe Name) TyThing BuiltInSyntax
- -- A variant of External, for wired-in things
-
- | Internal -- A user-defined Id or TyVar
- -- defined in the module being compiled
-
- | System -- A system-defined Id or TyVar. Typically the
- -- OccName is very uninformative (like 's')</pre>
- </blockquote>
- <ul>
- <li>Here are the sorts of Name an entity can have:
- <ul>
- <li> Class, TyCon: External.
- <li> Id: External, Internal, or System.
- <li> TyVar: Internal, or System.
- </ul>
- </li>
- <li>An <code>External</code> name has a globally-unique
- (module name, occurrence name) pair, namely the
- <em>original name</em> of the entity,
- describing where the thing was originally defined. So for example,
- if we have
- <blockquote>
- <pre>
-module M where
- f = e1
- g = e2
-
-module A where
- import qualified M as Q
- import M
- a = Q.f + g</pre>
- </blockquote>
- <p>
- then the RdrNames for "a", "Q.f" and "g" get replaced (by the
- Renamer) by the Names "A.a", "M.f", and "M.g" respectively.
- </p>
- </li>
- <li>An <code>InternalName</code>
- has only an occurrence name. Distinct InternalNames may have the same
- occurrence name; use the Unique to distinguish them.
- </li>
- <li>An <code>ExternalName</code> has a unique that never changes. It
- is never cloned. This is important, because the simplifier invents
- new names pretty freely, but we don't want to lose the connnection
- with the type environment (constructed earlier). An
- <code>InternalName</code> name can be cloned freely.
- </li>
- <li><strong>Before CoreTidy</strong>: the Ids that were defined at top
- level in the original source program get <code>ExternalNames</code>,
- whereas extra top-level bindings generated (say) by the type checker
- get <code>InternalNames</code>. q This distinction is occasionally
- useful for filtering diagnostic output; e.g. for -ddump-types.
- </li>
- <li><strong>After CoreTidy</strong>: An Id with an
- <code>ExternalName</code> will generate symbols that
- appear as external symbols in the object file. An Id with an
- <code>InternalName</code> cannot be referenced from outside the
- module, and so generates a local symbol in the object file. The
- CoreTidy pass makes the decision about which names should be External
- and which Internal.
- </li>
- <li>A <code>System</code> name is for the most part the same as an
- <code>Internal</code>. Indeed, the differences are purely cosmetic:
- <ul>
- <li>Internal names usually come from some name the
- user wrote, whereas a System name has an OccName like "a", or "t".
- Usually there are masses of System names with the same OccName but
- different uniques, whereas typically there are only a handful of
- distince Internal names with the same OccName.
- </li>
- <li>Another difference is that when unifying the type checker tries
- to unify away type variables with System names, leaving ones with
- Internal names (to improve error messages).
- </li>
- </ul>
- </li>
- </ul>
-
- <h2><a name="occname">Occurrence names: <code>OccName</code></a></h2>
- <p>
- An <code>OccName</code> is more-or-less just a string, like "foo" or
- "Tree", giving the (unqualified) name of an entity.
- </p>
- <p>
- Well, not quite just a string, because in Haskell a name like "C" could
- mean a type constructor or data constructor, depending on context. So
- GHC defines a type <tt>OccName</tt> (defined in
- <tt>basicTypes/OccName.lhs</tt>) that is a pair of a <tt>FastString</tt>
- and a <tt>NameSpace</tt> indicating which name space the name is drawn
- from:
- <blockquote>
- <pre>
-data OccName = OccName NameSpace EncodedFS</pre>
- </blockquote>
- <p>
- The <tt>EncodedFS</tt> is a synonym for <tt>FastString</tt> indicating
- that the string is Z-encoded. (Details in <tt>OccName.lhs</tt>.)
- Z-encoding encodes funny characters like '%' and '$' into alphabetic
- characters, like "zp" and "zd", so that they can be used in object-file
- symbol tables without confusing linkers and suchlike.
- </p>
- <p>
- The name spaces are:
- </p>
- <ul>
- <li> <tt>VarName</tt>: ordinary variables</li>
- <li> <tt>TvName</tt>: type variables</li>
- <li> <tt>DataName</tt>: data constructors</li>
- <li> <tt>TcClsName</tt>: type constructors and classes (in Haskell they
- share a name space) </li>
- </ul>
-
- <small>
-<!-- hhmts start -->
-Last modified: Wed May 4 14:57:55 EST 2005
-<!-- hhmts end -->
- </small>
- </body>
-</html>
-
diff --git a/docs/comm/the-beast/ncg.html b/docs/comm/the-beast/ncg.html
deleted file mode 100644
index 84beac2d51..0000000000
--- a/docs/comm/the-beast/ncg.html
+++ /dev/null
@@ -1,749 +0,0 @@
-<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
-<html>
- <head>
- <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
- <title>The GHC Commentary - The Native Code Generator</title>
- </head>
-
- <body BGCOLOR="FFFFFF">
- <h1>The GHC Commentary - The Native Code Generator</h1>
- <p>
- On some platforms (currently x86 and PowerPC, with bitrotted
- support for Sparc and Alpha), GHC can generate assembly code
- directly, without having to go via C. This can sometimes almost
- halve compilation time, and avoids the fragility and
- horribleness of the <a href="mangler.html">mangler</a>. The NCG
- is enabled by default for
- non-optimising compilation on supported platforms. For most programs
- it generates code which runs only 1-3% slower
- (depending on platform and type of code) than that
- created by gcc on x86s, so it is well worth using even with
- optimised compilation. FP-intensive x86 programs see a bigger
- slowdown, and all Sparc code runs about 5% slower due to
- us not filling branch delay slots.
- <p>
- The NCG has always been something of a second-class citizen
- inside GHC, an unloved child, rather. This means that its
- integration into the compiler as a whole is rather clumsy, which
- brings some problems described below. That apart, the NCG
- proper is fairly cleanly designed, as target-independent as it
- reasonably can be, and so should not be difficult to retarget.
- <p>
- <b>NOTE!</b> The native code generator was largely rewritten as part
- of the C-- backend changes, around May 2004. Unfortunately the
- rest of this document still refers to the old version, and was written
- with relation to the CVS head as of end-Jan 2002. Some of it is relevant,
- some of it isn't.
-
- <h2>Overview</h2>
- The top-level code generator fn is
- <p>
- <code>absCtoNat :: AbstractC -> UniqSM (SDoc, Pretty.Doc)</code>
- <p>
- The returned <code>SDoc</code> is for debugging, so is empty unless
- you specify <code>-ddump-stix</code>. The <code>Pretty.Doc</code>
- bit is the final assembly code. Translation involves three main
- phases, the first and third of which are target-independent.
- <ul>
- <li><b>Translation into the <code>Stix</code> representation.</b> Stix
- is a simple tree-like RTL-style language, in which you can
- mention:
- <p>
- <ul>
- <li>An infinite number of temporary, virtual registers.
- <li>The STG "magic" registers (<code>MagicId</code>), such as
- the heap and stack pointers.
- <li>Literals and low-level machine ops (<code>MachOp</code>).
- <li>Simple address computations.
- <li>Reads and writes of: memory, virtual regs, and various STG
- regs.
- <li>Labels and <code>if ... goto ...</code> style control-flow.
- </ul>
- <p>
- Stix has two main associated types:
- <p>
- <ul>
- <li><code>StixStmt</code> -- trees executed for their side
- effects: assignments, control transfers, and auxiliary junk
- such as segment changes and literal data.
- <li><code>StixExpr</code> -- trees which denote a value.
- </ul>
- <p>
- Translation into Stix is almost completely
- target-independent. Needed dependencies are knowledge of
- word size and endianness, used when generating code to do
- deal with half-word fields in info tables. This could be
- abstracted out easily enough. Also, the Stix translation
- needs to know which <code>MagicId</code>s map to registers
- on the given target, and which are stored in offsets from
- <code>BaseReg</code>.
- <p>
- After initial Stix generation, the trees are cleaned up with
- constant-folding and a little copy-propagation ("Stix
- inlining", as the code misleadingly calls it). We take
- the opportunity to translate <code>MagicId</code>s which are
- stored in memory on the given target, into suitable memory
- references. Those which are stored in registers are left
- alone. There is also a half-hearted attempt to lift literal
- strings to the top level in cases where nested strings have
- been observed to give incorrect code in the past.
- <p>
- Primitive machine-level operations will already be phrased in
- terms of <code>MachOp</code>s in the presented Abstract C, and
- these are passed through unchanged. We comment only that the
- <code>MachOp</code>s have been chosen so as to be easy to
- implement on all targets, and their meaning is intended to be
- unambiguous, and the same on all targets, regardless of word
- size or endianness.
- <p>
- <b>A note on <code>MagicId</code>s.</b>
- Those which are assigned to
- registers on the current target are left unmodified. Those
- which are not are stored in memory as offsets from
- <code>BaseReg</code> (which is assumed to permanently have the
- value <code>(&MainCapability.r)</code>), so the constant folder
- calculates the offsets and inserts suitable loads/stores. One
- complication is that not all archs have <code>BaseReg</code>
- itself in a register, so for those (sparc), we instead
- generate the address as an offset from the static symbol
- <code>MainCapability</code>, since the register table lives in
- there.
- <p>
- Finally, <code>BaseReg</code> does occasionally itself get
- mentioned in Stix expression trees, and in this case what is
- denoted is precisely <code>(&MainCapability.r)</code>, not, as
- in all other cases, the value of memory at some offset from
- the start of the register table. Since what it denotes is an
- r-value and not an l-value, assigning <code>BaseReg</code> is
- meaningless, so the machinery checks to ensure this never
- happens. All these details are taken into account by the
- constant folder.
- <p>
- <li><b>Instruction selection.</b> This is the only majorly
- target-specific phase. It turns Stix statements and
- expressions into sequences of <code>Instr</code>, a data
- type which is different for each architecture.
- <code>Instr</code>, unsurprisingly, has various supporting
- types, such as <code>Reg</code>, <code>Operand</code>,
- <code>Imm</code>, etc. The generated instructions may refer
- to specific machine registers, or to arbitrary virtual
- registers, either those created within the instruction
- selector, or those mentioned in the Stix passed to it.
- <p>
- The instruction selectors live in <code>MachCode.lhs</code>.
- The core functions, for each target, are:
- <p>
- <code>
- getAmode :: StixExpr -> NatM Amode
- <br>getRegister :: StixExpr -> NatM Register
- <br>assignMem_IntCode :: PrimRep -> StixExpr -> StixExpr -> NatM InstrBlock
- <br>assignReg_IntCode :: PrimRep -> StixReg -> StixExpr -> NatM InstrBlock
- </code>
- <p>
- The insn selectors use the "maximal munch" algorithm. The
- bizarrely-misnamed <code>getRegister</code> translates
- expressions. A simplified version of its type is:
- <p>
- <code>getRegister :: StixExpr -> NatM (OrdList Instr, Reg)</code>
- <p>
- That is: it (monadically) turns a <code>StixExpr</code> into a
- sequence of instructions, and a register, with the meaning
- that after executing the (possibly empty) sequence of
- instructions, the (possibly virtual) register will
- hold the resulting value. The real situation is complicated
- by the presence of fixed registers, and is detailed below.
- <p>
- Maximal munch is a greedy algorithm and is known not to give
- globally optimal code sequences, but it is good enough, and
- fast and simple. Early incarnations of the NCG used something
- more sophisticated, but that is long gone now.
- <p>
- Similarly, <code>getAmode</code> translates a value, intended
- to denote an address, into a sequence of insns leading up to
- a (processor-specific) addressing mode. This stuff could be
- done using the general <code>getRegister</code> selector, but
- would necessarily generate poorer code, because the calculated
- address would be forced into a register, which might be
- unnecessary if it could partially or wholly be calculated
- using an addressing mode.
- <p>
- Finally, <code>assignMem_IntCode</code> and
- <code>assignReg_IntCode</code> create instruction sequences to
- calculate a value and store it in the given register, or at
- the given address. Because these guys translate a statement,
- not a value, they just return a sequence of insns and no
- associated register. Floating-point and 64-bit integer
- assignments have analogous selectors.
- <p>
- Apart from the complexities of fixed vs floating registers,
- discussed below, the instruction selector is as simple
- as it can be. It looks long and scary but detailed
- examination reveals it to be fairly straightforward.
- <p>
- <li><b>Register allocation.</b> The register allocator,
- <code>AsmRegAlloc.lhs</code> takes sequences of
- <code>Instr</code>s which mention a mixture of real and
- virtual registers, and returns a modified sequence referring
- only to real ones. It is gloriously and entirely
- target-independent. Well, not exactly true. Instead it
- regards <code>Instr</code> (instructions) and <code>Reg</code>
- (virtual and real registers) as abstract types, to which it has
- the following interface:
- <p>
- <code>
- insnFuture :: Instr -> InsnFuture
- <br>regUsage :: Instr -> RegUsage
- <br>patchRegs :: Instr -> (Reg -> Reg) -> Instr
- </code>
- <p>
- <code>insnFuture</code> is used to (re)construct the graph of
- all possible control transfers between the insns to be
- allocated. <code>regUsage</code> returns the sets of registers
- read and written by an instruction. And
- <code>patchRegs</code> is used to apply the allocator's final
- decision on virtual-to-real reg mapping to an instruction.
- <p>
- Clearly these 3 fns have to be written anew for each
- architecture. They are defined in
- <code>RegAllocInfo.lhs</code>. Think twice, no, thrice,
- before modifying them: making false claims about insn
- behaviour will lead to hard-to-find register allocation
- errors.
- <p>
- <code>AsmRegAlloc.lhs</code> contains detailed comments about
- how the allocator works. Here is a summary. The head honcho
- <p>
- <code>allocUsingTheseRegs :: [Instr] -> [Reg] -> (Bool, [Instr])</code>
- <p>
- takes a list of instructions and a list of real registers
- available for allocation, and maps as many of the virtual regs
- in the input into real ones as it can. The returned
- <code>Bool</code> indicates whether or not it was
- successful. If so, that's the end of it. If not, the caller
- of <code>allocUsingTheseRegs</code> will attempt spilling.
- More of that later. What <code>allocUsingTheseRegs</code>
- does is:
- <p>
- <ul>
- <li>Implicitly number each instruction by its position in the
- input list.
- <p>
- <li>Using <code>insnFuture</code>, create the set of all flow
- edges -- possible control transfers -- within this set of
- insns.
- <p>
- <li>Using <code>regUsage</code> and iterating around the flow
- graph from the previous step, calculate, for each virtual
- register, the set of flow edges on which it is live.
- <p>
- <li>Make a real-register committment map, which gives the set
- of edges for which each real register is committed (in
- use). These sets are initially empty. For each virtual
- register, attempt to find a real register whose current
- committment does not intersect that of the virtual
- register -- ie, is uncommitted on all edges that the
- virtual reg is live. If successful, this means the vreg
- can be assigned to the realreg, so add the vreg's set to
- the realreg's committment.
- <p>
- <li>If all the vregs were assigned to a realreg, use
- <code>patchInstr</code> to apply the mapping to the insns themselves.
- </ul>
- <p>
- <b>Spilling</b>
- <p>
- If <code>allocUsingTheseRegs</code> fails, a baroque
- mechanism comes into play. We now know that much simpler
- schemes are available to do the same thing and give better
- results.
- Anyways:
- <p>
- The logic above <code>allocUsingTheseRegs</code>, in
- <code>doGeneralAlloc</code> and <code>runRegAllocate</code>,
- observe that allocation has failed with some set R of real
- registers. So they apply <code>runRegAllocate</code> a second
- time to the code, but remove (typically) two registers from R
- before doing so. This naturally fails too, but returns a
- partially-allocated sequence. <code>doGeneralAlloc</code>
- then inserts spill code into the sequence, and finally re-runs
- <code>allocUsingTheseRegs</code>, but supplying the original,
- unadulterated R. This is guaranteed to succeed since the two
- registers previously removed from R are sufficient to allocate
- all the spill/restore instructions added.
- <p>
- Because x86 is very short of registers, and in the worst case
- needs three removed from R, a softly-softly approach is used.
- <code>doGeneralAlloc</code> first tries with zero regs removed
- from R, then if that fails one, then two, etc. This means
- <code>allocUsingTheseRegs</code> may get run several times
- before a successful arrangement is arrived at.
- <code>findReservedRegs</code> cooks up the sets of spill
- registers to try with.
- <p>
- The resulting machinery is complicated and the generated spill
- code is appalling. The saving grace is that spills are very
- rare so it doesn't matter much. I did not invent this -- I inherited it.
- <p>
- <b>Dealing with common cases fast</b>
- <p>
- The entire reg-alloc mechanism described so far is general and
- correct, but expensive overkill for many simple code blocks.
- So to begin with we use
- <code>doSimpleAlloc</code>, which attempts to do something
- simple. It exploits the observation that if the total number
- of virtual registers does not exceed the number of real ones
- available, we can simply dole out a new realreg each time we
- see mention of a new vreg, with no regard for control flow.
- <code>doSimpleAlloc</code> therefore attempts this in a
- single pass over the code. It gives up if it runs out of real
- regs or sees any condition which renders the above observation
- invalid (fixed reg uses, for example).
- <p>
- This clever hack handles the majority of code blocks quickly.
- It was copied from the previous reg-allocator (the
- Mattson/Partain/Marlow/Gill one).
- </ul>
-
-<p>
-<h2>Complications, observations, and possible improvements</h2>
-
-<h3>Real vs virtual registers in the instruction selectors</h3>
-
-The instruction selectors for expression trees, namely
-<code>getRegister</code>, are complicated by the fact that some
-expressions can only be computed into a specific register, whereas
-the majority can be computed into any register. We take x86 as an
-example, but the problem applies to all archs.
-<p>
-Terminology: <em>rreg</em> means real register, a real machine
-register. <em>vreg</em> means one of an infinite set of virtual
-registers. The type <code>Reg</code> is the sum of <em>rreg</em> and
-<em>vreg</em>. The instruction selector generates sequences with
-unconstrained use of vregs, leaving the register allocator to map them
-all into rregs.
-<p>
-Now, where was I ? Oh yes. We return to the type of
-<code>getRegister</code>, which despite its name, selects instructions
-to compute the value of an expression tree.
-<pre>
- getRegister :: StixExpr -> NatM Register
-
- data Register
- = Fixed PrimRep Reg InstrBlock
- | Any PrimRep (Reg -> InstrBlock)
-
- type InstrBlock -- sequence of instructions
-</pre>
-At first this looks eminently reasonable (apart from the stupid
-name). <code>getRegister</code>, and nobody else, knows whether or
-not a given expression has to be computed into a fixed rreg or can be
-computed into any rreg or vreg. In the first case, it returns
-<code>Fixed</code> and indicates which rreg the result is in. In the
-second case it defers committing to any specific target register by
-returning a function from <code>Reg</code> to <code>InstrBlock</code>,
-and the caller can specify the target reg as it sees fit.
-<p>
-Unfortunately, that forces <code>getRegister</code>'s callers (usually
-itself) to use a clumsy and confusing idiom in the common case where
-they do not care what register the result winds up in. The reason is
-that although a value might be computed into a fixed rreg, we are
-forbidden (on pain of segmentation fault :) from subsequently
-modifying the fixed reg. This and other rules are record in "Rules of
-the game" inside <code>MachCode.lhs</code>.
-<p>
-Why can't fixed registers be modified post-hoc? Consider a simple
-expression like <code>Hp+1</code>. Since the heap pointer
-<code>Hp</code> is definitely in a fixed register, call it R,
-<code>getRegister</code> on subterm <code>Hp</code> will simply return
-<code>Fixed</code> with an empty sequence and R. But we can't just
-emit an increment instruction for R, because that trashes
-<code>Hp</code>; instead we first have to copy it into a fresh vreg
-and increment that.
-<p>
-With all that in mind, consider now writing a <code>getRegister</code>
-clause for terms of the form <code>(1 + E)</code>. Contrived, yes,
-but illustrates the matter. First we do
-<code>getRegister</code> on E. Now we are forced to examine what
-comes back.
-<pre>
- getRegister (OnePlus e)
- = getRegister e `thenNat` \ e_result ->
- case e_result of
- Fixed e_code e_fixed
- -> returnNat (Any IntRep (\dst -> e_code ++ [MOV e_fixed dst, INC dst]))
- Any e_any
- -> Any (\dst -> e_any dst ++ [INC dst])
-</pre>
-This seems unreasonably cumbersome, yet the instruction selector is
-full of such idioms. A good example of the complexities induced by
-this scheme is shown by <code>trivialCode</code> for x86 in
-<code>MachCode.lhs</code>. This deals with general integer dyadic
-operations on x86 and has numerous cases. It was difficult to get
-right.
-<p>
-An alternative suggestion is to simplify the type of
-<code>getRegister</code> to this:
-<pre>
- getRegister :: StixExpr -> NatM (InstrBloc, VReg)
- type VReg = .... a vreg ...
-</pre>
-and then we could safely write
-<pre>
- getRegister (OnePlus e)
- = getRegister e `thenNat` \ (e_code, e_vreg) ->
- returnNat (e_code ++ [INC e_vreg], e_vreg)
-</pre>
-which is about as straightforward as you could hope for.
-Unfortunately, it requires <code>getRegister</code> to insert moves of
-values which naturally compute into an rreg, into a vreg. Consider:
-<pre>
- 1 + ccall some-C-fn
-</pre>
-On x86 the ccall result is returned in rreg <code>%eax</code>. The
-resulting sequence, prior to register allocation, would be:
-<pre>
- # push args
- call some-C-fn
- # move %esp to nuke args
- movl %eax, %vreg
- incl %vreg
-</pre>
-If, as is likely, <code>%eax</code> is not held live beyond this point
-for any other purpose, the move into a fresh register is pointless;
-we'd have been better off leaving the value in <code>%eax</code> as
-long as possible.
-<p>
-The simplified <code>getRegister</code> story is attractive. It would
-clean up the instruction selectors significantly and make it simpler
-to write new ones. The only drawback is that it generates redundant
-register moves. I suggest that eliminating these should be the job
-of the register allocator. Indeed:
-<ul>
-<li>There has been some work on this already ("Iterated register
- coalescing" ?), so this isn't a new idea.
-<p>
-<li>You could argue that the existing scheme inappropriately blurs the
- boundary between the instruction selector and the register
- allocator. The instruction selector should .. well .. just
- select instructions, without having to futz around worrying about
- what kind of registers subtrees get generated into. Register
- allocation should be <em>entirely</em> the domain of the register
- allocator, with the proviso that it should endeavour to allocate
- registers so as to minimise the number of non-redundant reg-reg
- moves in the final output.
-</ul>
-
-
-<h3>Selecting insns for 64-bit values/loads/stores on 32-bit platforms</h3>
-
-Note that this stuff doesn't apply on 64-bit archs, since the
-<code>getRegister</code> mechanism applies there.
-
-The relevant functions are:
-<pre>
- assignMem_I64Code :: StixExpr -> StixExpr -> NatM InstrBlock
- assignReg_I64Code :: StixReg -> StixExpr -> NatM InstrBlock
- iselExpr64 :: StixExpr -> NatM ChildCode64
-
- data ChildCode64 -- a.k.a "Register64"
- = ChildCode64
- InstrBlock -- code
- VRegUnique -- unique for the lower 32-bit temporary
-</pre>
-<code>iselExpr64</code> is the 64-bit, plausibly-named analogue of
-<code>getRegister</code>, and <code>ChildCode64</code> is the analogue
-of <code>Register</code>. The aim here was to generate working 64
-bit code as simply as possible. To this end, I used the
-simplified <code>getRegister</code> scheme described above, in which
-<code>iselExpr64</code>generates its results into two vregs which
-can always safely be modified afterwards.
-<p>
-Virtual registers are, unsurprisingly, distinguished by their
-<code>Unique</code>s. There is a small difficulty in how to
-know what the vreg for the upper 32 bits of a value is, given the vreg
-for the lower 32 bits. The simple solution adopted is to say that
-any low-32 vreg may also have a hi-32 counterpart which shares the
-same unique, but is otherwise regarded as a separate entity.
-<code>getHiVRegFromLo</code> gets one from the other.
-<pre>
- data VRegUnique
- = VRegUniqueLo Unique -- lower part of a split quantity
- | VRegUniqueHi Unique -- upper part thereof
-</pre>
-Apart from that, 64-bit code generation is really simple. The sparc
-and x86 versions are almost copy-n-pastes of each other, with minor
-adjustments for endianness. The generated code isn't wonderful but
-is certainly acceptable, and it works.
-
-
-
-<h3>Shortcomings and inefficiencies in the register allocator</h3>
-
-<h4>Redundant reconstruction of the control flow graph</h4>
-
-The allocator goes to considerable computational expense to construct
-all the flow edges in the group of instructions it's allocating for,
-by using the <code>insnFuture</code> function in the
-<code>Instr</code> pseudo-abstract type.
-<p>
-This is really silly, because all that information is present at the
-abstract C stage, but is thrown away in the translation to Stix.
-So a good thing to do is to modify that translation to
-produce a directed graph of Stix straight-line code blocks,
-and to preserve that structure through the insn selector, so the
-allocator can see it.
-<p>
-This would eliminate the fragile, hacky, arch-specific
-<code>insnFuture</code> mechanism, and probably make the whole
-compiler run measurably faster. Register allocation is a fair chunk
-of the time of non-optimising compilation (10% or more), and
-reconstructing the flow graph is an expensive part of reg-alloc.
-It would probably accelerate the vreg liveness computation too.
-
-<h4>Really ridiculous method for doing spilling</h4>
-
-This is a more ambitious suggestion, but ... reg-alloc should be
-reimplemented, using the scheme described in "Quality and speed in
-linear-scan register allocation." (Traub?) For straight-line code
-blocks, this gives an elegant one-pass algorithm for assigning
-registers and creating the minimal necessary spill code, without the
-need for reserving spill registers ahead of time.
-<p>
-I tried it in Rigr, replacing the previous spiller which used the
-current GHC scheme described above, and it cut the number of spill
-loads and stores by a factor of eight. Not to mention being simpler,
-easier to understand and very fast.
-<p>
-The Traub paper also describes how to extend their method to multiple
-basic blocks, which will be needed for GHC. It comes down to
-reconciling multiple vreg-to-rreg mappings at points where control
-flow merges.
-
-<h4>Redundant-move support for revised instruction selector suggestion</h4>
-
-As mentioned above, simplifying the instruction selector will require
-the register allocator to try and allocate source and destination
-vregs to the same rreg in reg-reg moves, so as to make as many as
-possible go away. Without that, the revised insn selector would
-generate worse code than at present. I know this stuff has been done
-but know nothing about it. The Linear-scan reg-alloc paper mentioned
-above does indeed mention a bit about it in the context of single
-basic blocks, but I don't know if that's sufficient.
-
-
-
-<h3>x86 arcana that you should know about</h3>
-
-The main difficulty with x86 is that many instructions have fixed
-register constraints, which can occasionally make reg-alloc fail
-completely. And the FPU doesn't have the flat register model which
-the reg-alloc abstraction (implicitly) assumes.
-<p>
-Our strategy is: do a good job for the common small subset, that is
-integer loads, stores, address calculations, basic ALU ops (+, -,
-and, or, xor), and jumps. That covers the vast majority of
-executed insns. And indeed we do a good job, with a loss of
-less than 2% compared with gcc.
-<p>
-Initially we tried to handle integer instructions with awkward
-register constraints (mul, div, shifts by non-constant amounts) via
-various jigglings of the spiller et al. This never worked robustly,
-and putting platform-specific tweaks in the generic infrastructure is
-a big No-No. (Not quite true; shifts by a non-constant amount are
-still done by a giant kludge, and should be moved into this new
-framework.)
-<p>
-Fortunately, all such insns are rare. So the current scheme is to
-pretend that they don't have any such constraints. This fiction is
-carried all the way through the register allocator. When the insn
-finally comes to be printed, we emit a sequence which copies the
-operands through memory (<code>%esp</code>-relative), satisfying the
-constraints of the real instruction. This localises the gruesomeness
-to just one place. Here, for example, is the code generated for
-integer divison of <code>%esi</code> by <code>%ecx</code>:
-<pre>
- # BEGIN IQUOT %ecx, %esi
- pushl $0
- pushl %eax
- pushl %edx
- pushl %ecx
- movl %esi,% eax
- cltd
- idivl 0(%esp)
- movl %eax, 12(%esp)
- popl %edx
- popl %edx
- popl %eax
- popl %esi
- # END IQUOT %ecx, %esi
-</pre>
-This is not quite as appalling as it seems, if you consider that the
-division itself typically takes 16+ cycles, whereas the rest of the
-insns probably go through in about 1 cycle each.
-<p>
-This trick is taken to extremes for FP operations.
-<p>
-All notions of the x86 FP stack and its insns have been removed.
-Instead, we pretend, to the instruction selector and register
-allocator, that x86 has six floating point registers,
-<code>%fake0</code> .. <code>%fake5</code>, which can be used in the
-usual flat manner. We further claim that x86 has floating point
-instructions very similar to SPARC and Alpha, that is, a simple
-3-operand register-register arrangement. Code generation and register
-allocation proceed on this basis.
-<p>
-When we come to print out the final assembly, our convenient fiction
-is converted to dismal reality. Each fake instruction is
-independently converted to a series of real x86 instructions.
-<code>%fake0</code> .. <code>%fake5</code> are mapped to
-<code>%st(0)</code> .. <code>%st(5)</code>. To do reg-reg arithmetic
-operations, the two operands are pushed onto the top of the FP stack,
-the operation done, and the result copied back into the relevant
-register. When one of the operands is also the destination, we emit a
-slightly less scummy translation. There are only six
-<code>%fake</code> registers because 2 are needed for the translation,
-and x86 has 8 in total.
-<p>
-The translation is inefficient but is simple and it works. A cleverer
-translation would handle a sequence of insns, simulating the FP stack
-contents, would not impose a fixed mapping from <code>%fake</code> to
-<code>%st</code> regs, and hopefully could avoid most of the redundant
-reg-reg moves of the current translation.
-<p>
-There are, however, two unforeseen bad side effects:
-<ul>
-<li>This doesn't work properly, because it doesn't observe the normal
- conventions for x86 FP code generation. It turns out that each of
- the 8 elements in the x86 FP register stack has a tag bit which
- indicates whether or not that register is notionally in use or
- not. If you do a FPU operation which happens to read a
- tagged-as-empty register, you get an x87 FPU (stack invalid)
- exception, which is normally handled by the FPU without passing it
- to the OS: the program keeps going, but the resulting FP values
- are garbage. The OS can ask for the FPU to pass it FP
- stack-invalid exceptions, but it usually doesn't.
- <p>
- Anyways: inside NCG created x86 FP code this all works fine.
- However, the NCG's fiction of a flat register set does not operate
- the x87 register stack in the required stack-like way. When
- control returns to a gcc-generated world, the stack tag bits soon
- cause stack exceptions, and thus garbage results.
- <p>
- The only fix I could think of -- and it is horrible -- is to clear
- all the tag bits just before the next STG-level entry, in chunks
- of code which use FP insns. <code>i386_insert_ffrees</code>
- inserts the relevant <code>ffree</code> insns into such code
- blocks. It depends critically on <code>is_G_instr</code> to
- detect such blocks.
-<p>
-<li>It's very difficult to read the generated assembly and
- reason about it when debugging, because there's so much clutter.
- We print the fake insns as comments in the output, and that helps
- a bit.
-</ul>
-
-
-
-<h3>Generating code for ccalls</h3>
-
-For reasons I don't really understand, the instruction selectors for
-generating calls to C (<code>genCCall</code>) have proven surprisingly
-difficult to get right, and soaked up a lot of debugging time. As a
-result, I have once again opted for schemes which are simple and not
-too difficult to argue as correct, even if they don't generate
-excellent code.
-<p>
-The sparc ccall generator in particular forces all arguments into
-temporary virtual registers before moving them to the final
-out-registers (<code>%o0</code> .. <code>%o5</code>). This creates
-some unnecessary reg-reg moves. The reason is explained in a
-comment in the code.
-
-
-<h3>Duplicate implementation for many STG macros</h3>
-
-This has been discussed at length already. It has caused a couple of
-nasty bugs due to subtle untracked divergence in the macro
-translations. The macro-expander really should be pushed up into the
-Abstract C phase, so the problem can't happen.
-<p>
-Doing so would have the added benefit that the NCG could be used to
-compile more "ways" -- well, at least the 'p' profiling way.
-
-
-<h3>How to debug the NCG without losing your sanity/hair/cool</h3>
-
-Last, but definitely not least ...
-<p>
-The usual syndrome is that some program, when compiled via C, works,
-but not when compiled via the NCG. Usually the problem is fairly
-simple to fix, once you find the specific code block which has been
-mistranslated. But the latter can be nearly impossible, since most
-modules generate at least hundreds and often thousands of them.
-<p>
-My solution: cheat.
-<p>
-Because the via-C and native routes diverge only late in the day,
-it is not difficult to construct a 1-1 correspondence between basic
-blocks on the two routes. So, if the program works via C but not on
-the NCG, do the following:
-<ul>
-<li>Recompile <code>AsmCodeGen.lhs</code> in the afflicted compiler
- with <code>-DDEBUG_NCG</code>, so that it inserts
- <code>___ncg_debug_marker</code>s
- into the assembly it emits.
-<p>
-<li>Using a binary search on modules, find the module which is causing
- the problem.
-<p>
-<li>Compile that module to assembly code, with identical flags, twice,
- once via C and once via NCG.
- Call the outputs <code>ModuleName.s-gcc</code> and
- <code>ModuleName.s-nat</code>. Check that the latter does indeed have
- <code>___ncg_debug_marker</code>s in it; otherwise the next steps fail.
-<p>
-<li>Build (with a working compiler) the program
- <code>fptools/ghc/utils/debugNCG/diff_gcc_nat</code>.
-<p>
-<li>Run: <code>diff_gcc_nat ModuleName.s</code>. This will
- construct the 1-1 correspondence, and emits on stdout
- a cppable assembly output. Place this in a file -- I always
- call it <code>synth.S</code>. Note, the capital S is important;
- otherwise it won't get cpp'd. You can feed this file directly to
- ghc and it will automatically get cpp'd; you don't have to do so
- yourself.
-<p>
-<li>By messing with the <code>#define</code>s at the top of
- <code>synth.S</code>, do a binary search to find the incorrect
- block. Keep a careful record of where you are in the search; it
- is easy to get confused. Remember also that multiple blocks may
- be wrong, which also confuses matters. Finally, I usually start
- off by re-checking that I can build the executable with all the
- <code>#define</code>s set to 0 and then all to 1. This ensures
- you won't get halfway through the search and then get stuck due to
- some snafu with gcc-specific literals. Usually I set
- <code>UNMATCHED_GCC</code> to 1 all the time, and this bit should
- contain only literal data.
- <code>UNMATCHED_NAT</code> should be empty.
-</ul>
-<p>
-<code>diff_gcc_nat</code> was known to work correctly last time I used
-it, in December 01, for both x86 and sparc. If it doesn't work, due
-to changes in assembly syntax, or whatever, make it work. The
-investment is well worth it. Searching for the incorrect block(s) any
-other way is a total time waster.
-
-
-
-</ul>
-
-
-
-
- <p><small>
-<!-- hhmts start -->
-Last modified: Mon Aug 19 11:41:43 CEST 2013
-<!-- hhmts end -->
- </small>
- </body>
-</html>
diff --git a/docs/comm/the-beast/optimistic.html b/docs/comm/the-beast/optimistic.html
deleted file mode 100644
index 4d158022e8..0000000000
--- a/docs/comm/the-beast/optimistic.html
+++ /dev/null
@@ -1,65 +0,0 @@
-<h2> Architectural stuff </h2>
-
-New fields in the TSO:
-<ul>
-<li> New global speculation-depth register; always counts the number of specuation frames
-on the stack; incremented when
-starting speculation, decremented when finishing.
-<li> Profiling stuff
-</ul>
-
-
-<h2> Speculation frames </h2>
-
-The info table for a speculation frame points to the static spec-depth configuration
-for that speculation point. (Points to, because the config is mutable, and the info
-table has to be adjacent to the (immutable) code.)
-
-
-
-<h2> Abortion</h2>
-
-Abortion is modelled by a special asynchronous exception ThreadAbort.
-
-<ul>
-<li> In the scheduler, if a thread returns with ThreadBlocked, and non-zero SpecDepth, send it
-an asynchronous exception.
-
-<li> In the implementation of the <tt>catch#</tt> primop, raise an asynchonous exception if
-SpecDepth is nonzero.
-
-<li> Timeout, administered by scheduler. Current story: abort if a speculation frame lasts from
-one minor GC to the next. We detect this by seeing if there's a profiling frame on the stack --- a
-profiling frame is added at a minor GC in place of a speculation frame (see Online Profiling).
-</ul>
-
-
-When tearing frames off the stack, we start a new chunk at every speculation frame, as well as every
-update frame. We proceed down to the deepest speculation frame.
-<p>
-The <tt>AP_STACK</tt> closure built for a speculation frame must be careful <em>not</em> to enter the
-next <tt>AP_STACK</tt> closure up, because that would re-enter a possible loop.
-<p>
-Delivering an asynch exception to a thread that is speculating. Invariant: there can be no catch frames
-inside speculation (we abort in <tt>catch#</tt> when speculating. So the asynch exception just
-tears off frames in the standard way until it gets to a catch frame, just as it would usually do.
-<p>
-Abortion can punish one or more of the speculation frames by decrementing their static config variables.
-
-<h3>Synchronous exceptions</h3>
-
-Synchronous exceptions are treated similarly as before. The stack is discarded up to an update frame; the
-thunk to be updated is overwritten with "raise x", and the process continues. Until a catch frame.
-<p>
-When we find a spec frame, we allocate a "raise x" object, and resume execution with the return address
-in the spec frame. In that way the spec frame is like a catch frame; it stops the unwinding process.
-<p>
-It's essential that every hard failure is caught, else speculation is unsafe. In particular, divide by zero
-is hard to catch using OS support, so we test explicitly in library code. You can shoot yourself in the foot
-by writing <tt>x `div#` 0</tt>, side-stepping the test.
-
-
-<h3> Online profiling </h3>
-
-Sampling can be more frequent than minor GC (by jiggling the end-of-block code) but cannot
-be less frequent, because GC doesn't expect to see profiling frames. \ No newline at end of file
diff --git a/docs/comm/the-beast/prelude.html b/docs/comm/the-beast/prelude.html
deleted file mode 100644
index 64b607def5..0000000000
--- a/docs/comm/the-beast/prelude.html
+++ /dev/null
@@ -1,207 +0,0 @@
-<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
-<html>
- <head>
- <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
- <title>The GHC Commentary - Primitives and the Prelude</title>
- </head>
-
- <body BGCOLOR="FFFFFF">
- <h1>The GHC Commentary - Primitives and the Prelude</h1>
- <p>
- One of the trickiest aspects of GHC is the delicate interplay
- between what knowledge is baked into the compiler, and what
- knowledge it gets by reading the interface files of library
- modules. In general, the less that is baked in, the better.
-<p>
- Most of what the compiler has to have wired in about primitives and
- prelude definitions is in
- <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/prelude/"><code>fptools/ghc/compiler/prelude/</code></a>.
- </p>
-
-GHC recognises these main classes of baked-in-ness:
-<dl>
-<dt><strong>Primitive types.</strong>
-<dd>Primitive types cannot be defined in Haskell, and are utterly baked into the compiler.
-They are notionally defined in the fictional module <tt>GHC.Prim</tt>. The <tt>TyCon</tt>s for these types are all defined
-in module <tt>TysPrim</tt>; for example,
-<pre>
- intPrimTyCon :: TyCon
- intPrimTyCon = ....
-</pre>
-Examples:
-<tt>Int#, Float#, Addr#, State#</tt>.
-<p>
-<dt><strong>Wired-in types.</strong>
-<dd>Wired-in types can be defined in Haskell, and indeed are (many are defined in </tt>GHC.Base</tt>).
-However, it's very convenient for GHC to be able to use the type constructor for (say) <tt>Int</tt>
-without looking it up in any environment. So module <tt>TysWiredIn</tt> contains many definitions
-like this one:
-<pre>
- intTyCon :: TyCon
- intTyCon = ....
-
- intDataCon :: DataCon
- intDataCon = ....
-</pre>
-However, since a <tt>TyCon</tt> value contains the entire type definition inside it, it follows
-that the complete definition of <tt>Int</tt> is thereby baked into the compiler.
-<p>
-Nevertheless, the library module <tt>GHC.Base</tt> still contains a definition for <tt>Int</tt>
-just so that its info table etc get generated somewhere. Chaos will result if the wired-in definition
-in <tt>TysWiredIn</tt> differs from that in <tt>GHC.Base</tt>.
-<p>
-The rule is that only very simple types should be wired in (for example, <tt>Ratio</tt> is not,
-and <tt>IO</tt> is certainly not). No class is wired in: classes are just too complicated.
-<p>
-Examples: <tt>Int</tt>, <tt>Float</tt>, <tt>List</tt>, tuples.
-
-<p>
-<dt><strong>Known-key things.</strong>
-<dd>GHC knows of the existence of many, many other types, classes and values. <em>But all it knows is
-their <tt>Name</tt>.</em> Remember, a <tt>Name</tt> includes a unique key that identifies the
-thing, plus its defining module and occurrence name
-(see <a href="names.html">The truth about Names</a>). Knowing a <tt>Name</tt>, therefore, GHC can
-run off to the interface file for the module and find out everything else it might need.
-<p>
-Most of these known-key names are defined in module <tt>PrelNames</tt>; a further swathe concerning
-Template Haskell are defined in <tt>DsMeta</tt>. The allocation of unique keys is done manually;
-chaotic things happen if you make a mistake here, which is why they are all together.
-</dl>
-
-All the <tt>Name</tt>s from all the above categories are used to initialise the global name cache,
-which maps (module,occurrence-name) pairs to the globally-unique <tt>Name</tt> for that
-thing. (See <tt>HscMain.initOrigNames</tt>.)
-
-<p>
-The next sections elaborate these three classes a bit.
-
-
- <h2>Primitives (module <tt>TysPrim</tt>)</h2>
- <p>
- Some types and functions have to be hardwired into the compiler as they
- are atomic; all other code is essentially built around this primitive
- functionality. This includes basic arithmetic types, such as integers,
- and their elementary operations as well as pointer types. Primitive
- types and functions often receive special treatment in the code
- generator, which means that these entities have to be explicitly
- represented in the compiler. Moreover, many of these types receive some
- explicit treatment in the runtime system, and so, there is some further
- information about <a href="../rts-libs/primitives.html">primitives in
- the RTS section</a> of this document.
- <p>
- The module <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/prelude/TysPrim.lhs"><code>TysPrim</code></a>
- exports a list of all primitive type constructors as <code>primTyCons ::
- [TyCon]</code>. All of these type constructors (of type
- <code>TyCon</code>) are also exported as <code>intPrimTyCon</code>,
- <code>stablePtrPrimTyCon</code>, and so on. In addition, for each
- nullary type constructor the corresponding type (of type
- <code>Type</code>) is also exported; for example, we have
- <code>intPrimTy :: Type</code>. For all other type constructors, a
- function is exported that constructs the type obtained by applying the
- type constructors to an argument type (of type <code>Type</code>); for
- example, we have <code>mkStablePtrPrimTy :: Type -> Type</code>.
- <p>
- As it is inconvenient to identify type that receive a special treatment
- by the code generator by looking at their name, the module <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/prelude/PrimRep.lhs"><code>PrimRep</code></a>
- exports a data type <code>PrimRep</code>, which lists all
- machine-manipulable implementation types. The module also exports a set
- of query functions on <code>PrimRep</code> that define properties, such
- as a type's byte size or whether a primitive type is a pointer type.
- Moreover, the function <code>TysPrim.primRepTyCon :: PrimRep ->
- TyCon</code> converts <code>PrimRep</code> values into the corresponding
- type constructor.
-
- <h2>Wired in types (module <tt>TysWiredIn</tt>)</h2>
- <p>
- In addition to entities that are primitive, as the compiler has to treat
- them specially in the backend, there is a set of types, functions,
- etc. that the Haskell language definition flags as essential to the
- language by placing them into the special module <code>Prelude</code>
- that is implicitly imported into each Haskell module. For some of these
- entities it suffices to define them (by standard Haskell definitions) in
- a <code>Prelude</code> module and ensuring that this module is treated
- specially by being always imported .
- <p>
- However, there is a set of entities (such as, for example, the list type
- and the corresponding data constructors) that have an inbetween status:
- They are not truly primitive (lists, for example, can easily be defined
- by a <code>data</code> declaration), but the compiler has to have extra
- knowledge about them, as they are associated with some particular
- features of the language (in the case of lists, there is special syntax,
- such as list comprehensions, associated with the type). Another
- example, for a special kind of entity are type classes that can be used
- in a <code>deriving</code> clause. All types that are not-primitive,
- but about which the compiler nonetheless has to have some extra
- knowledge are defined in the module <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/prelude/TysWiredIn.lhs"><code>TysWiredIn</code></a>.
- <p>
- All wired in type constructors are contained in <code>wiredInTyCons ::
- [TyCon]</code>. In addition to that list, <code>TysWiredIn</code>
- exports variables bound to representations of all listed type
- constructors and their data constructors. So, for example, we have
- <code>listTyCon</code> together with <code>nilDataCon</cons> and
- </code>consDataCon</code>. There are also convenience functions, such
- as <code>mkListTy</code> and <code>mkTupleTy</code>, which construct
- compound types.
- <p>
-
- <h2>Known-key names (module <tt>PrelNames</tt>)</h2>
-
- All names of types, functions, etc. known to the compiler are defined in
- <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/prelude/PrelNames.lhs"><code>PrelNames</code></a>.
- This includes the names of types and functions exported from
- <code>TysWiredIn</code>, but also others. In particular, this module
- also fixes the names of all prelude modules; i.e., of the modules whose
- name starts with <code>Prel</code>, which GHC's library uses to bring
- some structure into the quite large number of <code>Prelude</code>
- definitions.
- <p>
- <code>PrelNames.knownKeyNames :: [Name]</code> contains all names known
- to the compiler, but the elements of the list are also exported
- individually as variables, such as <code>floatTyConName</code> (having
- the lexeme <code>Float</code>) and <code>floatDataConName</code> (having
- the lexeme <code>F#</code>). For each of these names,
- <code>PrelNames</code> derfines a unique key with a definition, such as
- <p>
-<blockquote><pre>
-floatPrimTyConKey = mkPreludeTyConUnique 11</pre>
-</blockquote>
- <p>
- that is, all unique keys for known prelude names are hardcoded into
- <code>PrelNames</code> (and uniqueness has to be manually ensured in
- that module). To simplify matching the types of important groups of
- type constructors, <code>PrelNames</code> also exports lists, such as
- <code>numericTyKeys</code> (keys of all numeric types), that contain the
- unique keys of all names in that group. In addition, derivable type
- classes and their structure is defined by
- <code>derivableClassKeys</code> and related definitions.
- <p>
- In addition to names that have unique keys, <code>PrelNames</code> also
- defines a set of names without uniqueness information. These names end
- on the suffix <code>_RDR</code> and are of type <code>RdrName</code> (an
- example, is <code>times_RDR</code>, which represents the lexeme
- <code>*</code>). The names are used in locations where they pass
- through the renamer anyway (e.g., special constructors encountered by
- the parser, such as [], and code generated from deriving clauses), which
- will take care of adding uniqueness information.
- <p>
-
-<h2>Gathering it all together (module <tt>PrelInfo</tt>)</h2>
- The module
- <a href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/prelude/PrelInfo.lhs"><code>PrelInfo</code></a>
- in some sense ties all the above together and provides a reasonably
- restricted interface to these definition to the rest of the compiler.
- However, from what I have seen, this doesn't quite work out and the
- earlier mentioned modules are directly imported in many places.
-
- <p><small>
-<!-- hhmts start -->
-Last modified: Tue Dec 11 17:54:07 EST 2001
-<!-- hhmts end -->
- </small>
- </body>
-</html>
diff --git a/docs/comm/the-beast/renamer.html b/docs/comm/the-beast/renamer.html
deleted file mode 100644
index 878e82b370..0000000000
--- a/docs/comm/the-beast/renamer.html
+++ /dev/null
@@ -1,249 +0,0 @@
-<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
-<html>
- <head>
- <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
- <title>The GHC Commentary - The Glorious Renamer</title>
- </head>
-
- <body BGCOLOR="FFFFFF">
- <h1>The GHC Commentary - The Glorious Renamer</h1>
- <p>
- The <em>renamer</em> sits between the parser and the typechecker.
- However, its operation is quite tightly interwoven with the
- typechecker. This is partially due to support for Template Haskell,
- where spliced code has to be renamed and type checked. In particular,
- top-level splices lead to multiple rounds of renaming and type
- checking.
- </p>
- <p>
- The main externally used functions of the renamer are provided by the
- module <code>rename/RnSource.lhs</code>. In particular, we have
- </p>
- <blockquote>
- <pre>
-rnSrcDecls :: HsGroup RdrName -> RnM (TcGblEnv, HsGroup Name)
-rnTyClDecls :: [LTyClDecl RdrName] -> RnM [LTyClDecl Name]
-rnSplice :: HsSplice RdrName -> RnM (HsSplice Name, FreeVars)</pre>
- </blockquote>
- <p>
- All of which execute in the renamer monad <code>RnM</code>. The first
- function, <code>rnSrcDecls</code> renames a binding group; the second,
- <code>rnTyClDecls</code> renames a list of (toplevel) type and class
- declarations; and the third, <code>rnSplice</code> renames a Template
- Haskell splice. As the types indicate, the main task of the renamer is
- to convert converts all the <tt>RdrNames</tt> to <a
- href="names.html"><tt>Names</tt></a>, which includes a number of
- well-formedness checks (no duplicate declarations, all names are in
- scope, and so on). In addition, the renamer performs other, not
- strictly name-related, well-formedness checks, which includes checking
- that the appropriate flags have been supplied whenever language
- extensions are used in the source.
- </p>
-
- <h2>RdrNames</h2>
- <p>
- A <tt>RdrName.RdrName</tt> is pretty much just a string (for an
- unqualified name like "<tt>f</tt>") or a pair of strings (for a
- qualified name like "<tt>M.f</tt>"):
- </p>
- <blockquote>
- <pre>
-data RdrName
- = Unqual OccName
- -- Used for ordinary, unqualified occurrences
-
- | Qual Module OccName
- -- A qualified name written by the user in
- -- *source* code. The module isn't necessarily
- -- the module where the thing is defined;
- -- just the one from which it is imported
-
- | Orig Module OccName
- -- An original name; the module is the *defining* module.
- -- This is used when GHC generates code that will be fed
- -- into the renamer (e.g. from deriving clauses), but where
- -- we want to say "Use Prelude.map dammit".
-
- | Exact Name
- -- We know exactly the Name. This is used
- -- (a) when the parser parses built-in syntax like "[]"
- -- and "(,)", but wants a RdrName from it
- -- (b) when converting names to the RdrNames in IfaceTypes
- -- Here an Exact RdrName always contains an External Name
- -- (Internal Names are converted to simple Unquals)
- -- (c) by Template Haskell, when TH has generated a unique name</pre>
- </blockquote>
- <p>
- The OccName type is described in <a href="names.html#occname">The
- truth about names</a>.
- </p>
-
- <h2>The Renamer Monad</h2>
- <p>
- Due to the tight integration of the renamer with the typechecker, both
- use the same monad in recent versions of GHC. So, we have
- </p>
- <blockquote>
- <pre>
-type RnM a = TcRn a -- Historical
-type TcM a = TcRn a -- Historical</pre>
- </blockquote>
- <p>
- with the combined monad defined as
- </p>
- <blockquote>
- <pre>
-type TcRn a = TcRnIf TcGblEnv TcLclEnv a
-type TcRnIf a b c = IOEnv (Env a b) c
-
-data Env gbl lcl -- Changes as we move into an expression
- = Env {
- env_top :: HscEnv, -- Top-level stuff that never changes
- -- Includes all info about imported things
-
- env_us :: TcRef UniqSupply, -- Unique supply for local varibles
-
- env_gbl :: gbl, -- Info about things defined at the top level
- -- of the module being compiled
-
- env_lcl :: lcl -- Nested stuff; changes as we go into
- -- an expression
- }</pre>
- </blockquote>
- <p>
- the details of the global environment type <code>TcGblEnv</code> and
- local environment type <code>TcLclEnv</code> are also defined in the
- module <code>typecheck/TcRnTypes.lhs</code>. The monad
- <code>IOEnv</code> is defined in <code>utils/IOEnv.hs</code> and extends
- the vanilla <code>IO</code> monad with an additional state parameter
- <code>env</code> that is treated as in a reader monad. (Side effecting
- operations, such as updating the unique supply, are done with
- <code>TcRef</code>s, which are simply a synonym for <code>IORef</code>s.)
- </p>
-
- <h2>Name Space Management</h2>
- <p>
- As anticipated by the variants <code>Orig</code> and <code>Exact</code>
- of <code>RdrName</code> some names should not change during renaming,
- whereas others need to be turned into unique names. In this context,
- the two functions <code>RnEnv.newTopSrcBinder</code> and
- <code>RnEnv.newLocals</code> are important:
- </p>
- <blockquote>
- <pre>
-newTopSrcBinder :: Module -> Maybe Name -> Located RdrName -> RnM Name
-newLocalsRn :: [Located RdrName] -> RnM [Name]</pre>
- </blockquote>
- <p>
- The two functions introduces new toplevel and new local names,
- respectively, where the first two arguments to
- <code>newTopSrcBinder</code> determine the currently compiled module and
- the parent construct of the newly defined name. Both functions create
- new names only for <code>RdrName</code>s that are neither exact nor
- original.
- </p>
-
- <h3>Introduction of Toplevel Names: Global RdrName Environment</h3>
- <p>
- A global <code>RdrName</code> environment
- <code>RdrName.GlobalRdrEnv</code> is a map from <code>OccName</code>s to
- lists of qualified names. More precisely, the latter are
- <code>Name</code>s with an associated <code>Provenance</code>:
- </p>
- <blockquote>
- <pre>
-data Provenance
- = LocalDef -- Defined locally
- Module
-
- | Imported -- Imported
- [ImportSpec] -- INVARIANT: non-empty
- Bool -- True iff the thing was named *explicitly*
- -- in *any* of the import specs rather than being
- -- imported as part of a group;
- -- e.g.
- -- import B
- -- import C( T(..) )
- -- Here, everything imported by B, and the constructors of T
- -- are not named explicitly; only T is named explicitly.
- -- This info is used when warning of unused names.</pre>
- </blockquote>
- <p>
- The part of the global <code>RdrName</code> environment for a module
- that contains the local definitions is created by the function
- <code>RnNames.importsFromLocalDecls</code>, which also computes a data
- structure recording all imported declarations in the form of a value of
- type <code>TcRnTypes.ImportAvails</code>.
- </p>
- <p>
- The function <code>importsFromLocalDecls</code>, in turn, makes use of
- <code>RnNames.getLocalDeclBinders :: Module -> HsGroup RdrName -> RnM
- [AvailInfo]</code> to extract all declared names from a binding group,
- where <code>HscTypes.AvailInfo</code> is essentially a collection of
- <code>Name</code>s; i.e., <code>getLocalDeclBinders</code>, on the fly,
- generates <code>Name</code>s from the <code>RdrName</code>s of all
- top-level binders of the module represented by the <code>HsGroup
- RdrName</code> argument.
- </p>
- <p>
- It is important to note that all this happens before the renamer
- actually descends into the toplevel bindings of a module. In other
- words, before <code>TcRnDriver.rnTopSrcDecls</code> performs the
- renaming of a module by way of <code>RnSource.rnSrcDecls</code>, it uses
- <code>importsFromLocalDecls</code> to set up the global
- <code>RdrName</code> environment, which contains <code>Name</code>s for
- all imported <em>and</em> all locally defined toplevel binders. Hence,
- when the helpers of <code>rnSrcDecls</code> come across the
- <em>defining</em> occurrences of a toplevel <code>RdrName</code>, they
- don't rename it by generating a new name, but they simply look up its
- name in the global <code>RdrName</code> environment.
- </p>
-
- <h2>Rebindable syntax</h2>
- <p>
- In Haskell when one writes "3" one gets "fromInteger 3", where
- "fromInteger" comes from the Prelude (regardless of whether the
- Prelude is in scope). If you want to completely redefine numbers,
- that becomes inconvenient. So GHC lets you say
- "-fno-implicit-prelude"; in that case, the "fromInteger" comes from
- whatever is in scope. (This is documented in the User Guide.)
- </p>
- <p>
- This feature is implemented as follows (I always forget).
- <ul>
- <li>Names that are implicitly bound by the Prelude, are marked by the
- type <code>HsExpr.SyntaxExpr</code>. Moreover, the association list
- <code>HsExpr.SyntaxTable</code> is set up by the renamer to map
- rebindable names to the value they are bound to.
- </li>
- <li>Currently, five constructs related to numerals
- (<code>HsExpr.NegApp</code>, <code>HsPat.NPat</code>,
- <code>HsPat.NPlusKPat</code>, <code>HsLit.HsIntegral</code>, and
- <code>HsLit.HsFractional</code>) and
- two constructs related to code>do</code> expressions
- (<code>HsExpr.BindStmt</code> and
- <code>HsExpr.ExprStmt</code>) have rebindable syntax.
- </li>
- <li> When the parser builds these constructs, it puts in the
- built-in Prelude Name (e.g. PrelNum.fromInteger).
- </li>
- <li> When the renamer encounters these constructs, it calls
- <tt>RnEnv.lookupSyntaxName</tt>.
- This checks for <tt>-fno-implicit-prelude</tt>; if not, it just
- returns the same Name; otherwise it takes the occurrence name of the
- Name, turns it into an unqualified RdrName, and looks it up in the
- environment. The returned name is plugged back into the construct.
- </li>
- <li> The typechecker uses the Name to generate the appropriate typing
- constraints.
- </li>
- </ul>
-
- <p><small>
-<!-- hhmts start -->
-Last modified: Wed May 4 17:16:15 EST 2005
-<!-- hhmts end -->
- </small>
- </body>
-</html>
-
diff --git a/docs/comm/the-beast/simplifier.html b/docs/comm/the-beast/simplifier.html
deleted file mode 100644
index 4dbce7765b..0000000000
--- a/docs/comm/the-beast/simplifier.html
+++ /dev/null
@@ -1,86 +0,0 @@
-<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
-<html>
- <head>
- <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
- <title>The GHC Commentary - The Mighty Simplifier</title>
- </head>
-
- <body BGCOLOR="FFFFFF">
- <h1>The GHC Commentary - The Mighty Simplifier</h1>
- <p>
- Most of the optimising program transformations applied by GHC are
- performed on an intermediate language called <em>Core,</em> which
- essentially is a compiler-friendly formulation of rank-2 polymorphic
- lambda terms defined in the module <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/coreSyn/CoreSyn.lhs/"><code>CoreSyn.lhs</code>.</a>
- The transformation engine optimising Core programs is called the
- <em>Simplifier</em> and composed from a couple of modules located in the
- directory <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/simplCore/"><code>fptools/ghc/compiler/simplCore/</code>.</a>
- The main engine of the simplifier is contained in <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/simplCore/Simplify.lhs"><code>Simplify.lhs</code>.</a>
- and its driver is the routine <code>core2core</code> in <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/simplCore/SimplCore.lhs"><code>SimplCore.lhs</code>.</a>
- <p>
- The program that the simplifier has produced after applying its various
- optimisations can be obtained by passing the option
- <code>-ddump-simpl</code> to GHC. Moreover, the various intermediate
- stages of the optimisation process is printed when passing
- <code>-dverbose-core2core</code>.
-
- <h4><a name="loopBreaker">Recursive Definitions</a></h4>
- <p>
- The simplification process has to take special care when handling
- recursive binding groups; otherwise, the compiler might loop.
- Therefore, the routine <code>reOrderRec</code> in <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/simplCore/OccurAnal.lhs"><code>OccurAnal.lhs</code></a>
- computes a set of <em>loop breakers</em> - a set of definitions that
- together cut any possible loop in the binding group. It marks the
- identifiers bound by these definitions as loop breakers by enriching
- their <a href="basicTypes.html#occInfo">occurrence information.</a> Loop
- breakers will <em>never</em> be inlined by the simplifier; thus,
- guaranteeing termination of the simplification procedure. (This is not
- entirely accurate -- see <a href="#rules">rewrite rules</a> below.)
-
- The processes finding loop breakers works as follows: First, the
- strongly connected components (SCC) of the graph representing all
- function dependencies is computed. Then, each SCC is inspected in turn.
- If it contains only a single binding (self-recursive function), this is
- the loop breaker. In case of multiple recursive bindings, the function
- attempts to select bindings where the decision not to inline them does
- cause the least harm - in the sense of inhibiting optimisations in the
- code. This is achieved by considering each binding in turn and awarding
- a <em>score</em> between 0 and 4, where a lower score means that the
- function is less useful for inlining - and thus, a better loop breaker.
- The evaluation of bingings is performed by the function
- <code>score</code> locally defined in <code>OccurAnal</code>.
-
- Note that, because core programs represent function definitions as
- <em>one</em> binding choosing between the possibly many equations in the
- source program with a <code>case</code> construct, a loop breaker cannot
- inline any of its possibly many alternatives (not even the non-recursive
- alternatives).
-
- <h4><a name="rules">Rewrite Rules</a></h4>
- <p>
- The application of rewrite rules is controlled in the module <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/simplCore/Simplify.lhs"><code>Simplify.lhs</code></a>
- by the function <code>completeCall</code>. This function first checks
- whether it should inline the function applied at the currently inspected
- call site, then simplifies the arguments, and finally, checks whether
- any rewrite rule can be applied (and also whether there is a matching
- specialised version of the applied function). The actual check for rule
- application is performed by the function <code><a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/specialise/Rules.lhs">Rules</a>.lookupRule</code>.
- <p>
- It should be note that the application of rewrite rules is not subject
- to the loop breaker check - i.e., rules of loop breakers will be applied
- regardless of whether this may cause the simplifier to diverge.
-
- <p><small>
-<!-- hhmts start -->
-Last modified: Wed Aug 8 19:25:33 EST 2001
-<!-- hhmts end -->
- </small>
- </body>
-</html>
diff --git a/docs/comm/the-beast/stg.html b/docs/comm/the-beast/stg.html
deleted file mode 100644
index 6c9851623a..0000000000
--- a/docs/comm/the-beast/stg.html
+++ /dev/null
@@ -1,164 +0,0 @@
-<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
-<html>
- <head>
- <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
- <title>The GHC Commentary - You Got Control: The STG-language</title>
- </head>
-
- <body BGCOLOR="FFFFFF">
- <h1>The GHC Commentary - You Got Control: The STG-language</h1>
- <p>
- GHC contains two completely independent backends: the byte code
- generator and the machine code generator. The decision over which of
- the two is invoked is made in <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/main/HscMain.lhs"><code>HscMain</code></a><code>.hscCodeGen</code>.
- The machine code generator proceeds itself in a number of phases: First,
- the <a href="desugar.html">Core</a> intermediate language is translated
- into <em>STG-language</em>; second, STG-language is transformed into a
- GHC-internal variant of <a href="http://www.cminusminus.org/">C--</a>;
- and thirdly, this is either emitted as concrete C--, converted to GNU C,
- or translated to native code (by the <a href="ncg.html">native code
- generator</a> which targets IA32, Sparc, and PowerPC [as of March '5]).
- </p>
- <p>
- In the following, we will have a look at the first step of machine code
- generation, namely the translation steps involving the STG-language.
- Details about the underlying abstract machine, the <em>Spineless Tagless
- G-machine</em>, are in <a
- href="http://research.microsoft.com/copyright/accept.asp?path=/users/simonpj/papers/spineless-tagless-gmachine.ps.gz&pub=34">Implementing
- lazy functional languages on stock hardware: the Spineless Tagless
- G-machine</a>, SL Peyton Jones, Journal of Functional Programming 2(2),
- Apr 1992, pp127-202. (Some details have changed since the publication of
- this article, but it still gives a good introduction to the main
- concepts.)
- </p>
-
- <h2>The STG Language</h2>
- <p>
- The AST of the STG-language and the generation of STG code from Core is
- both located in the <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/stgSyn/"><code>stgSyn/</code></a>
- directory; in the modules <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/stgSyn/StgSyn.lhs"><code>StgSyn</code></a>
- and <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/stgSyn/CoreToStg.lhs"><code>CoreToStg</code></a>,
- respectively.
- </p>
- <p>
- Conceptually, the STG-language is a lambda calculus (including data
- constructors and case expressions) whose syntax is restricted to make
- all control flow explicit. As such, it can be regarded as a variant of
- <em>administrative normal form (ANF).</em> (C.f., <a
- href="http://doi.acm.org/10.1145/173262.155113">The essence of compiling
- with continuations.</a> Cormac Flanagan, Amr Sabry, Bruce F. Duba, and
- Matthias Felleisen. <em>ACM SIGPLAN Conference on Programming Language
- Design and Implementation,</em> ACM Press, 1993.) Each syntactic from
- has a precise operational interpretation, in addition to the
- denotational interpretation inherited from the lambda calculus. The
- concrete representation of the STG language inside GHC also includes
- auxiliary attributes, such as <em>static reference tables (SRTs),</em>
- which determine the top-level bindings referenced by each let binding
- and case expression.
- </p>
- <p>
- As usual in ANF, arguments to functions etc. are restricted to atoms
- (i.e., constants or variables), which implies that all sub-expressions
- are explicitly named and evaluation order is explicit. Specific to the
- STG language is that all let bindings correspond to closure allocation
- (thunks, function closures, and data constructors) and that case
- expressions encode both computation and case selection. There are two
- flavours of case expressions scrutinising boxed and unboxed values,
- respectively. The former perform function calls including demanding the
- evaluation of thunks, whereas the latter execute primitive operations
- (such as arithmetic on fixed size integers and floating-point numbers).
- </p>
- <p>
- The representation of STG language defined in <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/stgSyn/StgSyn.lhs"><code>StgSyn</code></a>
- abstracts over both binders and occurrences of variables. The type names
- involved in this generic definition all carry the prefix
- <code>Gen</code> (such as in <code>GenStgBinding</code>). Instances of
- these generic definitions, where both binders and occurrences are of type
- <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/basicTypes/Id.lhs"><code>Id</code></a><code>.Id</code>
- are defined as type synonyms and use type names that drop the
- <code>Gen</code> prefix (i.e., becoming plain <code>StgBinding</code>).
- Complete programs in STG form are represented by values of type
- <code>[StgBinding]</code>.
- </p>
-
- <h2>From Core to STG</h2>
- <p>
- Although, the actual translation from Core AST into STG AST is performed
- by the function <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/stgSyn/CoreToStg.lhs"><code>CoreToStg</code></a><code>.coreToStg</code>
- (or <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/stgSyn/CoreToStg.lhs"><code>CoreToStg</code></a><code>.coreExprToStg</code>
- for individual expressions), the translation crucial depends on <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/coreSyn/CorePrep.lhs"><code>CorePrep</code></a><code>.corePrepPgm</code>
- (resp. <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/coreSyn/CorePrep.lhs"><code>CorePrep</code></a><code>.corePrepExpr</code>),
- which prepares Core code for code generation (for both byte code and
- machine code generation). <code>CorePrep</code> saturates primitive and
- constructor applications, turns the code into A-normal form, renames all
- identifiers into globally unique names, generates bindings for
- constructor workers, constructor wrappers, and record selectors plus
- some further cleanup.
- </p>
- <p>
- In other words, after Core code is prepared for code generation it is
- structurally already in the form required by the STG language. The main
- work performed by the actual transformation from Core to STG, as
- performed by <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/stgSyn/CoreToStg.lhs"><code>CoreToStg</code></a><code>.coreToStg</code>,
- is to compute the live and free variables as well as live CAFs (constant
- applicative forms) at each let binding and case alternative. In
- subsequent phases, the live CAF information is used to compute SRTs.
- The live variable information is used to determine which stack slots
- need to be zapped (to avoid space leaks) and the free variable
- information is need to construct closures. Moreover, hints for
- optimised code generation are computed, such as whether a closure needs
- to be updated after is has been evaluated.
- </p>
-
- <h2>STG Passes</h2>
- <p>
- These days little actual work is performed on programs in STG form; in
- particular, the code is not further optimised. All serious optimisation
- (except low-level optimisations which are performed during native code
- generation) has already been done on Core. The main task of <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/stgSyn/CoreToStg.lhs"><code>CoreToStg</code></a><code>.stg2stg</code>
- is to compute SRTs from the live CAF information determined during STG
- generation. Other than that, <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/profiling/SCCfinal.lhs"><code>SCCfinal</code></a><code>.stgMassageForProfiling</code>
- is executed when compiling for profiling and information may be dumped
- for debugging purposes.
- </p>
-
- <h2>Towards C--</h2>
- <p>
- GHC's internal form of C-- is defined in the module <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/cmm/Cmm.hs"><code>Cmm</code></a>.
- The definition is generic in that it abstracts over the type of static
- data and of the contents of basic blocks (i.e., over the concrete
- representation of constant data and instructions). These generic
- definitions have names carrying the prefix <code>Gen</code> (such as
- <code>GenCmm</code>). The same module also instantiates the generic
- form to a concrete form where data is represented by
- <code>CmmStatic</code> and instructions are represented by
- <code>CmmStmt</code> (giving us, e.g., <code>Cmm</code> from
- <code>GenCmm</code>). The concrete form more or less follows the
- external <a href="http://www.cminusminus.org/">C--</a> language.
- </p>
- <p>
- Programs in STG form are translated to <code>Cmm</code> by <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/codeGen/CodeGen.lhs"><code>CodeGen</code></a><code>.codeGen</code>
- </p>
-
- <p><hr><small>
-<!-- hhmts start -->
-Last modified: Sat Mar 5 22:55:25 EST 2005
-<!-- hhmts end -->
- </small>
- </body>
-</html>
diff --git a/docs/comm/the-beast/syntax.html b/docs/comm/the-beast/syntax.html
deleted file mode 100644
index be5bbefa17..0000000000
--- a/docs/comm/the-beast/syntax.html
+++ /dev/null
@@ -1,99 +0,0 @@
-<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
-<html>
- <head>
- <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
- <title>The GHC Commentary - Just Syntax</title>
- </head>
-
- <body BGCOLOR="FFFFFF">
- <h1>The GHC Commentary - Just Syntax</h1>
- <p>
- The lexical and syntactic analyser for Haskell programs are located in
- <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/parser/"><code>fptools/ghc/compiler/parser/</code></a>.
- </p>
-
- <h2>The Lexer</h2>
- <p>
- The lexer is a rather tedious piece of Haskell code contained in the
- module <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/parser/Lex.lhs"><code>Lex</code></a>.
- Its complexity partially stems from covering, in addition to Haskell 98,
- also the whole range of GHC language extensions plus its ability to
- analyse interface files in addition to normal Haskell source. The lexer
- defines a parser monad <code>P a</code>, where <code>a</code> is the
- type of the result expected from a successful parse. More precisely, a
- result of type
-<blockquote><pre>
-data ParseResult a = POk PState a
- | PFailed Message</pre>
-</blockquote>
- <p>
- is produced with <code>Message</code> being from <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/main/ErrUtils.lhs"><code>ErrUtils</code></a>
- (and currently is simply a synonym for <code>SDoc</code>).
- <p>
- The record type <code>PState</code> contains information such as the
- current source location, buffer state, contexts for layout processing,
- and whether Glasgow extensions are accepted (either due to
- <code>-fglasgow-exts</code> or due to reading an interface file). Most
- of the fields of <code>PState</code> store unboxed values; in fact, even
- the flag indicating whether Glasgow extensions are enabled is
- represented by an unboxed integer instead of by a <code>Bool</code>. My
- (= chak's) guess is that this is to avoid having to perform a
- <code>case</code> on a boxed value in the inner loop of the lexer.
- <p>
- The same lexer is used by the Haskell source parser, the Haskell
- interface parser, and the package configuration parser.
-
- <h2>The Haskell Source Parser</h2>
- <p>
- The parser for Haskell source files is defined in the form of a parser
- specification for the parser generator <a
- href="http://haskell.org/happy/">Happy</a> in the file <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/parser/Parser.y"><code>Parser.y</code></a>.
- The parser exports three entry points for parsing entire modules
- (<code>parseModule</code>, individual statements
- (<code>parseStmt</code>), and individual identifiers
- (<code>parseIdentifier</code>), respectively. The last two are needed
- for GHCi. All three require a parser state (of type
- <code>PState</code>) and are invoked from <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/main/HscMain.lhs"><code>HscMain</code></a>.
- <p>
- Parsing of Haskell is a rather involved process. The most challenging
- features are probably the treatment of layout and expressions that
- contain infix operators. The latter may be user-defined and so are not
- easily captured in a static syntax specification. Infix operators may
- also appear in the right hand sides of value definitions, and so, GHC's
- parser treats those in the same way as expressions. In other words, as
- general expressions are a syntactic superset of expressions - ok, they
- <em>nearly</em> are - the parser simply attempts to parse a general
- expression in such positions. Afterwards, the generated parse tree is
- inspected to ensure that the accepted phrase indeed forms a legal
- pattern. This and similar checks are performed by the routines from <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/parser/ParseUtil.lhs"><code>ParseUtil</code></a>. In
- some cases, these routines do, in addition to checking for
- wellformedness, also transform the parse tree, such that it fits into
- the syntactic context in which it has been parsed; in fact, this happens
- for patterns, which are transformed from a representation of type
- <code>RdrNameHsExpr</code> into a representation of type
- <code>RdrNamePat</code>.
-
- <h2>The Haskell Interface Parser</h2>
- <p>
- The parser for interface files is also generated by Happy from <a href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/rename/ParseIface.y"><code>ParseIface.y</code></a>.
- It's main routine <code>parseIface</code> is invoked from <a href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/rename/RnHiFiles.lhs"><code>RnHiFiles</code></a><code>.readIface</code>.
-
- <h2>The Package Configuration Parser</h2>
- <p>
- The parser for configuration files is by far the smallest of the three
- and defined in <a href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/main/ParsePkgConf.y"><code>ParsePkgConf.y</code></a>.
- It exports <code>loadPackageConfig</code>, which is used by <a href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/main/DriverState.hs"><code>DriverState</code></a><code>.readPackageConf</code>.
-
- <p><small>
-<!-- hhmts start -->
-Last modified: Wed Jan 16 00:30:14 EST 2002
-<!-- hhmts end -->
- </small>
- </body>
-</html>
diff --git a/docs/comm/the-beast/typecheck.html b/docs/comm/the-beast/typecheck.html
deleted file mode 100644
index 482a447628..0000000000
--- a/docs/comm/the-beast/typecheck.html
+++ /dev/null
@@ -1,316 +0,0 @@
-<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
-<html>
- <head>
- <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
- <title>The GHC Commentary - Checking Types</title>
- </head>
-
- <body BGCOLOR="FFFFFF">
- <h1>The GHC Commentary - Checking Types</h1>
- <p>
- Probably the most important phase in the frontend is the type checker,
- which is located at <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/typecheck/"><code>fptools/ghc/compiler/typecheck/</code>.</a>
- GHC type checks programs in their original Haskell form before the
- desugarer converts them into Core code. This complicates the type
- checker as it has to handle the much more verbose Haskell AST, but it
- improves error messages, as those message are based on the same
- structure that the user sees.
- </p>
- <p>
- GHC defines the abstract syntax of Haskell programs in <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/hsSyn/HsSyn.lhs"><code>HsSyn</code></a>
- using a structure that abstracts over the concrete representation of
- bound occurrences of identifiers and patterns. The module <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/typecheck/TcHsSyn.lhs"><code>TcHsSyn</code></a>
- defines a number of helper function required by the type checker. Note
- that the type <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/typecheck/TcRnTypes.lhs"><code>TcRnTypes</code></a>.<code>TcId</code>
- used to represent identifiers in some signatures during type checking
- is, in fact, nothing but a synonym for a <a href="vars.html">plain
- <code>Id</code>.</a>
- </p>
- <p>
- It is also noteworthy, that the representations of types changes during
- type checking from <code>HsType</code> to <code>TypeRep.Type</code>.
- The latter is a <a href="types.html">hybrid type representation</a> that
- is used to type Core, but still contains sufficient information to
- recover source types. In particular, the type checker maintains and
- compares types in their <code>Type</code> form.
- </p>
-
- <h2>The Overall Flow of Things</h2>
-
- <h4>Entry Points Into the Type Checker</h4>
- <p>
- The interface of the type checker (and <a
- href="renamer.html">renamer</a>) to the rest of the compiler is provided
- by <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/typecheck/TcRnDriver.lhs"><code>TcRnDriver</code></a>.
- Entire modules are processed by calling <code>tcRnModule</code> and GHCi
- uses <code>tcRnStmt</code>, <code>tcRnExpr</code>, and
- <code>tcRnType</code> to typecheck statements and expressions, and to
- kind check types, respectively. Moreover, <code>tcRnExtCore</code> is
- provided to typecheck external Core code. Moreover,
- <code>tcTopSrcDecls</code> is used by Template Haskell - more
- specifically by <code>TcSplice.tc_bracket</code>
- - to type check the contents of declaration brackets.
- </p>
-
- <h4>Renaming and Type Checking a Module</h4>
- <p>
- The function <code>tcRnModule</code> controls the complete static
- analysis of a Haskell module. It sets up the combined renamer and type
- checker monad, resolves all import statements, initiates the actual
- renaming and type checking process, and finally, wraps off by processing
- the export list.
- </p>
- <p>
- The actual type checking and renaming process is initiated via
- <code>TcRnDriver.tcRnSrcDecls</code>, which uses a helper called
- <code>tc_rn_src_decls</code> to implement the iterative renaming and
- type checking process required by <a href="../exts/th.html">Template
- Haskell</a>. However, before it invokes <code>tc_rn_src_decls</code>,
- it takes care of hi-boot files; afterwards, it simplifies type
- constraints and zonking (see below regarding the later).
- </p>
- <p>
- The function <code>tc_rn_src_decls</code> partitions static analysis of
- a whole module into multiple rounds, where the initial round is followed
- by an additional one for each toplevel splice. It collects all
- declarations up to the next splice into an <code>HsDecl.HsGroup</code>
- to rename and type check that <em>declaration group</em> by calling
- <code>TcRnDriver.tcRnGroup</code>. Afterwards, it executes the
- splice (if there are any left) and proceeds to the next group, which
- includes the declarations produced by the splice.
- </p>
- <p>
- The function <code>tcRnGroup</code>, finally, gets down to invoke the
- actual renaming and type checking via
- <code>TcRnDriver.rnTopSrcDecls</code> and
- <code>TcRnDriver.tcTopSrcDecls</code>, respectively. The renamer, apart
- from renaming, computes the global type checking environment, of type
- <code>TcRnTypes.TcGblEnv</code>, which is stored in the type checking
- monad before type checking commences.
- </p>
-
- <h2>Type Checking a Declaration Group</h2>
- <p>
- The type checking of a declaration group, performed by
- <code>tcTopSrcDecls</code> starts by processing of the type and class
- declarations of the current module, using the function
- <code>TcTyClsDecls.tcTyAndClassDecls</code>. This is followed by a
- first round over instance declarations using
- <code>TcInstDcls.tcInstDecls1</code>, which in particular generates all
- additional bindings due to the deriving process. Then come foreign
- import declarations (<code>TcForeign.tcForeignImports</code>) and
- default declarations (<code>TcDefaults.tcDefaults</code>).
- </p>
- <p>
- Now, finally, toplevel value declarations (including derived ones) are
- type checked using <code>TcBinds.tcTopBinds</code>. Afterwards,
- <code>TcInstDcls.tcInstDecls2</code> traverses instances for the second
- time. Type checking concludes with processing foreign exports
- (<code>TcForeign.tcForeignExports</code>) and rewrite rules
- (<code>TcRules.tcRules</code>). Finally, the global environment is
- extended with the new bindings.
- </p>
-
- <h2>Type checking Type and Class Declarations</h2>
- <p>
- Type and class declarations are type checked in a couple of phases that
- contain recursive dependencies - aka <em>knots.</em> The first knot
- encompasses almost the whole type checking of these declarations and
- forms the main piece of <code>TcTyClsDecls.tcTyAndClassDecls</code>.
- </p>
- <p>
- Inside this big knot, the first main operation is kind checking, which
- again involves a knot. It is implemented by <code>kcTyClDecls</code>,
- which performs kind checking of potentially recursively-dependent type
- and class declarations using kind variables for initially unknown kinds.
- During processing the individual declarations some of these variables
- will be instantiated depending on the context; the rest gets by default
- kind <code>*</code> (during <em>zonking</em> of the kind signatures).
- Type synonyms are treated specially in this process, because they can
- have an unboxed type, but they cannot be recursive. Hence, their kinds
- are inferred in dependency order. Moreover, in contrast to class
- declarations and other type declarations, synonyms are not entered into
- the global environment as a global <code>TyThing</code>.
- (<code>TypeRep.TyThing</code> is a sum type that combines the various
- flavours of typish entities, such that they can be stuck into type
- environments and similar.)
- </p>
-
- <h2>More Details</h2>
-
- <h4>Types Variables and Zonking</h4>
- <p>
- During type checking type variables are represented by mutable variables
- - cf. the <a href="vars.html#TyVar">variable story.</a> Consequently,
- unification can instantiate type variables by updating those mutable
- variables. This process of instantiation is (for reasons that elude me)
- called <a
- href="http://www.dictionary.com/cgi-bin/dict.pl?term=zonk&db=*">zonking</a>
- in GHC's sources. The zonking routines for the various forms of Haskell
- constructs are responsible for most of the code in the module <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/typecheck/TcHsSyn.lhs"><code>TcHsSyn</code>,</a>
- whereas the routines that actually operate on mutable types are defined
- in <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/typecheck/TcMType.lhs"><code>TcMType</code></a>;
- this includes the zonking of type variables and type terms, routines to
- create mutable structures and update them as well as routines that check
- constraints, such as that type variables in function signatures have not
- been instantiated during type checking. The actual type unification
- routine is <code>uTys</code> in the module <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/typecheck/TcUnify.lhs"><code>TcUnify</code></a>.
- </p>
- <p>
- All type variables that may be instantiated (those in signatures
- may not), but haven't been instantiated during type checking, are zonked
- to <code>()</code>, so that after type checking all mutable variables
- have been eliminated.
- </p>
-
- <h4>Type Representation</h4>
- <p>
- The representation of types is fixed in the module <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/typecheck/TcRep.lhs"><code>TcRep</code></a>
- and exported as the data type <code>Type</code>. As explained in <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/typecheck/TcType.lhs"><code>TcType</code></a>,
- GHC supports rank-N types, but, in the type checker, maintains the
- restriction that type variables cannot be instantiated to quantified
- types (i.e., the type system is predicative). The type checker floats
- universal quantifiers outside and maintains types in prenex form.
- (However, quantifiers can, of course, not float out of negative
- positions.) Overall, we have
- </p>
- <blockquote>
- <pre>
-sigma -> forall tyvars. phi
-phi -> theta => rho
-rho -> sigma -> rho
- | tau
-tau -> tyvar
- | tycon tau_1 .. tau_n
- | tau_1 tau_2
- | tau_1 -> tau_2</pre>
- </blockquote>
- <p>
- where <code>sigma</code> is in prenex form; i.e., there is never a
- forall to the right of an arrow in a <code>phi</code> type. Moreover, a
- type of the form <code>tau</code> never contains a quantifier (which
- includes arguments to type constructors).
- </p>
- <p>
- Of particular interest are the variants <code>SourceTy</code> and
- <code>NoteTy</code> of <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/typecheck/TypeRep.lhs"><code>TypeRep</code></a>.<code>Type</code>.
- The constructor <code>SourceTy :: SourceType -> Type</code> represents a
- type constraint; that is, a predicate over types represented by a
- dictionary. The type checker treats a <code>SourceTy</code> as opaque,
- but during the translation to core it will be expanded into its concrete
- representation (i.e., a dictionary type) by the function <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/types/Type.lhs"><code>Type</code></a>.<code>sourceTypeRep</code>.
- Note that newtypes are not covered by <code>SourceType</code>s anymore,
- even if some comments in GHC still suggest this. Instead, all newtype
- applications are initially represented as a <code>NewTcApp</code>, until
- they are eliminated by calls to <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/types/Type.lhs"><code>Type</code></a>.<code>newTypeRep</code>.
- </p>
- <p>
- The <code>NoteTy</code> constructor is used to add non-essential
- information to a type term. Such information has the type
- <code>TypeRep.TyNote</code> and is either the set of free type variables
- of the annotated expression or the unexpanded version of a type synonym.
- Free variables sets are cached as notes to save the overhead of
- repeatedly computing the same set for a given term. Unexpanded type
- synonyms are useful for generating comprehensible error messages, but
- have no influence on the process of type checking.
- </p>
-
- <h4>Type Checking Environment</h4>
- <p>
- During type checking, GHC maintains a <em>type environment</em> whose
- type definitions are fixed in the module <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/typecheck/TcRnTypes.lhs"><code>TcRnTypes</code></a> with the operations defined in
-<a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/typecheck/TcEnv.lhs"><code>TcEnv</code></a>.
- Among other things, the environment contains all imported and local
- instances as well as a list of <em>global</em> entities (imported and
- local types and classes together with imported identifiers) and
- <em>local</em> entities (locally defined identifiers). This environment
- is threaded through the type checking monad, whose support functions
- including initialisation can be found in the module <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/typecheck/TcRnMonad.lhs"><code>TcRnMonad</code>.</a>
-
- <h4>Expressions</h4>
- <p>
- Expressions are type checked by <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/typecheck/TcExpr.lhs"><code>TcExpr</code>.</a>
- <p>
- Usage occurrences of identifiers are processed by the function
- <code>tcId</code> whose main purpose is to <a href="#inst">instantiate
- overloaded identifiers.</a> It essentially calls
- <code>TcInst.instOverloadedFun</code> once for each universally
- quantified set of type constraints. It should be noted that overloaded
- identifiers are replaced by new names that are first defined in the LIE
- (Local Instance Environment?) and later promoted into top-level
- bindings.
-
- <h4><a name="inst">Handling of Dictionaries and Method Instances</a></h4>
- <p>
- GHC implements overloading using so-called <em>dictionaries.</em> A
- dictionary is a tuple of functions -- one function for each method in
- the class of which the dictionary implements an instance. During type
- checking, GHC replaces each type constraint of a function with one
- additional argument. At runtime, the extended function gets passed a
- matching class dictionary by way of these additional arguments.
- Whenever the function needs to call a method of such a class, it simply
- extracts it from the dictionary.
- <p>
- This sounds simple enough; however, the actual implementation is a bit
- more tricky as it wants to keep track of all the instances at which
- overloaded functions are used in a module. This information is useful
- to optimise the code. The implementation is the module <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/typecheck/Inst.lhs"><code>Inst.lhs</code>.</a>
- <p>
- The function <code>instOverloadedFun</code> is invoked for each
- overloaded usage occurrence of an identifier, where overloaded means that
- the type of the idendifier contains a non-trivial type constraint. It
- proceeds in two steps: (1) Allocation of a method instance
- (<code>newMethodWithGivenTy</code>) and (2) instantiation of functional
- dependencies. The former implies allocating a new unique identifier,
- which replaces the original (overloaded) identifier at the currently
- type-checked usage occurrence.
- <p>
- The new identifier (after being threaded through the LIE) eventually
- will be bound by a top-level binding whose rhs contains a partial
- application of the original overloaded identifier. This papp applies
- the overloaded function to the dictionaries needed for the current
- instance. In GHC lingo, this is called a <em>method.</em> Before
- becoming a top-level binding, the method is first represented as a value
- of type <code>Inst.Inst</code>, which makes it easy to fold multiple
- instances of the same identifier at the same types into one global
- definition. (And probably other things, too, which I haven't
- investigated yet.)
-
- <p>
- <strong>Note:</strong> As of 13 January 2001 (wrt. to the code in the
- CVS HEAD), the above mechanism interferes badly with RULES pragmas
- defined over overloaded functions. During instantiation, a new name is
- created for an overloaded function partially applied to the dictionaries
- needed in a usage position of that function. As the rewrite rule,
- however, mentions the original overloaded name, it won't fire anymore
- -- unless later phases remove the intermediate definition again. The
- latest CVS version of GHC has an option
- <code>-fno-method-sharing</code>, which avoids sharing instantiation
- stubs. This is usually/often/sometimes sufficient to make the rules
- fire again.
-
- <p><small>
-<!-- hhmts start -->
-Last modified: Thu May 12 22:52:46 EST 2005
-<!-- hhmts end -->
- </small>
- </body>
-</html>
diff --git a/docs/comm/the-beast/types.html b/docs/comm/the-beast/types.html
deleted file mode 100644
index 383b71f054..0000000000
--- a/docs/comm/the-beast/types.html
+++ /dev/null
@@ -1,179 +0,0 @@
-<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
-<html>
- <head>
- <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
- <title>The GHC Commentary - Hybrid Types</title>
- </head>
-
- <body BGCOLOR="FFFFFF">
- <h1>The GHC Commentary - Hybrid Types</h1>
- <p>
- GHC essentially supports two type systems: (1) the <em>source type
- system</em> (which is a heavily extended version of the type system of
- Haskell 98) and (2) the <em>Core type system,</em> which is the type system
- used by the intermediate language (see also <a
- href="desugar.html">Sugar Free: From Haskell To Core</a>).
- </p>
- <p>
- During parsing and renaming, type information is represented in a form
- that is very close to Haskell's concrete syntax; it is defined by
- <code>HsTypes.HsType</code>. In addition, type, class, and instance
- declarations are maintained in their source form as defined in the
- module <code>HsDecl</code>. The situation changes during type checking,
- where types are translated into a second representation, which is
- defined in the module <code>types/TypeRep.lhs</code>, as type
- <code>Type</code>. This second representation is peculiar in that it is
- a hybrid between the source representation of types and the Core
- representation of types. Using functions, such as
- <code>Type.coreView</code> and <code>Type.deepCoreView</code>, a value
- of type <code>Type</code> exhibits its Core representation. On the
- other hand, pretty printing a <code>Type</code> with
- <code>TypeRep.pprType</code> yields the type's source representation.
- </p>
- <p>
- In fact, the <a href="typecheck.html">type checker</a> maintains type
- environments based on <code>Type</code>, but needs to perform type
- checking on source-level types. As a result, we have functions
- <code>Type.tcEqType</code> and <code>Type.tcCmpType</code>, which
- compare types based on their source representation, as well as the
- function <code>coreEqType</code>, which compares them based on their
- core representation. The latter is needed during type checking of Core
- (as performed by the functions in the module
- <code>coreSyn/CoreLint.lhs</code>).
- </p>
-
- <h2>Type Synonyms</h2>
- <p>
- Type synonyms in Haskell are essentially a form of macro definitions on
- the type level. For example, when the type checker compares two type
- terms, synonyms are always compared in their expanded form. However, to
- produce good error messages, we like to avoid expanding type synonyms
- during pretty printing. Hence, <code>Type</code> has a variant
- <code>NoteTy TyNote Type</code>, where
- </p>
- <blockquote>
- <pre>
-data TyNote
- = FTVNote TyVarSet -- The free type variables of the noted expression
-
- | SynNote Type -- Used for type synonyms
- -- The Type is always a TyConApp, and is the un-expanded form.
- -- The type to which the note is attached is the expanded form.</pre>
- </blockquote>
- <p>
- In other words, a <code>NoteTy</code> represents the expanded form of a
- type synonym together with a note stating its source form.
- </p>
-
- <h3>Creating Representation Types of Synonyms</h3>
- <p>
- During translation from <code>HsType</code> to <code>Type</code> the
- function <code>Type.mkSynTy</code> is used to construct representations
- of applications of type synonyms. It creates a <code>NoteTy</code> node
- if the synonym is applied to a sufficient number of arguments;
- otherwise, it builds a simple <code>TyConApp</code> and leaves it to
- <code>TcMType.checkValidType</code> to pick up invalid unsaturated
- synonym applications. While creating a <code>NoteTy</code>,
- <code>mkSynTy</code> also expands the synonym by substituting the type
- arguments for the parameters of the synonym definition, using
- <code>Type.substTyWith</code>.
- </p>
- <p>
- The function <code>mkSynTy</code> is used indirectly via
- <code>mkGenTyConApp</code>, <code>mkAppTy</code>, and
- <code>mkAppTy</code>, which construct type representations involving
- type applications. The function <code>mkSynTy</code> is also used
- directly during type checking interface files; this is for tedious
- reasons to do with forall hoisting - see the comment at
- <code>TcIface.mkIfTcApp</code>.
- </p>
-
- <h2>Newtypes</h2>
- <p>
- Data types declared by a <code>newtype</code> declarations constitute new
- type constructors---i.e., they are not just type macros, but introduce
- new type names. However, provided that a newtype is not recursive, we
- still want to implement it by its representation type. GHC realises this
- by providing two flavours of type equality: (1) <code>tcEqType</code> is
- source-level type equality, which compares newtypes and
- <code>PredType</code>s by name, and (2) <code>coreEqType</code> compares
- them structurally (by using <code>deepCoreView</code> to expand the
- representation before comparing). The function
- <code>deepCoreView</code> (via <code>coreView</code>) invokes
- <code>expandNewTcApp</code> for every type constructor application
- (<code>TyConApp</code>) to determine whether we are looking at a newtype
- application that needs to be expanded to its representation type.
- </p>
-
- <h2>Predicates</h2>
- <p>
- The dictionary translation of type classes, translates each predicate in
- a type context of a type signature into an additional argument, which
- carries a dictionary with the functions overloaded by the corresponding
- class. The <code>Type</code> data type has a special variant
- <code>PredTy PredType</code> for predicates, where
- </p>
- <blockquote>
- <pre>
-data PredType
- = ClassP Class [Type] -- Class predicate
- | IParam (IPName Name) Type -- Implicit parameter</pre>
- </blockquote>
- <p>
- These types need to be handled as source type during type checking, but
- turn into their representations when inspected through
- <code>coreView</code>. The representation is determined by
- <code>Type.predTypeRep</code>.
- </p>
-
- <h2>Representation of Type Constructors</h2>
- <p>
- Type constructor applications are represented in <code>Type</code> by
- the variant <code>TyConApp :: TyCon -> [Type] -> Type</code>. The first
- argument to <code>TyConApp</code>, namely <code>TyCon.TyCon</code>,
- distinguishes between function type constructors (variant
- <code>FunTyCon</code>) and algebraic type constructors (variant
- <code>AlgTyCon</code>), which arise from data and newtype declarations.
- The variant <code>AlgTyCon</code> contains all the information available
- from the data/newtype declaration as well as derived information, such
- as the <code>Unique</code> and argument variance information. This
- includes a field <code>algTcRhs :: AlgTyConRhs</code>, where
- <code>AlgTyConRhs</code> distinguishes three kinds of algebraic data
- type declarations: (1) declarations that have been exported abstractly,
- (2) <code>data</code> declarations, and (3) <code>newtype</code>
- declarations. The last two both include their original right hand side;
- in addition, the third variant also caches the "ultimate" representation
- type, which is the right hand side after expanding all type synonyms and
- non-recursive newtypes.
- </p>
- <p>
- Both data and newtype declarations refer to their data constructors
- represented as <code>DataCon.DataCon</code>, which include all details
- of their signature (as derived from the original declaration) as well
- information for code generation, such as their tag value.
- </p>
-
- <h2>Representation of Classes and Instances</h2>
- <p>
- Class declarations turn into values of type <code>Class.Class</code>.
- They represent methods as the <code>Id</code>s of the dictionary
- selector functions. Similar selector functions are available for
- superclass dictionaries.
- </p>
- <p>
- Instance declarations turn into values of type
- <code>InstEnv.Instance</code>, which in interface files are represented
- as <code>IfaceSyn.IfaceInst</code>. Moreover, the type
- <code>InstEnv.InstEnv</code>, which is a synonym for <code>UniqFM
- ClsInstEnv</code>, provides a mapping of classes to their
- instances---<code>ClsInstEnv</code> is essentially a list of instance
- declarations.
- </p>
-
- <p><small>
-<!-- hhmts start -->
-Last modified: Sun Jun 19 13:07:22 EST 2005
-<!-- hhmts end -->
- </small></p>
- </body>
-</html>
diff --git a/docs/comm/the-beast/vars.html b/docs/comm/the-beast/vars.html
deleted file mode 100644
index 9bbd310c60..0000000000
--- a/docs/comm/the-beast/vars.html
+++ /dev/null
@@ -1,235 +0,0 @@
-<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
-<html>
- <head>
- <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
- <title>The GHC Commentary - The Real Story about Variables, Ids, TyVars, and the like</title>
- </head>
-
- <body BGCOLOR="FFFFFF">
- <h1>The GHC Commentary - The Real Story about Variables, Ids, TyVars, and the like</h1>
- <p>
-
-
-<h2>Variables</h2>
-
-The <code>Var</code> type, defined in <code>basicTypes/Var.lhs</code>,
-represents variables, both term variables and type variables:
-<pre>
- data Var
- = Var {
- varName :: Name,
- realUnique :: FastInt,
- varType :: Type,
- varDetails :: VarDetails,
- varInfo :: IdInfo
- }
-</pre>
-<ul>
-<li> The <code>varName</code> field contains the identity of the variable:
-its unique number, and its print-name. See "<a href="names.html">The truth about names</a>".
-
-<p><li> The <code>realUnique</code> field caches the unique number in the
-<code>varName</code> field, just to make comparison of <code>Var</code>s a little faster.
-
-<p><li> The <code>varType</code> field gives the type of a term variable, or the kind of a
-type variable. (Types and kinds are both represented by a <code>Type</code>.)
-
-<p><li> The <code>varDetails</code> field distinguishes term variables from type variables,
-and makes some further distinctions (see below).
-
-<p><li> For term variables (only) the <code>varInfo</code> field contains lots of useful
-information: strictness, unfolding, etc. However, this information is all optional;
-you can always throw away the <code>IdInfo</code>. In contrast, you can't safely throw away
-the <code>VarDetails</code> of a <code>Var</code>
-</ul>
-<p>
-It's often fantastically convenient to have term variables and type variables
-share a single data type. For example,
-<pre>
- exprFreeVars :: CoreExpr -> VarSet
-</pre>
-If there were two types, we'd need to return two sets. Simiarly, big lambdas and
-little lambdas use the same constructor in Core, which is extremely convenient.
-<p>
-We define a couple of type synonyms:
-<pre>
- type Id = Var -- Term variables
- type TyVar = Var -- Type variables
-</pre>
-just to help us document the occasions when we are expecting only term variables,
-or only type variables.
-
-
-<h2> The <code>VarDetails</code> field </h2>
-
-The <code>VarDetails</code> field tells what kind of variable this is:
-<pre>
-data VarDetails
- = LocalId -- Used for locally-defined Ids (see NOTE below)
- LocalIdDetails
-
- | GlobalId -- Used for imported Ids, dict selectors etc
- GlobalIdDetails
-
- | TyVar
- | MutTyVar (IORef (Maybe Type)) -- Used during unification;
- TyVarDetails
-</pre>
-
-<a name="TyVar">
-<h2>Type variables (<code>TyVar</code>)</h2>
-</a>
-<p>
-The <code>TyVar</code> case is self-explanatory. The <code>MutTyVar</code>
-case is used only during type checking. Then a type variable can be unified,
-using an imperative update, with a type, and that is what the
-<code>IORef</code> is for. The <code>TcType.TyVarDetails</code> field records
-the sort of type variable we are dealing with. It is defined as
-<pre>
-data TyVarDetails = SigTv | ClsTv | InstTv | VanillaTv
-</pre>
-<code>SigTv</code> marks type variables that were introduced when
-instantiating a type signature prior to matching it against the inferred type
-of a definition. The variants <code>ClsTv</code> and <code>InstTv</code> mark
-scoped type variables introduced by class and instance heads, respectively.
-These first three sorts of type variables are skolem variables (tested by the
-predicate <code>isSkolemTyVar</code>); i.e., they must <em>not</em> be
-instantiated. All other type variables are marked as <code>VanillaTv</code>.
-<p>
-For a long time I tried to keep mutable Vars statically type-distinct
-from immutable Vars, but I've finally given up. It's just too painful.
-After type checking there are no MutTyVars left, but there's no static check
-of that fact.
-
-<h2>Term variables (<code>Id</code>)</h2>
-
-A term variable (of type <code>Id</code>) is represented either by a
-<code>LocalId</code> or a <code>GlobalId</code>:
-<p>
-A <code>GlobalId</code> is
-<ul>
-<li> Always bound at top-level.
-<li> Always has a <code>GlobalName</code>, and hence has
- a <code>Unique</code> that is globally unique across the whole
- GHC invocation (a single invocation may compile multiple modules).
-<li> Has <code>IdInfo</code> that is absolutely fixed, forever.
-</ul>
-
-<p>
-A <code>LocalId</code> is:
-<ul>
-<li> Always bound in the module being compiled:
-<ul>
-<li> <em>either</em> bound within an expression (lambda, case, local let(rec))
-<li> <em>or</em> defined at top level in the module being compiled.
-</ul>
-<li> Has IdInfo that changes as the simpifier bashes repeatedly on it.
-</ul>
-<p>
-The key thing about <code>LocalId</code>s is that the free-variable finder
-typically treats them as candidate free variables. That is, it ignores
-<code>GlobalId</code>s such as imported constants, data contructors, etc.
-<p>
-An important invariant is this: <em>All the bindings in the module
-being compiled (whether top level or not) are <code>LocalId</code>s
-until the CoreTidy phase.</em> In the CoreTidy phase, all
-externally-visible top-level bindings are made into GlobalIds. This
-is the point when a <code>LocalId</code> becomes "frozen" and becomes
-a fixed, immutable <code>GlobalId</code>.
-<p>
-(A binding is <em>"externally-visible"</em> if it is exported, or
-mentioned in the unfolding of an externally-visible Id. An
-externally-visible Id may not have an unfolding, either because it is
-too big, or because it is the loop-breaker of a recursive group.)
-
-<h3>Global Ids and implicit Ids</h3>
-
-<code>GlobalId</code>s are further categorised by their <code>GlobalIdDetails</code>.
-This type is defined in <code>basicTypes/IdInfo</code>, because it mentions other
-structured types like <code>DataCon</code>. Unfortunately it is *used* in <code>Var.lhs</code>
-so there's a <code>hi-boot</code> knot to get it there. Anyway, here's the declaration:
-<pre>
-data GlobalIdDetails
- = NotGlobalId -- Used as a convenient extra return value
- -- from globalIdDetails
-
- | VanillaGlobal -- Imported from elsewhere
-
- | PrimOpId PrimOp -- The Id for a primitive operator
- | FCallId ForeignCall -- The Id for a foreign call
-
- -- These next ones are all "implicit Ids"
- | RecordSelId FieldLabel -- The Id for a record selector
- | DataConId DataCon -- The Id for a data constructor *worker*
- | DataConWrapId DataCon -- The Id for a data constructor *wrapper*
- -- [the only reasons we need to know is so that
- -- a) we can suppress printing a definition in the interface file
- -- b) when typechecking a pattern we can get from the
- -- Id back to the data con]
-</pre>
-The <code>GlobalIdDetails</code> allows us to go from the <code>Id</code> for
-a record selector, say, to its field name; or the <code>Id</code> for a primitive
-operator to the <code>PrimOp</code> itself.
-<p>
-Certain <code>GlobalId</code>s are called <em>"implicit"</em> Ids. An implicit
-Id is derived by implication from some other declaration. So a record selector is
-derived from its data type declaration, for example. An implicit Ids is always
-a <code>GlobalId</code>. For most of the compilation, the implicit Ids are just
-that: implicit. If you do -ddump-simpl you won't see their definition. (That's
-why it's true to say that until CoreTidy all Ids in this compilation unit are
-LocalIds.) But at CorePrep, a binding is added for each implicit Id defined in
-this module, so that the code generator will generate code for the (curried) function.
-<p>
-Implicit Ids carry their unfolding inside them, of course, so they may well have
-been inlined much earlier; but we generate the curried top-level defn just in
-case its ever needed.
-
-<h3>LocalIds</h3>
-
-The <code>LocalIdDetails</code> gives more info about a <code>LocalId</code>:
-<pre>
-data LocalIdDetails
- = NotExported -- Not exported
- | Exported -- Exported
- | SpecPragma -- Not exported, but not to be discarded either
- -- It's unclean that this is so deeply built in
-</pre>
-From this we can tell whether the <code>LocalId</code> is exported, and that
-tells us whether we can drop an unused binding as dead code.
-<p>
-The <code>SpecPragma</code> thing is a HACK. Suppose you write a SPECIALIZE pragma:
-<pre>
- foo :: Num a => a -> a
- {-# SPECIALIZE foo :: Int -> Int #-}
- foo = ...
-</pre>
-The type checker generates a dummy call to <code>foo</code> at the right types:
-<pre>
- $dummy = foo Int dNumInt
-</pre>
-The Id <code>$dummy</code> is marked <code>SpecPragma</code>. Its role is to hang
-onto that call to <code>foo</code> so that the specialiser can see it, but there
-are no calls to <code>$dummy</code>.
-The simplifier is careful not to discard <code>SpecPragma</code> Ids, so that it
-reaches the specialiser. The specialiser processes the right hand side of a <code>SpecPragma</code> Id
-to find calls to overloaded functions, <em>and then discards the <code>SpecPragma</code> Id</em>.
-So <code>SpecPragma</code> behaves a like <code>Exported</code>, at least until the specialiser.
-
-
-<h3> ExternalNames and InternalNames </h3>
-
-Notice that whether an Id is a <code>LocalId</code> or <code>GlobalId</code> is
-not the same as whether the Id has an <code>ExternaName</code> or an <code>InternalName</code>
-(see "<a href="names.html#sort">The truth about Names</a>"):
-<ul>
-<li> Every <code>GlobalId</code> has an <code>ExternalName</code>.
-<li> A <code>LocalId</code> might have either kind of <code>Name</code>.
-</ul>
-
-<!-- hhmts start -->
-Last modified: Fri Sep 12 15:17:18 BST 2003
-<!-- hhmts end -->
- </small>
- </body>
-</html>
-