summaryrefslogtreecommitdiff
path: root/ghc/docs/comm/the-beast/driver.html
blob: fbf65e33e79ed3721dcc1650e42f23bcd6d2c974 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
<html>
  <head>
    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
    <title>The GHC Commentary - The Glorious Driver</title>
  </head>

  <body BGCOLOR="FFFFFF">
    <h1>The GHC Commentary - The Glorious Driver</h1>
    <p>
      The Glorious Driver (GD) is the part of GHC that orchestrates the
      interaction of all the other pieces that make up GHC.  It supersedes the
      <em>Evil Driver (ED),</em> which was a Perl script that served the same
      purpose and was in use until version 4.08.1 of GHC.  Simon Marlow
      eventually slayed the ED and instated the GD.  The GD is usually called
      the <em>Compilation Manager</em> these days.
    </p>
    <p>
      The GD has been substantially extended for GHCi, i.e., the interactive
      variant of GHC that integrates the compiler with a (meta-circular)
      interpreter since version 5.00.  Most of the driver is located in the
      directory 
      <a
      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/main/"><code>fptools/ghc/compiler/main/</code></a>.
    </p>

    <h2>Command Line Options</h2>
    <p>
      GHC's many flavours of command line options make the code interpreting
      them rather involved.  The following provides a brief overview of the
      processing of these options.  Since the addition of the interactive
      front-end to GHC, there are two kinds of options: <em>static
      options</em> and <em>dynamic options.</em> The former can only be set
      when the system is invoked, whereas the latter can be altered in the
      course of an interactive session.  A brief explanation on the difference
      between these options and related matters is at the start of the module
      <a
      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/main/CmdLineOpts.lhs"><code>CmdLineOpts</code></a>.
      The same module defines the enumeration <code>DynFlag</code>, which
      contains all dynamic flags.  Moreover, there is the labelled record
      <code>DynFlags</code> that collects all the flag-related information
      that is passed by the compilation manager to the compiler proper,
      <code>hsc</code>, whenever a compilation is triggered.  If you like to
      find out whether an option is static, use the predicate
      <code>isStaticHscFlag</code> in the same module.
    <p>
      The second module that contains a lot of code related to the management
      of flags is <a
      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/main/DriverFlags.hs"><code>DriverFlags.hs</code></a>.
      In particular, the module contains two association lists that map the
      textual representation of the various flags to a data structure that
      tells the driver how to parse the flag (e.g., whether it has any
      arguments) and provides its internal representation.  All static flags
      are contained in <code>static_flags</code>.  A whole range of
      <code>-f</code> flags can be negated by adding a <code>-f-no-</code>
      prefix.  These flags are contained in the association list
      <code>fFlags</code>.
    <p>
      The driver uses a nasty hack based on <code>IORef</code>s that permits
      the rest of the compiler to access static flags as CAFs; i.e., there is
      a family of toplevel variable definitions in 
      <a
      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/main/CmdLineOpts.lhs"><code>CmdLineOpts</code></a>,
      below the literate section heading <i>Static options</i>, each of which
      contains the value of one static option.  This is essentially realised
      via global variables (in the sense of C-style, updatable, global
      variables) defined via an evil pre-processor macro named
      <code>GLOBAL_VAR</code>, which is defined in a particularly ugly corner
      of GHC, namely the C header file 
      <a
      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/HsVersions.h"><code>HsVersions.h</code></a>. 

    <h2>What Happens When</h2>
    <p>
      Inside the Haskell compiler proper (<code>hsc</code>), a whole series of
      stages (``passes'') are executed in order to transform your Haskell program
      into C or native code.  This process is orchestrated by
      <code>main/HscMain.hscMain</code> and its relative
      <code>hscReComp</code>.  The latter directly invokes, in order,
      the parser, the renamer, the typechecker, the desugarer, the
      simplifier (Core2Core), the CoreTidy pass, the CorePrep pass,
      conversion to STG (CoreToStg), the interface generator
      (MkFinalIface), the code generator, and code output.  The
      simplifier is the most complex of these, and is made up of many
      sub-passes.  These are controlled by <code>buildCoreToDo</code>,
      as described below.

    <h2>Scheduling Optimisations Phases</h2>
    <p>
      GHC has a large variety of optimisations at its disposal, many of which
      have subtle interdependencies.  The overall plan for program
      optimisation is fixed in <a
      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/main/DriverState.hs"><code>DriverState.hs</code></a>.
      First of all, there is the variable <code>hsc_minusNoO_flags</code> that
      determines the <code>-f</code> options that you get without
      <code>-O</code> (aka optimisation level 0) as well as
      <code>hsc_minusO_flags</code> and <code>hsc_minusO2_flags</code> for
      <code>-O</code> and <code>-O2</code>.
    <p>
      However, most of the strategic decisions about optimisations on the
      intermediate language Core are encoded in the value produced by
      <code>buildCoreToDo</code>, which is a list with elements of type
      <code>CoreToDo</code>.  Each element of this list specifies one step in
      the sequence of core optimisations executed by the <a
      href="simplifier.html">Mighty Simplifier</a>.  The type
      <code>CoreToDo</code> is defined in <a
      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/main/CmdLineOpts.lhs"><code>CmdLineOpts.lhs</code></a>.
      The actual execution of the optimisation plan produced by
      <code>buildCoreToDo</code> is performed by <a
      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/simplCore/SimplCore.lhs"><code>SimpleCore</code></a><code>.doCorePasses</code>.
      Core optimisation plans consist of a number of simplification phases
      (currently, three for optimisation levels of 1 or higher) with
      decreasing phase numbers (the lowest, corresponding to the last phase,
      namely 0).  Before and after these phases, optimisations such as
      specialisation, let floating, worker/wrapper, and so on are executed.
      The sequence of phases is such that the synergistic effect of the phases
      is maximised -- however, this is a fairly fragile arrangement.
    <p>
      There is a similar construction for optimisations on STG level stored in
      the variable <code>buildStgToDo :: [StgToDo]</code>.  However, this is a
      lot less complex than the arrangement for Core optimisations.

    <h2>Linking the <code>RTS</code> and <code>libHSstd</code></h2>
    <p>
      Since the RTS and HSstd refer to each other, there is a Cunning
      Hack to avoid putting them each on the command-line twice or
      thrice (aside: try asking for `plaice and chips thrice' in a
      fish and chip shop; bet you only get two lots).  The hack involves 
      adding
      the symbols that the RTS needs from libHSstd, such as
      <code>PrelWeak_runFinalizzerBatch_closure</code> and
      <code>__stginit_Prelude</code>, to the link line with the
      <code>-u</code> flag.  The standard library appears before the
      RTS on the link line, and these options cause the corresponding
      symbols to be picked up even so the linked might not have seen them
      being used as the RTS appears later on the link line.  As a result,
      when the RTS is also scanned, these symbols are already resolved. This
      avoids the linker having to read the standard library and RTS
      multiple times.
    </p>
    <p>
      This does, however, leads to a complication.  Normal Haskell
      programs do not have a <code>main()</code> function, so this is
      supplied by the RTS (in the file 
      <a href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/rts/Main.c"><code>Main.c</code></a>).
      It calls <code>startupHaskell</code>, which
      itself calls <code>__stginit_PrelMain</code>, which is therefore,
      since it occurs in the standard library, one of the symbols
      passed to the linker using the <code>-u</code> option.  This is fine
      for standalone Haskell programs, but as soon as the Haskell code is only
      used as part of a program implemented in a foreign language, the
      <code>main()</code> function of that foreign language should be used
      instead of that of the Haskell runtime.  In this case, the previously
      described arrangement unfortunately fails as 
      <code>__stginit_PrelMain</code> had better not be linked in,
      because it tries to call <code>__stginit_Main</code>, which won't
      exist.  In other words, the RTS's <code>main()</code> refers to 
      <code>__stginit_PrelMain</code> which in turn refers to
      <code>__stginit_Main</code>.  Although the RTS's <code>main()</code> 
      might not be linked in if the program provides its own, the driver 
      will normally force <code>__stginit_PrelMain</code> to be linked in anyway,
      using <code>-u</code>, because it's a back-reference from the
      RTS to HSstd.  This case is coped with by the <code>-no-hs-main</code>
      flag, which suppresses passing the corresonding <code>-u</code> option
      to the linker -- although in some versions of the compiler (e.g., 5.00.2)
      it didn't work.  In addition, the driver generally places the C program 
      providing the <code>main()</code> that we want to use before the RTS
      on the link line.  Therefore, the RTS's main is never used and
      without the <code>-u</code> the label <code>__stginit_PrelMain</code> 
      will not be linked.
    </p>
    
    <p><small>
<!-- hhmts start -->
Last modified: Tue Feb 19 11:09:00 UTC 2002
<!-- hhmts end -->
    </small>
  </body>
</html>