summaryrefslogtreecommitdiff
path: root/ghc/docs/comm/the-beast/names.html
blob: 061fae3ebfefe2723ba101993bdfafd7ace1efc9 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
<html>
  <head>
    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
    <title>The GHC Commentary - The truth about names: OccNames, and Names</title>
  </head>

  <body BGCOLOR="FFFFFF">
    <h1>The GHC Commentary - The truth about names: OccNames, and Names</h1>
    <p>
      Every entity (type constructor, class, identifier, type variable) has a
      <code>Name</code>.  The <code>Name</code> type is pervasive in GHC, and
      is defined in <code>basicTypes/Name.lhs</code>.  Here is what a Name
      looks like, though it is private to the Name module.
    </p>
    <blockquote>
      <pre>
data Name = Name {
	      n_sort :: NameSort,	-- What sort of name it is
	      n_occ  :: !OccName,	-- Its occurrence name
	      n_uniq :: Unique,		-- Its identity
	      n_loc  :: !SrcLoc		-- Definition site
	  }</pre>
    </blockquote>
    <ul>
      <li> The <code>n_sort</code> field says what sort of name this is: see
	<a href="#sort">NameSort below</a>.
      <li> The <code>n_occ</code> field gives the "occurrence name" of the
	Name; see 
	<a href="#occname">OccName below</a>.
      <li> The <code>n_uniq</code> field allows fast tests for equality of
	Names. 
      <li> The <code>n_loc</code> field gives some indication of where the
	name was bound. 
    </ul>
    
    <h2><a name="sort">The <code>NameSort</code> of a <code>Name</code></a></h2>
    <p>
      There are four flavours of <code>Name</code>:
    </p>
    <blockquote>
      <pre>
data NameSort
  = External Module (Maybe Name)
	-- (Just parent) => this Name is a subordinate name of 'parent'
	-- e.g. data constructor of a data type, method of a class
	-- Nothing => not a subordinate
 
  | WiredIn Module (Maybe Name) TyThing BuiltInSyntax
	-- A variant of External, for wired-in things

  | Internal		-- A user-defined Id or TyVar
			-- defined in the module being compiled

  | System		-- A system-defined Id or TyVar.  Typically the
			-- OccName is very uninformative (like 's')</pre>
    </blockquote>
    <ul>
      <li>Here are the sorts of Name an entity can have:
	<ul>
	  <li> Class, TyCon: External.
	  <li> Id: External, Internal, or System.
	  <li> TyVar: Internal, or System.
	</ul>
      </li>
      <li>An <code>External</code> name has a globally-unique
	(module name, occurrence name) pair, namely the 
	<em>original name</em> of the entity,
	describing where the thing was originally defined.  So for example,
	if we have
	<blockquote>
	  <pre>
module M where
  f = e1
  g = e2

module A where
  import qualified M as Q
  import M
  a = Q.f + g</pre>
	</blockquote>
	<p>
	  then the RdrNames for "a", "Q.f" and "g" get replaced (by the
	  Renamer)  by the Names "A.a", "M.f", and "M.g" respectively.
	</p>
      </li>
      <li>An <code>InternalName</code> 
	has only an occurrence name.  Distinct InternalNames may have the same
	occurrence name; use the Unique to distinguish them.
      </li>
      <li>An <code>ExternalName</code> has a unique that never changes.  It
	is never cloned.  This is important, because the simplifier invents
	new names pretty freely, but we don't want to lose the connnection
	with the type environment (constructed earlier).  An
	<code>InternalName</code> name can be cloned freely.
      </li>
      <li><strong>Before CoreTidy</strong>: the Ids that were defined at top
	level in the original source program get <code>ExternalNames</code>,
	whereas extra top-level bindings generated (say) by the type checker
	get <code>InternalNames</code>. q This distinction is occasionally
	useful for filtering diagnostic output; e.g.  for -ddump-types.
      </li>
      <li><strong>After CoreTidy</strong>: An Id with an
	<code>ExternalName</code> will generate symbols that 
	appear as external symbols in the object file.  An Id with an
	<code>InternalName</code> cannot be referenced from outside the
	module, and so generates a local symbol in the object file.  The
	CoreTidy pass makes the decision about which names should be External
	and which Internal.
      </li>
      <li>A <code>System</code> name is for the most part the same as an
	<code>Internal</code>.  Indeed, the differences are purely cosmetic:
	<ul>
	  <li>Internal names usually come from some name the
	    user wrote, whereas a System name has an OccName like "a", or "t".
	    Usually there are masses of System names with the same OccName but
	    different uniques, whereas typically there are only a handful of
	    distince Internal names with the same OccName.
	  </li>
	  <li>Another difference is that when unifying the type checker tries
	    to  unify away type variables with System names, leaving ones with
	    Internal names (to improve error messages).
	  </li>
	</ul>
      </li>
    </ul>

    <h2><a name="occname">Occurrence names: <code>OccName</code></a></h2>
    <p>
      An <code>OccName</code> is more-or-less just a string, like "foo" or
      "Tree", giving the (unqualified) name of an entity.
    </p>
    <p>
      Well, not quite just a string, because in Haskell a name like "C" could
      mean a type constructor or data constructor, depending on context.  So
      GHC defines a type <tt>OccName</tt> (defined in
      <tt>basicTypes/OccName.lhs</tt>) that is a pair of a <tt>FastString</tt>
      and a <tt>NameSpace</tt> indicating which name space the name is drawn
      from:
      <blockquote>
      <pre>
data OccName = OccName NameSpace EncodedFS</pre>
    </blockquote>
    <p>
      The <tt>EncodedFS</tt> is a synonym for <tt>FastString</tt> indicating
      that the string is Z-encoded.  (Details in <tt>OccName.lhs</tt>.)
      Z-encoding encodes funny characters like '%' and '$' into alphabetic
      characters, like "zp" and "zd", so that they can be used in object-file
      symbol tables without confusing linkers and suchlike.
    </p>
    <p>
      The name spaces are:
    </p>
    <ul>
      <li> <tt>VarName</tt>: ordinary variables</li>
      <li> <tt>TvName</tt>: type variables</li>
      <li> <tt>DataName</tt>: data constructors</li>
      <li> <tt>TcClsName</tt>: type constructors and classes (in Haskell they
	share a name space) </li>
    </ul>

    <small>
<!-- hhmts start -->
Last modified: Wed May  4 14:57:55 EST 2005
<!-- hhmts end -->
    </small>
  </body>
</html>