Diffstat (limited to 'docs')
60 files changed, 3985 insertions, 10099 deletions
diff --git a/docs/backpack/.gitignore b/docs/backpack/.gitignore
new file mode 100644
index 0000000000..c3eb46ecd6
--- /dev/null
+++ b/docs/backpack/.gitignore
@@ -0,0 +1,10 @@
+*.aux
+*.bak
+*.bbl
+*.blg
+*.dvi
+*.fdb_latexmk
+*.fls
+*.log
+*.synctex.gz
+backpack-impl.pdf
diff --git a/docs/backpack/Makefile b/docs/backpack/Makefile
new file mode 100644
index 0000000000..0dd7a9dad5
--- /dev/null
+++ b/docs/backpack/Makefile
@@ -0,0 +1,2 @@
+backpack-impl.pdf: backpack-impl.tex
+	latexmk -pdf -latexoption=-halt-on-error -latexoption=-file-line-error -latexoption=-synctex=1 backpack-impl.tex && touch paper.dvi || ! rm -f $@
diff --git a/docs/backpack/arch.png b/docs/backpack/arch.png
new file mode 100644
index 0000000000..d8b8fd21f9
Binary files /dev/null and b/docs/backpack/arch.png differ
diff --git a/docs/backpack/backpack-impl.bib b/docs/backpack/backpack-impl.bib
new file mode 100644
index 0000000000..6bda35a8ea
--- /dev/null
+++ b/docs/backpack/backpack-impl.bib
@@ -0,0 +1,17 @@
+@inproceedings{Kilpatrick:2014:BRH:2535838.2535884,
+  author    = {Kilpatrick, Scott and Dreyer, Derek and Peyton Jones, Simon and Marlow, Simon},
+  title     = {Backpack: Retrofitting Haskell with Interfaces},
+  booktitle = {Proceedings of the 41st ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages},
+  series    = {POPL '14},
+  year      = {2014},
+  isbn      = {978-1-4503-2544-8},
+  location  = {San Diego, California, USA},
+  pages     = {19--31},
+  numpages  = {13},
+  url       = {http://doi.acm.org/10.1145/2535838.2535884},
+  doi       = {10.1145/2535838.2535884},
+  acmid     = {2535884},
+  publisher = {ACM},
+  address   = {New York, NY, USA},
+  keywords  = {applicative instantiation, haskell modules, mixin modules, module systems, packages, recursive modules, separate modular development, type systems},
+}
diff --git a/docs/backpack/backpack-impl.tex b/docs/backpack/backpack-impl.tex
new file mode 100644
index 0000000000..b0b43ba87a
--- /dev/null
+++ b/docs/backpack/backpack-impl.tex
@@ -0,0 +1,1861 @@
\documentclass{article}

\usepackage{graphicx} %[pdftex] OR [dvips]
\usepackage{fullpage}
\usepackage{wrapfig}
\usepackage{float}
\usepackage{titling}
\usepackage{hyperref}
\usepackage{tikz}
\usepackage{color}
\usetikzlibrary{arrows}
\usetikzlibrary{positioning}
\setlength{\droptitle}{-6em}

\input{commands-new-new.tex}

\newcommand{\nuAA}{\nu_\mathit{AA}}
\newcommand{\nuAB}{\nu_\mathit{AB}}
\newcommand{\nuGA}{\nu_\mathit{GA}}
\newcommand{\nuGB}{\nu_\mathit{GB}}
\newcommand{\betaPL}{\beta_\mathit{PL}}
\newcommand{\betaAA}{\beta_\mathit{AA}}
\newcommand{\betaAS}{\beta_\mathit{AS}}
\newcommand{\thinandalso}{\hspace{.45cm}}
\newcommand{\thinnerandalso}{\hspace{.38cm}}

\input{commands-rebindings.tex}

\newcommand{\ghcfile}[1]{\textsl{#1}}

\title{Implementing Backpack}

\begin{document}

\maketitle

The purpose of this document is to describe an implementation path for
Backpack in GHC\@.

We start off by outlining the current architecture of GHC, ghc-pkg and
Cabal, which constitute the existing packaging system.  We then state
our subgoals, since there are many similar-sounding but distinct
problems to solve.  Next, we describe the ``probably correct''
implementation plan, and finish off with some open design questions.
This is intended to be an evolving design document, so please
contribute!

\tableofcontents

\section{Current packaging architecture}

The overall architecture is described in Figure~\ref{fig:arch}.

\begin{figure}[H]
  \center{\scalebox{0.8}{\includegraphics{arch.png}}}
  \caption{Architecture of GHC, ghc-pkg and Cabal.  Green bits indicate
  additions from upcoming IHG work, red bits indicate additions from
  Backpack.  Orange indicates a Haskell library.}\label{fig:arch}
\end{figure}

Here, arrows indicate dependencies from one component to another.
Color coding is as follows: orange components are libraries, green
components are to be added with the IHG work, and red components are to
be added with Backpack.  (Thus, black and orange together can be
considered the current system.)

\subsection{Installed package database}

Starting from the bottom, we have the \emph{installed package database}
(actually a collection of such databases), which stores information
about what packages have been installed and are thus available to be
compiled against.  There is both a global database (for the system
administrator) and a local database (for end users), which can be
updated independently.  One way to think about the package database is
as a \emph{cache of object code}.  In principle, one could compile any
piece of code by repeatedly recompiling all of its dependencies; the
installed package database describes when this can be bypassed.

\begin{figure}[H]
  \center{\scalebox{0.8}{\includegraphics{pkgdb.png}}}
  \caption{Anatomy of a package database.}\label{fig:pkgdb}
\end{figure}

In Figure~\ref{fig:pkgdb}, we show the structure of a package database.
The installed packages are created from a Cabal file through the
process of dependency resolution and compilation.  In database terms,
the primary key of a package database is the InstalledPackageId
(Figure~\ref{fig:current-pkgid}).  This ID uniquely identifies an
instance of an installed package.  The PackageId omits the ABI hash and
is used to qualify linker-exported symbols: the current value of this
parameter is communicated to GHC using the \verb|-package-id| flag.

In principle, packages with different PackageIds should be linkable
together in the same compiled program, whereas packages with the same
PackageId are not (even if they have different InstalledPackageIds).  In
practice, GHC is currently only able to select one version of a package,
as it clears out all old versions of the package in
\ghcfile{compiler/main/Packages.lhs}:applyPackageFlag.
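To make the relationship between these two identifiers concrete, here
is a small Haskell sketch.  This is an illustrative model only, not
GHC's or Cabal's actual definitions; the field names are invented for
this example.

```haskell
-- Simplified model of the two identifier layers described above;
-- illustrative only, not the real Cabal/GHC types.
data PackageId = PackageId
  { pkgName    :: String  -- e.g. "containers"
  , pkgVersion :: [Int]   -- e.g. [0,9]
  } deriving (Eq, Show)

-- An InstalledPackageId additionally records the ABI hash, so two
-- rebuilds of the same PackageId get distinct database entries.
data InstalledPackageId = InstalledPackageId
  { ipidPkg     :: PackageId
  , ipidAbiHash :: String
  } deriving (Eq, Show)

main :: IO ()
main = do
  let containers = PackageId "containers" [0,9]
      build1 = InstalledPackageId containers "ABCD"
      build2 = InstalledPackageId containers "EF01"
  -- Same PackageId (so not linkable together in one program)...
  print (ipidPkg build1 == ipidPkg build2)
  -- ...but distinct InstalledPackageIds (so both may be installed).
  print (build1 == build2)
```

The point of the split is exactly what the two prints show: the
database may hold many instances of one PackageId, distinguished only
by ABI hash.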
\begin{figure}
  \center{\begin{tabular}{r l}
    PackageId & package name, package version \\
    InstalledPackageId & PackageId, ABI hash \\
  \end{tabular}}
  \caption{Current structure of package identifiers.}\label{fig:current-pkgid}
\end{figure}

The database entry itself contains the information from the installed
package ID, as well as information such as what dependencies it was
linked against, where its compiled code and interface files live, its
compilation flags, what modules it exposes, etc.  Much of this
information is only relevant to Cabal; GHC uses a subset of the
information in the package database.

\subsection{GHC}

The two programs which access the package database directly are GHC
proper (for compilation) and ghc-pkg (a general-purpose command line
tool for manipulating the database).  GHC relies on the package
database in the following ways:

\begin{itemize}
  \item It imports the local and global package databases into its
    runtime database, and applies modifications to the exposed and
    trusted status of the entries via \verb|-package| and related flags
    (\ghcfile{compiler/main/Packages.lhs}).  The internal package state
    can be seen at \verb|-v4| or higher.
  \item It uses this package database to find the location of module
    interfaces when it attempts to load the module info of an external
    module (\ghcfile{compiler/iface/LoadIface.hs}).
\end{itemize}

GHC itself performs a type checking phase, which generates an interface
file representing the module (so that later invocations of GHC can load
the type of a module), and then after compilation produces object files
and linked archives for programs to use.

\paragraph{Original names} Original names are an important design
pattern in GHC\@.  Sometimes, a name can be exposed in an hi file even
if its module wasn't exposed.
Here is an example (compiled in package R):

\begin{verbatim}
module X where
  import Internal (f)
  g = f

module Internal where
  import Internal.Total (f)
\end{verbatim}

Then in X.hi:

\begin{verbatim}
g = <R.id, Internal.Total, f> (this is the original name)
\end{verbatim}

(The reason we refer to the package as R.id is that it's the full
package ID, and not just R.)

\subsection{hs-boot}

\verb|hs-boot| is a special mechanism used to support recursive linking
of modules within a package, today.  Suppose I have a recursive module
dependency between modules A and B.  I break one of\ldots

(ToDo: describe how hs-boot mechanism works)

\subsection{Cabal}

Cabal is the build system for GHC; we can think of it as parsing a
Cabal file describing a package, and then making (possibly multiple)
invocations to GHC to perform the appropriate compilation.  What
information does Cabal pass on to GHC\@?  One can get an idea for this
by looking at a prototypical command line that Cabal invokes GHC with:

\begin{verbatim}
ghc --make
    -package-name myapp-0.1
    -hide-all-packages
    -package-id containers-0.9-ABCD
    Module1 Module2
\end{verbatim}

There are a few things going on here.  First, Cabal has to tell GHC the
name of the package it's compiling (otherwise, GHC can't appropriately
generate the symbols that other code referring to this package will
expect).  There are also a number of flags which configure GHC's
in-memory view of the package database (GHC's view of the package
database may not directly correspond to what is on disk).  There's also
an optimization here: in principle, GHC can compile each module
one-by-one, but instead we use the \verb|--make| flag because this
allows GHC to reuse some data structures, resulting in a nontrivial
speedup.

(ToDo: describe cabal-install/sandbox)

\section{Goals}

Here are some of the high-level goals which motivate our improvements
to the module system.
\begin{itemize}
  \item Solve \emph{Cabal hell}, which occurs when conflicting version
    ranges across many dependencies make it impossible to satisfy the
    constraints.  We're seeking to solve this problem in two ways:
    first, we want to support multiple instances of containers-2.9 in
    the database which are compiled with different dependencies (and
    even link them together), and second, we want to abolish (often
    inaccurate) version ranges and move to a regime where packages
    depend on signatures.  Version ranges may still be used to indicate
    important semantic changes (e.g., bugs or bad behavior on the part
    of package authors), but they should no longer drive dependency
    resolution and often only be recorded after the fact.

  \item Support \emph{hermetic builds with sharing}.  A hermetic build
    system is one which simulates rebuilding every package whenever it
    is built; on the other hand, actually rebuilding every time is
    extremely inefficient (though it is what happens in practice with
    Cabal sandboxes).  We seek to solve this problem with the IHG work,
    by allowing multiple instances of a package in the database, where
    the only difference is compilation parameters.  We don't care about
    being able to link these together in a single program.

  \item Support \emph{module-level pluggability} as an alternative to
    existing (poor) usage of type classes.  The canonical example is
    strings, where a library might want to either use the convenient
    but inefficient native strings, or the efficient packed Text data
    type, but would really like to avoid having to say
    \verb|StringLike s => ...| in all of its type signatures.  While we
    do not plan on supporting separate compilation, Cabal should
    understand how to automatically recompile these ``indefinite''
    packages when they are instantiated with a new plugin.
  \item Support \emph{separate modular development}, where a library
    and an application using the library can be developed and
    type-checked separately, intermediated by an interface.  The
    canonical example here is the \verb|ghc-api|, which is a large,
    complex API that the library writers (GHC developers) frequently
    change---the ability for downstream projects to say, ``Here is the
    API I'm relying on'' without requiring these projects to actually
    be built would greatly assist in keeping the API stable.  This is
    applicable in the pluggability example as well, where we want to
    ensure that all of the $M \times N$ configurations of libraries
    versus applications type check, by only running the typechecker
    $M + N$ times.  A closely related concern is toolchain support for
    extracting a signature from an existing implementation, as current
    Haskell culture is averse to explicitly writing separate signature
    files.

  \item Subsume existing support for \emph{mutually recursive modules},
    without the double vision problem.
\end{itemize}

A \emph{non-goal} is to allow users to upgrade upstream libraries
without recompiling downstream.  This is an ABI concern, and we're not
going to worry about it.

\section{Module identities}

We are going to implement module identities slightly differently from
the way they were described in the Backpack paper.  Motivated by
implementation considerations, we coarsen the granularity of dependency
tracking, so that it's not necessary to calculate the transitive
dependencies of every module: we only do it per package.  In the next
section, we recapitulate Section 3.1 of the original Backpack paper,
but with our new granularity.  Comparisons to original Backpack will be
recorded in footnotes.  Then we more generally discuss the differing
points of the design space these two occupy, and how this affects what
programs typecheck and how things are actually implemented.
\subsection{The new scheme}

\begin{wrapfigure}{R}{0.5\textwidth}
\begin{myfig}
\[
\begin{array}{@{}lr@{\;}c@{\;}l@{}}
  \text{Package Names (\texttt{PkgName})} & P &\in& \mathit{PkgNames} \\
  \text{Module Path Names (\texttt{ModName})} & p &\in& \mathit{ModPaths} \\
  \text{Module Identity Vars} & \alpha,\beta &\in& \mathit{IdentVars} \\
  \text{Package Key (\texttt{PackageId})} & \K &::=& P(\vec{p\mapsto\nu}) \\
  \text{Module Identities (\texttt{Module})} & \nu &::=&
    \alpha ~|~
    \mu\alpha.\K\colon\! p \\
  \text{Module Identity Substs} & \phi,\theta &::=&
    \{\vec{\alpha \coloneqq \nu}\} \\
\end{array}
\]
\caption{Module Identities}
\label{fig:mod-idents}
\end{myfig}
\end{wrapfigure}

Physical module identities $\nu$, referred to in GHC as \emph{original
names}, are either (1) \emph{variables} $\alpha$, which are used to
represent holes; (2) a concrete module $p$ defined in package $P$, with
holes instantiated with other module identities (might be
empty)\footnote{In Paper Backpack, we would refer to just $P$:$p$ as
the identity constructor.  However, we've written the subterms
specifically next to $P$ to highlight the semantic difference of these
terms.}; or (3) \emph{recursive} module identities, defined via
$\mu$-constructors.\footnote{Actually, all concrete modules implicitly
define a $\mu$-constructor, and we plan on using de Bruijn indices
instead of variables in this case, a locally nameless representation.}

As in traditional Haskell, every package contains a number of module
files at some module path $p$; within a package these paths are
guaranteed to be unique.\footnote{In Paper Backpack, the module
expressions themselves are used to refer to globally unique identifiers
for each literal.
This makes the metatheory simpler, but for implementation purposes it
is convenient to conflate the \emph{original} module path that a module
is defined at with its physical identity.}  When we write inline module
definitions, we assume that they are immediately assigned to a module
path $p$ which is incorporated into their identity.  A module identity
$\nu$ simply augments this with subterms $\vec{p\mapsto\nu}$
representing how \emph{all} holes in the package $P$ were
instantiated.\footnote{In Paper Backpack, we do not distinguish between
holes/non-holes, and we consider all imports of the \emph{module}, not
the package.}  This naming is stable because the current Backpack
surface syntax does not allow a logical path in a package to be
undefined.  A package key is $P(\vec{p\mapsto\nu})$; it is the entity
that today is internally referred to in GHC as \texttt{PackageId}.

Here is the very first example from Section 2 of the original Backpack
paper, \pname{ab-1}:

\begin{example}
\Pdef{ab-1}{
  \Pmod{A}{x = True}
  \Pmod{B}{\Mimp{A}; y = not x}
  % \Pmodd{C}{\mname{A}}
}
\end{example}

The identities of \m{A} and \m{B} are \pname{ab-1}:\mname{A} and
\pname{ab-1}:\mname{B}, respectively.\footnote{In Paper Backpack, the
identity for \mname{B} records its import of \mname{A}, but since it is
definite, this is strictly redundant.}  In a package with holes, each
hole gets a fresh variable (within the package definition) as its
identity, and all of the holes associated with package $P$ are
recorded.
Consider \pname{abcd-holes-1}:

\begin{example}
\Pdef{abcd-holes-1}{
  \Psig{A}{x :: Bool} % chktex 26
  \Psig{B}{y :: Bool} % chktex 26
  \Pmod{C}{x = False}
  \Pmodbig{D}{
    \Mimpq{A}\\
    \Mimpq{C}\\
    % \Mexp{\m{A}.x, z}\\
    z = \m{A}.x \&\& \m{C}.x
  }
}
\end{example}

The identities of the four modules are, in order, $\alpha_a$,
$\alpha_b$, $\pname{abcd-holes-1}(\alpha_a,\alpha_b)$:\mname{C}, and
$\pname{abcd-holes-1}(\alpha_a,\alpha_b)$:\mname{D}.\footnote{In Paper
Backpack, the granularity is at the module level, so the subterms of
\mname{C} and \mname{D} can differ.}

Consider now the module identities in the \m{Graph} instantiations in
\pname{multinst}, shown in Figure 2 of the original Backpack paper (we
have omitted it for brevity).  In the definition of \pname{structures},
assume that the variables for \m{Prelude} and \m{Array} are $\alpha_P$
and $\alpha_A$ respectively.  The identity of \m{Graph} is
$\pname{structures}(\alpha_P, \alpha_A)$:\m{Graph}.  Similarly, the
identities of the two array implementations are
$\nu_{AA} = \pname{arrays-a}(\alpha_P)$:\m{Array} and
$\nu_{AB} = \pname{arrays-b}(\alpha_P)$:\m{Array}.\footnote{Notice that
the subterms coincide with Paper Backpack!  A sign that module-level
granularity is not necessary for many use-cases.}

The package \pname{graph-a} is more interesting because it \emph{links}
the packages \pname{arrays-a} and \pname{structures} together, with the
implementation of \m{Array} from \pname{arrays-a} \emph{instantiating}
the hole \m{Array} from \pname{structures}.  This linking is reflected
in the identity of the \m{Graph} module in \pname{graph-a}: whereas in
\pname{structures} it was $\nu_G = \pname{structures}(\alpha_P,
\alpha_A)$:\m{Graph}, in \pname{graph-a} it is $\nu_{GA} =
\nu_G[\nu_{AA}/\alpha_A] = \pname{structures}(\alpha_P,
\nu_{AA})$:\m{Graph}.
Similarly, the identity of \m{Graph} in \pname{graph-b} is $\nu_{GB} =
\nu_G[\nu_{AB}/\alpha_A] = \pname{structures}(\alpha_P,
\nu_{AB})$:\m{Graph}.  Thus, linking consists of substituting the
variable identity of a hole by the concrete identity of the module
filling that hole.

Lastly, \pname{multinst} makes use of both of these \m{Graph} modules,
under the aliases \m{GA} and \m{GB}, respectively.  Consequently, in
the \m{Client} module, \code{\m{GA}.G} and \code{\m{GB}.G} will be
correctly viewed as distinct types, since they originate in modules
with distinct identities.

As \pname{multinst} illustrates, module identities effectively encode
dependency graphs at the package level.\footnote{In Paper Backpack,
module identities encode dependency graphs at the module level.  In
both cases, however, what is being depended on is always a module.}
Like in Paper Backpack, we have an \emph{applicative} semantics of
instantiation, and the applicativity example in Figure 3 of the
Backpack paper still type checks.  However, because we are operating at
a coarser granularity, modules may have spurious dependencies on holes
that they don't actually depend on, which means fewer type equalities
hold.

Shaping proceeds in the same way as in Paper Backpack, except that the
shaping judgment must also accept the package key
$P(\vec{p\mapsto\alpha})$ so we can create identifiers with
\textsf{mkident}.  This implies we must know ahead of time what the
holes of a package are.

\subsection{Commentary}

\begin{wrapfigure}{r}{0.4\textwidth}
\begin{verbatim}
package p where
  A :: ...
  -- B does not import A
  B = [ data T = T; f T = T ]
  C = [ import A; ... ]
package q where
  A1 = [ ... ]
  A2 = [ ...
  ]
  include p (A as A1, B as B1)
  include p (A as A2, B as B2)
  Main = [
    import qualified B1
    import qualified B2
    y = B1.f B2.T
  ]
\end{verbatim}
\caption{The difference between package and module
granularity}\label{fig:granularity}
\end{wrapfigure}

\paragraph{The sliding scale of granularity} The scheme we have
described here is coarser-grained than Backpack's, and thus does not
accept as many programs.  Figure~\ref{fig:granularity} is a minimal
example which doesn't type check in our new scheme.  In Paper Backpack,
the physical module identities of \m{B1} and \m{B2} are both $\K_B$,
and so \m{Main} typechecks.  However, in GHC Backpack, we assign module
identities $\pname{p(q:A1)}$:$\m{B}$ and $\pname{p(q:A2)}$:$\m{B}$,
which are not equal.

Does this mean that Paper Backpack's form of granularity is
\emph{better}?  Not necessarily!  First, we can always split packages
into further subpackages which better reflect the internal hole
dependencies, so it is always possible to rewrite a program to make it
typecheck---just with more packages.  Second, Paper Backpack's
granularity is only one point on a sliding scale; it is by no means the
most extreme!  You could imagine a version of Backpack where we
desugared each \emph{expression} into a separate
module.\footnote{Indeed, there are some languages which take this
stance.  (See Bob Harper's work.)}  Then, even if \m{B} imported
\m{A}, as long as it didn't use any types from \m{A} in the definition
of \verb|T|, we would still consider the types equal.  Finally, to
understand what the physical module identity of a module is, in Paper
Backpack I must understand the internal dependency structure of the
modules in a package.  This is a lot of work for the developer to think
about; a coarser model is also easier to reason about.

Nevertheless, finer granularity can be desirable from an end-user
perspective.
Usually, these circumstances arise when library-writers are forced to
split their components into many separate packages, when they would
much rather have written a single package.  For example, if I define a
data type in my library, and would like to define a \verb|Lens|
instance for it, I would create a new package just for the instance, in
order to avoid saddling users who aren't interested in lenses with an
extra dependency.  Another example is test suites, which have
dependencies on various test frameworks that a user won't care about if
they are not planning on testing the code.  (Cabal has a special case
for this, allowing the user to write effectively multiple packages in a
single Cabal file.)

\paragraph{Cabal dependency resolution} Currently, when we compile a
Cabal package, Cabal goes ahead and resolves \verb|build-depends|
entries with actual implementations, which we compile against.  A
planned addition to the package key, independent of Backpack, is to
record the transitive dependency tree selected during this dependency
resolution process, so that we can install \pname{libfoo-1.0} twice,
compiled against different versions of its dependencies.  What is the
relationship between this transitive dependency tree of
\emph{packages} and the subterms of our package identities, which are
\emph{modules}?  Does one subsume the other?  In fact, these are
separate mechanisms---two levels of indirection, so to speak.

To illustrate, suppose I write a Cabal file with
\verb|build-depends: foobar|.  A reasonable assumption is that this
translates into a Backpack package which has \verb|include foobar|.
However, this is not actually a Paper Backpack package: Cabal's
dependency solver has to rewrite all of these package references into
versioned references \verb|include foobar-0.1|.
For example, this is a pre-package:

\begin{verbatim}
package foo where
  include bar
\end{verbatim}

and this is a Paper Backpack package:

\begin{verbatim}
package foo-0.3[bar-0.1[baz-0.2]] where
  include bar-0.1[baz-0.2]
\end{verbatim}

This tree is very similar to the one tracking dependencies for holes,
but we must record this tree \emph{even} when our package has no holes.
As a final example, the full module identity of \m{B1} in
Figure~\ref{fig:granularity} may actually be
$\pname{p-0.9(q-1.0[p-0.9]:A1)}$:\m{B}.

\subsection{Implementation}

In GHC's current packaging system, a single package compiles into a
single entry in the installed package database, indexed by the package
key.  This property is preserved by package-level granularity, as we
assign the same package key to all modules of a package.  Package keys
provide an easy mechanism for sharing to occur: when an indefinite
package is fully instantiated, we can check if we already have its
package key installed in the installed package database.  (At the end
of this section, we'll briefly discuss some of the problems with
actually implementing Paper Backpack.)  It is also important to note
that we are \emph{willing to duplicate code}; processes like this
already happen in other parts of the compiler (such as inlining).

However, there is one major challenge for this scheme, related to
\emph{dynamically linked libraries}.  Consider this example:

\begin{verbatim}
package p where
  A :: [ ... ]
  B = [ ... ]
package q where
  A = [ ... ]
  include p
  C = [ import A; import B; ... ]
\end{verbatim}

When we compile package \pname{q}, we end up compiling package keys
\pname{q} and $\pname{p(q:A)}$, which turn into their respective
libraries in the installed package database.
When we need to statically link against these libraries, it doesn't
matter that \pname{q} refers to code in $\pname{p(q:A)}$, and vice
versa: the linker is building an executable and can resolve all of the
symbols in one go.  However, when the libraries in question are the
dynamic libraries \verb|libHSq.so| and \verb|libHSp(q:A).so|, there is
now a \emph{circular dependency} between the two libraries, and most
dynamic linkers will not be able to load either of them.

Our plan is to break the circularity by inlining the entire module
\m{A} into $\pname{p(q:A)}$ when it is necessary (perhaps in other
situations, \m{A} will be in another package and no inlining is
necessary).  The code in both situations should be exactly the same, so
it should be completely permissible to consider them type-equal.

\paragraph{Relaxing package selection restrictions} As mentioned
previously, GHC is unable to select multiple packages with the same
package name (but different package keys).  This restriction needs to
be lifted.  We should add a new flag \verb|-package-key|.  GHC also
knows about version numbers and will mask out old versions of a library
when you make another version visible; this behavior needs to be
modified.

\paragraph{Linker symbols} As we increase the amount of information in
PackageId, it's important to be careful about the length of these IDs,
as they are used for exported linker symbols (e.g.\
\verb|base_TextziReadziLex_zdwvalDig_info|).  Very long symbol names
hurt compile and link time, object file sizes, GHCi startup time, and
dynamic linking, and make gdb hard to use.  As such, the current plan
is to do away with full package names and versions, and instead use
just a base-62 encoded hash, perhaps with the first four characters of
the package name for user-friendliness.

\paragraph{Wired-in names} One annoying thing to remember is that GHC
has wired-in names, which refer to packages without any version.
These are specially treated during compilation so that they are built
using a package key that has no version or dependency information.  One
approach is to continue treating these libraries specially;
alternately, we can maintain a fixed table from these wired-in names to
package IDs.

\section{Shapeless Backpack}\label{sec:simplifying-backpack}

Backpack as currently defined always requires a \emph{shaping} pass,
which calculates the shapes of all modules defined in a package.  The
shaping pass is critical to the solution of the double-vision problem
in recursive module linking, but it also presents a number of
unpalatable implementation problems:

\begin{itemize}

  \item \emph{Shaping is a lot of work.}  A module shape specifies the
    provenance of all data types and identifiers defined by a module.
    To calculate this, we must preprocess and parse all modules, even
    before we do the type-checking pass.  (Fortunately, shaping doesn't
    require a full parse of a module, only enough to get identifiers.
    However, it does have to understand import statements at the same
    level of detail as GHC's renamer.)

  \item \emph{Shaping must be done upfront.}  In the current Backpack
    design, all shapes must be computed before any typechecking can
    occur.  While performing the shaping pass upfront is necessary in
    order to solve the double vision problem (where a module identity
    may be influenced by later definitions), it means that GHC must
    first do a shaping pass, and then revisit every module and compile
    it proper.  Nor is it (easily) possible to skip the shaping pass
    when it is unnecessary, as one might expect to be the case in the
    absence of mutual recursion.  Shaping is not a ``pay as you go''
    language feature.

  \item \emph{GHC can't compile all programs shaping accepts.}  Shaping
    accepts programs that GHC, with its current hs-boot mechanism,
    cannot compile.
    In particular, GHC requires that any data type or function in a
    signature actually be \emph{defined} in the module corresponding to
    that file (i.e., an original name can be assigned to these entities
    immediately).  Shaping permits unrestricted exports to implement
    modules; this shows up in the formalism as $\beta$ module
    variables.

  \item \emph{Shaping encourages inefficient program organization.}
    Shaping is designed to enable mutually recursive modules, but as
    currently implemented, mutual recursion is less efficient than code
    without recursive dependencies.  Programmers should avoid this code
    organization, except when it is absolutely necessary.

  \item \emph{GHC is architecturally ill-suited for directly
    implementing shaping.}  Shaping implies that GHC's internal concept
    of an ``original name'' be extended to accommodate module
    variables.  This is an extremely invasive change to all aspects of
    GHC, since the original names assumption is baked quite deeply into
    the compiler.  Plausible implementations of shaping require all
    these variables to be skolemized outside of GHC\@.

\end{itemize}

To be clear, the shaping pass is fundamentally necessary for some
Backpack packages.  Here is the example which convinced Simon:

\begin{verbatim}
package p where
  A :: [data T; f :: T -> T]
  B = [export T(MkT), h; import A(f); data T = MkT; h x = f MkT]
  A = [export T(MkT), f, h; import B; f MkT = MkT]
\end{verbatim}

The key to this example is that B \emph{may or may not typecheck}
depending on the definition of A.  Because A reexports B's definition
of T, B will typecheck; but if A defined T on its own, B would not
typecheck.  Thus, we \emph{cannot} typecheck B until we have done some
analysis of A (the shaping analysis!).

Thus, it is beneficial (from an optimization point of view) to consider
a subset of Backpack for which shaping is not necessary.
Here is a programming discipline which does just that, which we will
call the \textbf{linking restriction}: \emph{Module implementations
must be declared before signatures.}  Formally, this restriction
modifies the rule for merging polarized module shapes
($\widetilde{\tau}_1^{m_1} \oplus \widetilde{\tau}_2^{m_2}$) so that
$\widetilde{\tau}_1^- \oplus \widetilde{\tau}_2^+$ is always
undefined.\footnote{This seemed to be the crispest way of defining the
restriction, although this means an error happens a bit later than I'd
like it to: I'd prefer if we errored while merging logical contexts,
but we don't know what is a hole at that point.}

Here is an example of the linking restriction.  Consider these two
packages:

\begin{verbatim}
package random where
  System.Random = [ ... ].hs

package monte-carlo where
  System.Random :: ...
  System.MonteCarlo = [ ... ].hs
\end{verbatim}

Here, random is a definite package which may have been compiled ahead
of time; monte-carlo is an indefinite package with a dependency on any
package which provides \verb|System.Random|.

Now, to link these two applications together, only one ordering is
permissible:

\begin{verbatim}
package myapp where
  include random
  include monte-carlo
\end{verbatim}

If myapp wants to provide its own random implementation, it can do so:

\begin{verbatim}
package myapp2 where
  System.Random = [ ... ].hs
  include monte-carlo
\end{verbatim}

In both cases, all of \verb|monte-carlo|'s holes have been filled in by
the time it is included.  The alternate ordering is not allowed.

Why does this discipline prevent mutually recursive modules?
Intuitively, a hole is the mechanism by which we can refer to an
implementation before it is defined; otherwise, we can only refer to
definitions which precede our definition.  If there are never any holes
\emph{which get filled}, implementation links can only go backwards,
ruling out circularity.
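The linking restriction can be phrased as a simple left-to-right check
over a package's bindings.  Here is an illustrative Haskell sketch; the
names \texttt{Binding} and \texttt{checkLinking} are invented for this
example and are not part of GHC or Cabal.

```haskell
-- Hypothetical model of the linking restriction: walking a package's
-- bindings in order, an implementation may never fill a previously
-- declared signature (hole), because that would be a -/+ merge.
import Data.List (nub)

data Binding = Sig String   -- a signature, e.g.  System.Random :: ...
             | Impl String  -- an implementation, e.g.  System.Random = [ ... ]

checkLinking :: [Binding] -> Either String ()
checkLinking = go []
  where
    go _ [] = Right ()
    go sigs (Impl m : rest)
      | m `elem` sigs = Left ("implementation of " ++ m ++ " fills an earlier hole")
      | otherwise     = go sigs rest
    go sigs (Sig m : rest) = go (nub (m : sigs)) rest

main :: IO ()
main = do
  -- myapp2-style: the implementation precedes the signature it merges with.
  print (checkLinking [Impl "System.Random", Sig "System.Random"])
  -- The reverse ordering fills a hole after the fact, and is rejected.
  print (checkLinking [Sig "System.Random", Impl "System.Random"])
```

The second call is rejected for exactly the reason given above: a hole
that gets filled is a forward reference, and forward references are
what make mutual recursion expressible.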
It's easy to see how mutual recursion can occur if we break this discipline:

\begin{verbatim}
package myapp2 where
    include monte-carlo
    System.Random = [ import System.MonteCarlo ].hs
\end{verbatim}

\subsection{Typechecking of definite modules without shaping}

If we are not carrying out a shaping pass, we need to be able to calculate
$\widetilde{\Xi}_{\mathsf{pkg}}$ on the fly.  In the case that we are
compiling a package---there will be no holes in the final package---we
can show that shaping is unnecessary quite easily, since with the
linking restriction, everything is definite from the get-go.

Observe the following invariant: at any given step of the module
bindings, the physical context $\widetilde{\Phi}$ contains no
holes.  We can thus conclude that there are no module variables in any
type shapes.  As the only time a previously calculated package shape can
change is due to unification, the incrementally computed shape is in
fact the true one.

As far as the implementation is concerned, we never have to worry
about handling module variables; we only need to do extra typechecks
against (renamed) interface files.

\subsection{Compilation of definite modules}\label{sec:compiling-definite}

Of course, we still have to compile the code, and this includes any
subpackages which we have mixed into the dependencies to make them fully
definite.  Let's take the following set of packages as an example:

\begin{verbatim}
package pkg-a where
    A = [ a = 0; b = 0 ]  -- b is not visible
    B = ...               -- this code is ignored
package pkg-b where       -- indefinite package
    A :: [ a :: Bool ]
    B = [ import A; b = 1 ]
package pkg-c where
    include pkg-a (A)
    include pkg-b
    C = [ import B; ... ]
\end{verbatim}

Note: in the following example, we will assume that we are operating
under the packaging scheme specified in Section~\ref{sec:one-per-definite-package}
with the indefinite package refinement.
With the linking invariant, we can simply walk the Backpack package ``tree'',
compiling each of its dependencies.  Let's walk through it
explicitly.\footnote{To simplify matters, we assume that there is only
one instance of any PackageId in the database, so we omit the
unique-ifying hashes from the ghc-pkg registration commands; we ignore
the existence of version numbers and Cabal's dependency solver;
finally, we do the compilation in one-shot mode, even though Cabal in
practice will use the Make mode.}

First, we have to build \verb|pkg-a|.  This package has no dependencies
of any kind, so compiling is much like compiling ordinary Haskell.  If
it was already installed in the database, we wouldn't even bother compiling it.

\begin{verbatim}
ADEPS = # empty!
ghc -c A.hs -package-name pkg-a-ADEPS
ghc -c B.hs -package-name pkg-a-ADEPS
# install and register pkg-a-ADEPS
\end{verbatim}

Next, we have to build \verb|pkg-b|.  This package has a hole \verb|A|;
intuitively, it depends on some package which provides \verb|A|.  This
is done in two steps: first we check if the signature given for the hole
matches up with the actual implementation provided.  Then we build the
module properly.

\begin{verbatim}
BDEPS = "A -> pkg-a-ADEPS:A"
ghc -c A.hs-boot -package-name pkg-b-BDEPS -hide-all-packages \
    -package "pkg-a-ADEPS(A)"
ghc -c B.hs -package-name pkg-b-BDEPS -hide-all-packages \
    -package "pkg-a-ADEPS(A)"
# install and register pkg-b-BDEPS
\end{verbatim}

These commands mostly resemble the traditional compilation process, but
with some minor differences.  First, the \verb|-package| includes must
also specify a thinning (and renaming) list.  This is because when
\verb|pkg-b| is compiled, it only sees module \verb|A| from
\verb|pkg-a|, not module \verb|B| (as it was thinned out.)
Conceptually, this package is
being compiled in the context of some module environment \verb|BDEPS| (a
logical context, in Backpack lingo) which maps modules to original names
and is utilized by the module finder to look up the import in
\verb|B.hs|; we load/thin/rename packages so that the package
environment accurately reflects the module environment.

Similarly, it is important that the compilation of \verb|B.hs| use \verb|A.hi-boot|
to determine what entities in the module are visible upon import; this is
automatically detected by GHC when the compilation occurs.  Otherwise,
in module \verb|pkg-b:B|, there would be a name collision between the local
definition of \verb|b| and the identifier \verb|b| which was
accidentally pulled in when we compiled against the actual implementation of
\verb|A|.  It's actually a bit tempting to compile \verb|pkg-b:B| against the
\verb|hi-boot| generated by the signature, but this would unnecessarily
lose out on possible optimizations which are stored in the original \verb|hi|
file but not evident from the \verb|hi-boot| file.

Finally, having created all of the necessary subpackages, we can compile
our package proper.

\begin{verbatim}
CDEPS = # empty!!
ghc -c C.hs -package-name pkg-c-CDEPS -hide-all-packages \
    -package "pkg-a-ADEPS(A)" \
    -package "pkg-b-BDEPS"
# install and register package pkg-c-CDEPS
\end{verbatim}

This command is quite similar, although it's worth mentioning that now,
the \verb|-package| flags directly mirror the syntax in Backpack.
Additionally, even though \verb|pkg-c| ``depends'' on subpackages, these
do not show up in its package-name identifier, e.g.~CDEPS\@.
This is
because this package \emph{chose} the values of ADEPS and BDEPS
explicitly (by including the packages in this particular order), so
there are no degrees of freedom.\footnote{In the presence of a
Cabal-style dependency solver which associates a-0.1 with a concrete
identifier a, these choices need to be recorded in the package ID.}

Overall, there are a few important things to notice about this architecture.
First, because the \verb|pkg-b-BDEPS| product is installed, if in another package
build we instantiate the indefinite module B with exactly the same \verb|pkg-a|
implementation, we can skip the compilation process and reuse the version.
This is because the calculated \verb|BDEPS| will be the same, and thus the package
IDs will be the same.

XXX ToDo: actually write down pseudocode algorithm for this

\paragraph{Sometimes you need a module environment instead}  In the compilation
description here, we've implicitly assumed that any external modules you might
depend on exist in a package somewhere.  However, a tricky situation
occurs when some of these modules come from a parent package:
\begin{verbatim}
package pkg-b where
    A :: [ a :: Bool ]
    B = [ import A; b = 1 ]
package pkg-c where
    A = [ a = 0; b = 0 ]
    include pkg-b
    C = [ import B; ... ]
\end{verbatim}

How this problem gets resolved depends on what our library granularity
is (Section~\ref{sec:flatten}).

In the ``only definite packages are compiled'' world
(Section~\ref{sec:one-per-definite-package}), we need to pass a
special ``module environment'' to the compilation of libraries
in \verb|pkg-b| to say where to find \verb|A|.
The compilation of \verb|pkg-b| now looks as follows:

\begin{verbatim}
BDEPS = "A -> pkg-c-CDEPS:A"
ghc -c A.hs-boot -package-name pkg-c-CDEPS -module-env BDEPS
ghc -c B.hs -package-name pkg-c-CDEPS -subpackage-name pkg-b-BDEPS -module-env BDEPS
\end{verbatim}

The most important thing to remember here is that in the ``only definite
packages are compiled'' world, we must create a \emph{copy} of
\verb|pkg-b| in order to instantiate its hole with \verb|pkg-c:A|
(otherwise, there is a circular dependency.)  These packages must be
distinguished from the parent package (\verb|-subpackage-name|), but
logically, they will be installed in the \verb|pkg-c| library.  The
module environment specifies where the holes can be found, without
referring to an actual package (since \verb|pkg-c| has, indeed, not been
installed yet at the time we process \verb|B.hs|).  These files are
probably looked up in the include paths.\footnote{It's worth remarking
that a variant of this was originally proposed as the one true
compilation strategy.  However, it was pointed out that this gave up
applicativity in all cases.  Our current refinement of this strategy
gives up applicativity for modules which have not been placed in an
external package.}

Things are a bit different in the sliced world and in the physical module
identity world (Section~\ref{sec:one-per-package-identity}); here, we really do
compile and install (perhaps to a local database) \verb|pkg-c:A| before
starting with the compilation of \verb|pkg-b|.  So package imports will
continue to work fine.

\subsection{Restricted recursive modules \`{a} la hs-boot}\label{sec:hs-boot-restrict}

It should be possible to support GHC-style mutual recursion directly in
the Backpack formalism using hs-boot files.
However, to avoid
the need for a shaping pass, we must adopt an existing constraint that
already applies to hs-boot files: \emph{at the time we define a signature,
we must know what the original name for all data types is}.  In practice,
GHC enforces this by requiring that: (1) an hs-boot file must be
accompanied with an implementation, and (2) the implementation must
in fact define (and not reexport) all of the declarations in the signature.

Why does this not require a shaping pass?  The reason is that the
signature is not really polymorphic: we require that the $\alpha$ module
variable be resolved to a concrete module later in the same package, and
that all the $\beta$ module variables be unified with $\alpha$.  Thus, we
know ahead of time the original names and don't need to deal with any
renaming.\footnote{This strategy doesn't completely resolve the problem
of cross-package mutual recursion, because we need to first compile a
bit of the first package (signatures), then the second package, and then
the rest of the first package.}

Compiling packages in this way gives the tantalizing possibility
of true separate compilation: the only things we don't know are the
actual package name of an indefinite package and, consequently, the
correct references to it.  This is a very minor change to the assembly,
so one could conceive of dynamically rewriting these references at the
linking stage.  But separate compilation achieved in this fashion would
not be able to take advantage of cross-module optimizations.

\section{Shaped Backpack}

Despite the simplicity of shapeless Backpack with the linking
restriction in the absence of holes, we will find that when we have
holes, it will be very difficult to do type-checking without
some form of shaping.
This section is very much a work in progress,
but the ability to typecheck against holes, even with the linking restriction,
is a very important part of modular separate development, so we will need
to support it at some point.

\subsection{Efficient shaping}

(These are Edward's opinions; he hasn't convinced other folks that this is
the right way to do it.)

In this section, I want to argue that, although shaping constitutes
a pre-pass which must be run before compilation in earnest, it is only
about as bad as the dependency resolution analysis that GHC already does
in \verb|ghc -M| or \verb|ghc --make|.

In Paper Backpack, what information does shaping compute?  It looks at
exports, imports, data declarations and value declarations (but not the
actual expressions associated with these values.)  As a matter of fact,
GHC already must look at the imports associated with a package in order
to determine the dependency graph, so that it can have some order to compile
modules in.  There is a specialized parser which just parses these statements,
and then ignores the rest of the file.

A bit of background: the \emph{renamer} is responsible for resolving
imports and figuring out where all of these entities actually come from.
SPJ would really like to avoid having to run the renamer in order to perform
a shaping pass.

\paragraph{Is it necessary to run the Renamer to do shaping?}
Edward and Scott believe the answer is no, well, partially.
Shaping needs to know the original names of all entities exposed by a
module/signature.  Then it needs to know (a) which entities a module/signature
defines/declares locally and (b) which entities that module/signature exports.
The former, (a), can be determined by a straightforward inspection of a parse
tree of the source file.\footnote{Note that no expression or type parsing
is necessary.  We only need names of local values, data types, and data
constructors.}  The latter, (b), is a bit trickier.
Right now it's the Renamer
that interprets imports and exports into original names, so we would still
rely on that implementation.  However, the Renamer does other, harder things
that we don't need, so ideally we could factor out the import/export
resolution from the Renamer for use in shaping.

Unfortunately the Renamer's import resolution analyzes \verb|.hi| files, but for
local modules, which haven't yet been typechecked, we don't have those.
Instead, we could use a new file format, \verb|.hsi| files, to store the shape of
a locally defined module.  (Defined packages are bundled with their shapes,
so included modules have \verb|.hsi| files as well.)  (What about the logical
vs.~physical distinction in file names?)  If we refactor the import/export
resolution code, could we rewrite it to generically operate on both
\verb|.hi| files and \verb|.hsi| files?

Alternatively, rather than storing shapes on a per-source basis, we could
store (in memory) the entire package shape.  Similarly, included packages
could have a single shape file for the entire package.  Although this approach
would make shaping non-incremental, since an entire package's shape would
be recomputed any time a constituent module's shape changes, we do not expect
shaping to be all that expensive.

\subsection{Typechecking of indefinite modules}\label{sec:typechecking-indefinite}

Recall our argument in the definite case, where we showed there are
no holes in the physical context.  With indefinite modules, this is no
longer true.  While (with the linking restriction) these holes will never
be linked against a physical implementation, they may be linked against
other signatures.  (Note: while disallowing signature linking would
solve our problem, it would disallow a wide array of useful instances of
signature reuse, for example, a package mylib that implements both
mylib-1x-sig and mylib-2x-sig.)
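Since holes now survive into the physical context, the shapes we
manipulate must be able to mention module variables.  Here is a minimal
sketch (all names hypothetical; this is not GHC's actual representation)
of a package shape whose physical identities may still contain holes:

```haskell
import qualified Data.Map as M

-- A physical module identity: either a concrete original module
-- (package id plus module name) or a variable standing for a hole.
data PhysIdent
  = Concrete String String
  | HoleVar String
  deriving (Eq, Show)

-- The whole-package shape: logical module name -> physical identity.
type PackageShape = M.Map String PhysIdent

-- Recompute the package shape wholesale from per-module shapes; as
-- discussed above, this is non-incremental, but shaping is not
-- expected to be expensive.
packageShape :: [(String, PhysIdent)] -> PackageShape
packageShape = M.fromList

main :: IO ()
main = print (M.lookup "System.Random"
               (packageShape
                  [ ("System.Random", HoleVar "a")
                  , ("System.MonteCarlo",
                     Concrete "monte-carlo" "System.MonteCarlo") ]))
```

Signature linking then amounts to unifying the \verb|HoleVar| entries of
two such shapes, which is exactly the operation discussed next.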
With holes, we must handle module variables, and we sometimes must unify them:

\begin{verbatim}
package p where
    A :: [ data A ]
package q where
    A :: [ data A ]
package r where
    include p
    include q
\end{verbatim}

In this package, it is not possible to a priori assign original names to
module A in p and q, because in package r, they should have the same
original name.  When signature linking occurs, unification may occur,
which means we have to rename all relevant original names.  (A similar
situation occurs when a module is typechecked against a signature.)

An invariant which would be nice to have is this: when typechecking a
signature or including a package, we may apply renaming to the entities
being brought into context.  But once we've picked an original name for
our entities, no further renaming should be necessary.  (Formally, in the
unification for semantic object shapes, apply the unifier to the second
shape, but not the first one.)

However, there are plenty of counterexamples here:

\begin{verbatim}
package p where
    A :: [ data A ]
    B :: [ data A ]
    M = ...
    A = B
\end{verbatim}

In this package, does module M know that A.A and B.A are type equal?  In
fact, the shaping pass will have assigned equal module identities to A
and B, so M \emph{equates these types}, despite the aliasing occurring
after the fact.

We can make this example more sophisticated, by having a later
subpackage which causes the aliasing; now, the decision is not even a
local one (on the other hand, the equality should be evident by inspection
of the package interface associated with q):

\begin{verbatim}
package p where
    A :: [ data A ]
    B :: [ data A ]
package q where
    A :: [ data A ]
    B = A
package r where
    include p
    include q
\end{verbatim}

Another possibility is that it might be acceptable to do a mini-shaping
pass, without parsing modules or signatures, \emph{simply} looking at
names and aliases.
But logical names are not the only mechanism by
which unification may occur:

\begin{verbatim}
package p where
    C :: [ data A ]
    A = [ data A = A ]
    B :: [ import A(A) ]
    C = B
\end{verbatim}

It is easy to conclude that the original names of C and B are the same.  But
more importantly, C.A must be given the original name of p:A.A\@.  This can only
be discovered by looking at the signature definition for B\@.  In any case, it
is worth noting that this situation parallels the situation with hs-boot
files (although there is no mutual recursion here).

The conclusion is that you will probably, in fact, have to do real
shaping in order to typecheck all of these examples.

\paragraph{Hey, these signature imports are kind of tricky\ldots}

When signatures and modules are interleaved, the interaction can be
complex.  Here is an example:

\begin{verbatim}
package p where
    C :: [ data A ]
    M = [ import C; ... ]
    A = [ import M; data A = A ]
    C :: [ import A(A) ]
\end{verbatim}

Here, the second signature for C refers to a module implementation A
(this is permissible: it simply means that the original name for p:C.A
is p:A.A).  But wait!  A relies on M, and M relies on C\@.  Circularity?
Fortunately not: a client of package p will find it impossible to have
the hole C implemented in advance, since they will need to get their
hands on module A\ldots but it will not be defined prior to package p.

In any case, however, it would be good to emit a warning if a package
cannot be compiled without mutual recursion.

\subsection{Incremental typechecking}
We want to typecheck modules incrementally, i.e., when something changes in
a package, we only want to re-typecheck the modules that care about that
change.  GHC already does this today.%
\footnote{\url{https://ghc.haskell.org/trac/ghc/wiki/Commentary/Compiler/RecompilationAvoidance}}
Is the same mechanism sufficient for Backpack?  Edward and Scott think that it
is, mostly.
Our conjecture is that a module should be re-typechecked if the
existing mechanism says it should \emph{or} if the logical shape
context (which maps logical names to physical names) has changed.  The latter
condition is due to aliases that affect typechecking of modules.

Let's look again at an example from before:
\begin{verbatim}
package p where
    A :: [ data A ]
    B :: [ data A ]
    M = [ import A; import B; ... ]
\end{verbatim}
Let's say that \verb|M| is typechecked successfully.  Now we add an alias binding
at the end of the package, \verb|A = B|.  Does \verb|M| need to be
re-typechecked?  Yes!  (Well, it seems so, but let's just assert ``yes'' for now.
Certainly in the reverse case---if we remove the alias and then ask---this
is true, since \verb|M| might have depended on the two \verb|A| types
being the same.)
The logical shape context changed to say that \verb|A| and
\verb|B| now map to the same physical module identity.  But does the existing
recompilation avoidance mechanism say that \verb|M| should be re-typechecked?
It's unclear.  The \verb|.hi| file for \verb|M| records that it imported \verb|A| and
\verb|B| with particular ABIs, but does it also know about the physical module
identities (or rather, original module names) of these modules?

Scott thinks this highlights the need for us to get our story straight about
the connection between logical names, physical module identities, and file
names!


\subsection{Installing indefinite packages}\label{sec:installing-indefinite}

If an indefinite package contains no code at all, we only need
to install the interface file for the signatures.  However, if
they include code, we must provide all of the
ingredients necessary to compile them when the holes are linked against
actual implementations.  (Figure~\ref{fig:pkgdb})

\paragraph{Source tarball or preprocessed source?}  What representation
of the source should be saved?
There
are a number of possible choices:

\begin{itemize}
    \item The original tarballs downloaded from Hackage,
    \item Preprocessed source files,
    \item Some sort of internal, type-checked representation of Haskell
        code (maybe the output of the desugarer).
\end{itemize}

Storing the tarballs is the simplest and most straightforward mechanism,
but we will have to be very certain that we can recompile the module
later in precisely the same way we compiled it originally, to ensure the
\verb|hi| files match up (fortunately, it should be simple to perform an
optional sanity check before proceeding.)  The appeal of saving
preprocessed source, or even the IRs, is that conceptually this is
exactly what an indefinite package is: we have paused the compilation
process partway, intending to finish it later.  However, our compilation
strategy for definite packages requires us to run this step using a
\emph{different} choice of original names, so it's unclear how much work
could actually be reused.

\section{Surface syntax}

In the Backpack paper, a brand new module language is presented, with
syntax for inline modules and signatures.  This syntax is probably worth
implementing, because it makes it easy to specify compatibility
packages, whose module definitions in general may be very short:

\begin{verbatim}
package ishake-0.12-shake-0.13 where
    include shake-0.13
    Development.Shake.Sys = Development.Shake.Cmd
    Development.Shake = [ (**>) = (&>) ; (*>>) = (|*>) ]
    Development.Shake.Rule = [ defaultPriority = rule . priority 0.5 ]
    include ishake-0.12
\end{verbatim}

However, there are a few things that are less than ideal about the
surface syntax proposed by Paper Backpack:

\begin{itemize}
    \item It's completely different from the current method by which
    users specify packages.
There's nothing wrong with this per se
    (one simply needs to support both formats) but the smaller
    the delta, the easier the new packaging format is to explain
    and implement.

    \item Sometimes order matters (relative ordering of signatures and
    module implementations), and other times it does not (aliases).
    This can be confusing for users.

    \item Users have to order module definitions topologically,
    whereas in current Cabal modules can be listed in any order, and
    GHC figures out an appropriate order to compile them.
\end{itemize}

Here is an alternative proposal, closely based on Cabal syntax.  Given
the following Backpack definition:

\begin{verbatim}
package libfoo(A, B, C, Foo) where
    include base
    -- renaming and thinning
    include libfoo (Foo, Bar as Baz)
    -- holes
    A :: [ a :: Bool ].hsig
    A2 :: [ b :: Bool ].hsig
    -- normal module
    B = [
        import {-# SOURCE #-} A
        import Foo
        import Baz
        ...
    ].hs
    -- recursively linked pair of modules, one is private
    C :: [ data C ].hsig
    D = [ import {-# SOURCE #-} C; data D = D C ].hs
    C = [ import D; data C = C D ].hs
    -- alias
    A = A2
\end{verbatim}

We can write the following Cabal-like syntax instead (where
all of the signatures and modules are placed in appropriately
named files):

\begin{verbatim}
package: libfoo
...
build-depends: base, libfoo (Foo, Bar as Baz)
holes: A A2 -- deferred for now
exposed-modules: Foo B C
aliases: A = A2
other-modules: D
\end{verbatim}

Notably, all of these lists are \emph{insensitive} to ordering!
The key idea is the use of the \verb|{-# SOURCE #-}| pragma, which
is enough to solve the important ordering constraint between
signatures and modules.

Here is how the elaboration works.  For simplicity, in this algorithm
description, we assume all packages being compiled have no holes
(including \verb|build-depends| packages).
Later, we'll discuss how to
extend the algorithm to handle holes in both subpackages and the main
package itself.

\begin{enumerate}

    \item At the top-level with \verb|package| $p$ and
        \verb|exposed-modules| $ms$, record \verb|package p (ms) where|

    \item For each package $p$ with thinning/renaming $ms$ in
        \verb|build-depends|, record an \verb|include p (ms)| in the
        Backpack package.  The ordering of these includes does not
        matter, since none of these packages have holes.

    \item Take all modules $m$ in \verb|other-modules| and
        \verb|exposed-modules| which were not exported by build
        dependencies, and create a directed graph where hs and hs-boot
        files are nodes and imports are edges (the target of an edge is
        an hs file if it is a normal import, and an hs-boot file if it
        is a SOURCE import).  Topologically sort this graph, erroring if
        this graph contains cycles (even with recursive modules, the
        cycle should have been broken by an hs-boot file).  For each
        node, in this order, record \verb|M = [ ... ]| or \verb|M :: [ ... ]|
        depending on whether it is an hs or an hs-boot file.  If possible,
        sort signatures before implementations when there is no constraint
        otherwise.

\end{enumerate}

Here is a simple example which shows how SOURCE can be used to disambiguate
between two important cases.
Suppose we have these modules:

\begin{verbatim}
-- A1.hs
import {-# SOURCE #-} B

-- A2.hs
import B

-- B.hs
x = True

-- B.hs-boot
x :: Bool
\end{verbatim}

Then we translate the following packages as follows:

\begin{verbatim}
exposed-modules: A1 B
-- translates to
B :: [ x :: Bool ]
A1 = [ import B ]
B = [ x = True ]
\end{verbatim}

but

\begin{verbatim}
exposed-modules: A2 B
-- translates to
B = [ x = True ]
B :: [ x :: Bool ]
A2 = [ import B ]
\end{verbatim}

The import controls placement between signature and module: in A1 it
forces B's signature to be sorted before B's implementation, whereas in
the second case there is no constraint, so we preferentially place B's
implementation first.

\paragraph{Holes in the database}  In the presence of holes,
\verb|build-depends| resolution becomes more complicated.  First,
let's consider the case where the package we are building is
definite, but the package database contains indefinite packages with holes.
In order to maintain the linking restriction, we now have to order packages
from step (2) of the previous elaboration.  We can do this by creating
a directed graph, where nodes are packages and edges go from holes to the
packages which implement them.  If there is a cycle, this indicates a mutually
recursive package.  In the absence of cycles, a topological sorting of this
graph preserves the linking invariant.

One subtlety to consider is the fact that an entry in \verb|build-depends|
can affect how a hole is instantiated by another entry.  This might be a
bit weird to users, who might like to explicitly say how holes are
filled when instantiating a package.  Food for thought, surface syntax wise.
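The graph construction just described can be sketched directly.  This is
a toy model (\verb|Pkg|, \verb|provides| and \verb|holes| are
hypothetical names, not Cabal's actual types): build the
hole-to-implementer graph and topologically sort it, reporting any cycle
as mutual recursion:

```haskell
-- Sketch of the build-depends ordering step; Pkg and its fields are
-- hypothetical, not Cabal's real data types.
import Data.Graph (SCC (..), stronglyConnComp)

data Pkg = Pkg
  { pkgName  :: String
  , provides :: [String]   -- modules this package implements
  , holes    :: [String]   -- modules this package requires
  }

-- Order the packages so every hole is filled by an earlier include;
-- a cyclic strongly connected component indicates mutually recursive
-- packages.
orderIncludes :: [Pkg] -> Either [String] [String]
orderIncludes pkgs =
    case [ ps | CyclicSCC ps <- sccs ] of
      (badCycle : _) -> Left badCycle                    -- mutual recursion
      []             -> Right [ p | AcyclicSCC p <- sccs ]
  where
    -- stronglyConnComp returns SCCs in reverse topological order,
    -- i.e. dependencies first, which is the include order we want.
    sccs   = stronglyConnComp
               [ (pkgName p, pkgName p, deps p) | p <- pkgs ]
    -- p depends on q if q provides one of p's holes
    deps p = [ pkgName q | q <- pkgs
                         , any (`elem` provides q) (holes p) ]

main :: IO ()
main = print (orderIncludes
  [ Pkg "monte-carlo" ["System.MonteCarlo"] ["System.Random"]
  , Pkg "random"      ["System.Random"]     []
  ])
-- prints Right ["random","monte-carlo"]
```

The same machinery handles the module-level graph of step (3), with
hs/hs-boot files in place of packages.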
\paragraph{Holes in the package}  Actually, this is quite simple: the
ordering of includes goes as before, but some indefinite packages in the
database are less constrained, as their ``dependencies'' are fulfilled
by the holes at the top-level of this package.  It's also worth noting
that some dependencies will go unresolved, since the following package
is valid:

\begin{verbatim}
package a where
    A :: ...
package b where
    include a
\end{verbatim}

\paragraph{Multiple signatures}  In Backpack syntax, it's possible to
define a signature multiple times, which is necessary for mutually
recursive signatures:

\begin{verbatim}
package a where
    A :: [ data A ]
    B :: [ import A; data B = B A ]
    A :: [ import B; data A = A B ]
\end{verbatim}

Critically, notice that we can see the constructors for both modules B and A
after the signatures are linked together.  This is not possible in GHC
today, but could be made possible by permitting multiple hs-boot files.  Now
the SOURCE pragma indicating an import must \emph{disambiguate} which
hs-boot file it intends to include.  This might be one way of doing it:

\begin{verbatim}
-- A.hs-boot2
data A

-- B.hs-boot
import {-# SOURCE hs-boot2 #-} A

-- A.hs-boot
import {-# SOURCE hs-boot #-} B
\end{verbatim}

\paragraph{Explicit or implicit reexports}  One annoying property of
this proposal is that, looking at the \verb|exposed-modules| list, it is
not immediately clear what source files one would expect to find in the
current package.  It's not obvious what the proper way to go about doing
this is.

\paragraph{Better syntax for SOURCE}  If we enshrine the SOURCE import
as a way of solving Backpack ordering problems, it would be nice to have
some better syntax for it.  One possibility is:

\begin{verbatim}
abstract import Data.Foo
\end{verbatim}

which makes it clear that this module is pluggable, typechecking against
a signature.
Note that this only indicates how type checking should be
done: when actually compiling the module we will compile against the
interface file for the true implementation of the module.

It's worth noting that the SOURCE annotation was originally made a
pragma because, in principle, it should have been possible to compile
some recursive modules without needing the hs-boot file at all.  But if
we're moving towards boot files as signatures, this concern is less
relevant.

\section{Type classes and type families}

\subsection{Background}

Before we talk about how to support type classes in Backpack, it's first
worth talking about what we are trying to achieve in the design.  Most
would agree that \emph{type safety} is the cardinal law that should be
preserved (in the sense that segfaults should not be possible), but
there are many kinds of ``bad behavior'' (top level mutable state,
weakening of abstraction guarantees, ambiguous instance resolution,
etc.), and Haskellers may disagree on which of these must be ruled out.

With this in mind, it is worth summarizing what kind of guarantees are
presently given by GHC with regard to type classes and type families,
as well as characterizing the \emph{cultural} expectations of the
Haskell community.

\paragraph{Type classes}  When discussing type class systems, there are
several properties that one may talk about.
A set of instances is \emph{confluent} if, no matter what order
constraint solving is performed, GHC will terminate with a canonical set
of constraints that must be satisfied for any given use of a type class.
In other words, confluence says that we won't conclude that a program
doesn't type check just because we swapped in a different constraint
solving algorithm.

Confluence's closely related twin is \emph{coherence} (defined in ``Type
classes: exploring the design space'').
This property states that
``every different valid typing derivation of a program leads to a
resulting program that has the same dynamic semantics.''  Why could
differing typing derivations result in different dynamic semantics?  The
answer is that context reduction, which picks out type class instances,
elaborates into concrete choices of dictionaries in the generated code.
Confluence is a prerequisite for coherence, since one
can hardly talk about the dynamic semantics of a program that doesn't
type check.

In the vernacular, confluence and coherence are often incorrectly used
to refer to another related property: \emph{global uniqueness of instances},
which states that in a fully compiled program, for any type, there is at most one
instance resolution for a given type class.  Languages with local type
class instances, such as Scala, generally do not have this property, and
in Haskell this assumption is frequently relied upon for abstraction.

So, what properties does GHC enforce, in practice?
In the absence of any type system extensions, GHC employs a set of
rules (described in ``Exploring the design space'') to ensure that type
class resolution is confluent and coherent.  Intuitively, it achieves
this by having a very simple constraint solving algorithm (generate
wanted constraints and solve wanted constraints) and then requiring the
set of instances to be \emph{nonoverlapping}, ensuring there is only
ever one way to solve a wanted constraint.  Overlap is a
more stringent restriction than either confluence or coherence, and
via the \verb|OverlappingInstances| and \verb|IncoherentInstances|
extensions, GHC allows a user to relax this restriction ``if they know
what they're doing.''

Surprisingly, however, GHC does \emph{not} enforce global uniqueness of
instances.  Imported instances are not checked for overlap until we
attempt to use them for instance resolution.
Consider the following program:
+
+\begin{verbatim}
+-- T.hs
+data T = T
+-- A.hs
+import T
+instance Eq T where
+-- B.hs
+import T
+instance Eq T where
+-- C.hs
+import A
+import B
+\end{verbatim}
+
+When compiled with one-shot compilation, \verb|C| will not report
+overlapping instances unless we actually attempt to use the \verb|Eq|
+instance in \verb|C|.\footnote{When using batch compilation, GHC reuses the
+ instance database and is actually able to detect the duplicated
+ instance when compiling B. But if you run it again, recompilation
+avoidance skips A, and it finishes compiling! See this bug:
+\url{https://ghc.haskell.org/trac/ghc/ticket/5316}} This is by
+design\footnote{\url{https://ghc.haskell.org/trac/ghc/ticket/2356}}:
+eagerly ensuring that there are no overlapping instances would require
+reading all the interface files a module may depend on.
+
+We might summarize these three properties in the following manner.
+Culturally, the Haskell community expects \emph{global uniqueness of instances}
+to hold: the implicit global database of instances should be
+confluent and coherent. GHC, however, does not enforce uniqueness of
+instances: instead, it merely guarantees that the \emph{subset} of the
+instance database it uses when it compiles any given module is confluent and coherent. GHC does do some
+tests when an instance is declared to see if it would result in overlap
+with visible instances, but the check is by no means
+perfect\footnote{\url{https://ghc.haskell.org/trac/ghc/ticket/9288}};
+truly, \emph{type-class constraint resolution} has the final word.
One
+mitigating factor is that in the absence of \emph{orphan instances}, GHC is
+guaranteed to eagerly notice when the instance database has overlap.\footnote{Assuming that the instance declaration checks actually worked\ldots}
+
+The fact that GHC's lazy behavior is surprising to most Haskellers
+suggests that the lazy check is mostly good enough: a user
+is likely to discover overlapping instances one way or another.
+However, it is relatively simple to construct example programs which
+violate global uniqueness of instances in an observable way:
+
+\begin{verbatim}
+-- A.hs
+module A where
+data U = X | Y deriving (Eq, Show)
+
+-- B.hs
+module B where
+import Data.Set
+import A
+
+instance Ord U where
+  compare X X = EQ
+  compare X Y = LT
+  compare Y X = GT
+  compare Y Y = EQ
+
+ins :: U -> Set U -> Set U
+ins = insert
+
+-- C.hs
+module C where
+import Data.Set
+import A
+
+instance Ord U where
+  compare X X = EQ
+  compare X Y = GT
+  compare Y X = LT
+  compare Y Y = EQ
+
+ins' :: U -> Set U -> Set U
+ins' = insert
+
+-- D.hs
+module Main where
+import Data.Set
+import A
+import B
+import C
+
+test :: Set U
+test = ins' X $ ins X $ ins Y $ empty
+
+main :: IO ()
+main = print test
+
+-- OUTPUT
+$ ghc -Wall -XSafe -fforce-recomp --make D.hs
+[1 of 4] Compiling A ( A.hs, A.o )
+[2 of 4] Compiling B ( B.hs, B.o )
+
+B.hs:5:10: Warning: Orphan instance: instance [safe] Ord U
+[3 of 4] Compiling C ( C.hs, C.o )
+
+C.hs:5:10: Warning: Orphan instance: instance [safe] Ord U
+[4 of 4] Compiling Main ( D.hs, D.o )
+Linking D ...
+$ ./D
+fromList [X,Y,X]
+\end{verbatim}
+
+Locally, all type class resolution was coherent: in the subset of
+instances each module had visible, type class resolution could be done
+unambiguously. Furthermore, the types of \verb|ins| and \verb|ins'|
+discharge type class resolution, so that in \verb|D|, where the database
+is now overlapping, no resolution occurs, and the error is never found.
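
The mechanism at work deserves emphasis: a dictionary is chosen at the
point where a constraint is \emph{discharged}, not at the point where
the function is ultimately used. The following self-contained sketch
(the names \verb|Rev| and \verb|insRev| are ours, not part of the
example above) shows a monomorphic signature capturing whatever
\verb|Ord| dictionary is in scope at the definition site; callers have
no say in which comparison is used:

```haskell
import Data.List (insert)

-- A wrapper whose Ord instance deliberately reverses the usual order.
newtype Rev = Rev Int deriving (Eq, Show)

instance Ord Rev where
  compare (Rev a) (Rev b) = compare b a

-- The monomorphic signature discharges the Ord constraint right here,
-- baking the reversed dictionary into insRev once and for all.
insRev :: Rev -> [Rev] -> [Rev]
insRev = insert

main :: IO ()
main = print (insRev (Rev 1) [Rev 3, Rev 2])
-- prints [Rev 3,Rev 2,Rev 1]: "descending" order, per the captured dictionary
```

This is the single-module shadow of \verb|ins| and \verb|ins'| above:
each of those bindings captured its own module's \verb|Ord U|
dictionary at definition time, so \verb|D| never performs any
resolution that could notice the overlap.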
+
+It is easy to dismiss this example as an implementation wart in GHC, and
+continue pretending that global uniqueness of instances holds. However,
+the problem with \emph{global uniqueness of instances} is that it is
+inherently nonmodular: you might find yourself unable to compose two
+components because they accidentally defined the same type class
+instance, even though these instances are plumbed deep in the
+implementation details of the components.
+
+As it turns out, there is already another feature in Haskell which
+must enforce global uniqueness, to prevent segfaults.
+We now turn to type classes' close cousin: type families.
+
+\paragraph{Type families} With type families, confluence is the primary
+property of interest. (Coherence is not of much interest because type
+families are elaborated into coercions, which don't have any
+computational content.) Rather than considering the set of
+constraints we reduce to, confluence for type families considers the
+reduction of type families. The overlap checks for type families
+can be quite sophisticated, especially in the case of closed type
+families.
+
+Unlike type classes, however, GHC \emph{does} check the non-overlap
+of type families eagerly. The analogous program does \emph{not} type check:
+
+\begin{verbatim}
+-- F.hs
+type family F a :: *
+-- A.hs
+import F
+type instance F Bool = Int
+-- B.hs
+import F
+type instance F Bool = Bool
+-- C.hs
+import A
+import B
+\end{verbatim}
+
+The reason is that it is \emph{unsound} to ever allow any overlap
+(unlike in the case of type classes, where it just leads to incoherence).
+Thus, whereas one might imagine dropping the global uniqueness of instances
+invariant for type classes, it is absolutely necessary to perform global
+enforcement here. There's no way around it!
+
+\subsection{Local type classes}
+
+Here, we say \textbf{NO} to global uniqueness.
+
+This design is perhaps best discussed in relation to modular type
+classes, which share many similar properties. Instances are now
+treated as first class objects (in MTCs, they are simply modules)---we
+may explicitly hide or include instances for type class resolution (in
+MTCs, this is done via the \verb|using| top-level declaration). This is
+essentially what was sketched in Section 5 of the original Backpack
+paper. As a simple example:
+
+\begin{verbatim}
+package p where
+ A :: [ data T = T ]
+ B = [ import A; instance Eq T where ... ]
+
+package q where
+ A = [ data T = T; instance Eq T where ... ]
+ include p
+\end{verbatim}
+
+Here, \verb|B| does not see the extra instance declared by \verb|A|,
+because it was thinned from its signature of \verb|A| (and thus never
+declared canonical). To declare an instance without making it
+canonical, it must be placed in a separate (unimported) module.
+
+Like modular type classes, Backpack does not give rise to incoherence,
+because instance visibility can only be changed at the top-level module
+language, where it is already considered best practice to provide
+explicit signatures. Here is the example used in the Modular Type
+Classes paper to demonstrate the problem:
+
+\begin{verbatim}
+structure A = using EqInt1 in
+ struct ...fun f x = eq(x,x)... end
+structure B = using EqInt2 in
+ struct ...val y = A.f(3)... end
+\end{verbatim}
+
+Is the type of \verb|f| \verb|int -> bool|, or does it have a type-class
+constraint? Because type checking proceeds over the entire program, ML
+could hypothetically pick either.
However, ported to Haskell, the
+example looks like this:
+
+\begin{verbatim}
+EqInt1 :: [ instance Eq Int ]
+EqInt2 :: [ instance Eq Int ]
+A = [
+ import EqInt1
+ f x = x == x
+]
+B = [
+ import EqInt2
+ import A hiding (instance Eq Int)
+ y = f 3
+]
+\end{verbatim}
+
+There may be ambiguity, yes, but it can be easily resolved by the
+addition of a top-level type signature to \verb|f|, which is considered
+best-practice anyway. Additionally, Haskell users are trained to expect
+a particular inference for \verb|f| in any case (the polymorphic one).
+
+Here is another example which might be considered surprising:
+
+\begin{verbatim}
+package p where
+ A :: [ data T = T ]
+ B :: [ data T = T ]
+ C = [
+ import qualified A
+ import qualified B
+ instance Show A.T where show T = "A"
+ instance Show B.T where show T = "B"
+ x :: String
+ x = show A.T ++ show B.T
+ ]
+\end{verbatim}
+
+In the original Backpack paper, it was implied that module \verb|C|
+should not type check if \verb|A.T = B.T| (failing at link time).
+However, if we set aside, for a moment, the issue that anyone who
+imports \verb|C| in such a context will now have overlapping instances,
+there is no reason in principle why the module itself should be
+problematic. Here is the example in MTCs, which I have it on good word
+from Derek does type check.
+
+\begin{verbatim}
+signature SIG = sig
+ type t
+ val mk : t
+end
+signature SHOW = sig
+ type t
+ val show : t -> string
+end
+functor Example (A : SIG) (B : SIG) =
+ let structure ShowA : SHOW = struct
+ type t = A.t
+ fun show _ = "A"
+ end in
+ let structure ShowB : SHOW = struct
+ type t = B.t
+ fun show _ = "B"
+ end in
+ using ShowA, ShowB in
+ struct
+ val x = show A.mk ++ show B.mk
+ end : sig val x : string end
+\end{verbatim}
+
+The moral of the story is, even though in a later context the instances
+are overlapping, inside the functor the type-class resolution is
+unambiguous and can be carried out (so \verb|x = "AB"|).
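
Although anyone importing \verb|C| in a context where \verb|A.T = B.T|
does inherit overlapping instances, under this design the importer has
a remedy: hide one of them, just as \verb|B| hid
\verb|instance Eq Int| above. A sketch (the package \verb|q| and module
\verb|D| are hypothetical, not from the paper):

\begin{verbatim}
package q where
  A = [ data T = T ]
  B = A                       -- now A.T = B.T
  include p                   -- links C against A and B
  D = [
    import C hiding (instance Show B.T)
    -- only Show A.T remains visible: resolution is unambiguous
  ]
\end{verbatim}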
+
+Up until this point, we've argued why MTCs and this Backpack design are similar.
+However, there is an important sociological difference between modular type-classes
+and this proposed scheme for Backpack. In the presentation ``Why Applicative
+Functors Matter'', Derek mentions the canonical example of defining a set:
+
+\begin{verbatim}
+signature ORD = sig type t; val cmp : t -> t -> bool end
+signature SET = sig type t; type elem;
+ val empty : t;
+ val insert : elem -> t -> t ...
+end
+functor MkSet (X : ORD) :> SET where type elem = X.t
+ = struct ... end
+\end{verbatim}
+
+This is actually very different from how sets tend to be defined in
+Haskell today. If we directly encoded this in Backpack, it would
+look like this:
+
+\begin{verbatim}
+package mk-set where
+ X :: [
+ data T
+ cmp :: T -> T -> Bool
+ ]
+ Set :: [
+ data Set
+ empty :: Set
+ insert :: T -> Set -> Set
+ ]
+ Set = [
+ import X
+ ...
+ ]
+\end{verbatim}
+
+It's also informative to consider how MTCs would encode set as it is written
+today in Haskell:
+
+\begin{verbatim}
+signature ORD = sig type t; val cmp : t -> t -> bool end
+signature SET = sig type 'a t;
+ val empty : 'a t;
+ val insert : (X : ORD) => X.t -> X.t t -> X.t t
+end
+structure MkSet :> SET = struct ... end
+\end{verbatim}
+
+Here, it is clear to see that while functor instantiation occurs for
+the implementation, it does not occur for types. This is a big limitation
+of the Haskell approach, and it explains why Haskellers, in practice,
+find global uniqueness of instances so desirable.
+
+Implementation-wise, this requires some subtle modifications to how we
+do type class resolution. Type checking of indefinite modules works as
+before, but when we go to actually compile them against explicit
+implementations, we need to ``forget'' that two types are equal when
+doing instance resolution.
This could probably be implemented by
+associating type class instances with the original name that was
+used when typechecking, so that we can resolve ambiguous matches
+between types which have the same original name now that we are
+compiling.
+
+As we've mentioned previously, this strategy is unsound for type families.
+
+\subsection{Globally unique}
+
+Here, we say \textbf{YES} to global uniqueness.
+
+When we require the global uniqueness of instances (either because
+that's the type class design we chose, or because we're considering
+the problem of type families), we will need to reject declarations like the
+one cited above when \verb|A.T = B.T|:
+
+\begin{verbatim}
+A :: [ data T ]
+B :: [ data T ]
+C :: [
+ import qualified A
+ import qualified B
+ instance Show A.T where show T = "A"
+ instance Show B.T where show T = "B"
+]
+\end{verbatim}
+
+The paper mentions that a link-time check is sufficient to prevent this
+case from arising. While in the previous section we argued why this
+is actually unnecessary when local instances are allowed, the link-time
+check is a good match in the case of global instances, because any
+instance \emph{must} be declared in the signature. The scheme proceeds
+as follows: when some instances are typechecked initially, we type check
+them as if all variable module identities were distinct. Then, when
+we perform linking (we \verb|include| or we unify some module
+identities), we check again to see if we've discovered some instance
+overlap. This linking check is akin to the eager check that is
+performed today for type families; it would need to be implemented for
+type classes as well. However, there is a twist: we are \emph{redoing}
+the overlap check now that some identities have been unified.
+
+As an implementation trick, one could defer the check until \verb|C|
+is compiled, keeping in line with GHC's lazy ``don't check for overlap
+until the use site'' behavior. (Once again, unsound for type families.)
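
To make the linking check concrete, here is a sketch of the moment the
overlap surfaces, assuming the signatures above are bundled into a
hypothetical package \verb|p| (the package names are ours):

\begin{verbatim}
package q where
  include p            -- brings in A, B, C from above
  A = [ data T = MkT ]
  B = A                -- linking unifies A.T with B.T ...
  -- ... so C's instances Show A.T and Show B.T now overlap:
  -- the redone overlap check rejects this link step (or, with the
  -- lazy trick, rejects any eventual use of C's instances)
\end{verbatim}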
+
+\paragraph{What about module inequalities?} An older proposal was for
+signatures to contain ``module inequalities'', i.e., assertions that two
+modules are not equal. (Technically: we need to be able to apply this
+assertion to $\beta$ module variables, since \verb|A != B| while
+\verb|A.T = B.T|). Currently, Edward thinks that module inequalities
+are of only marginal utility with local instances (i.e., not enough to
+justify the implementation cost) and not useful at all in the world of
+global instances!
+
+With local instances, module inequalities could be useful to statically
+rule out examples like \verb|show A.T ++ show B.T|. Because such uses
+are not necessarily reflected in the signature, it would be a violation
+of separate module development to try to divine the constraint from the
+implementation itself. I claim this is of limited utility, however, because,
+as we mentioned earlier, we can compile these ``incoherent'' modules perfectly
+coherently. With global instances, all instances must be in the signature, so
+while it might be aesthetically displeasing to have the signature impose
+extra restrictions on linking identities, we can carry this out without
+violating the linking restriction.
+
+\section{Bits and bobs}
+
+\subsection{Abstract type synonyms}
+
+In Paper Backpack, abstract type synonyms are not permitted, because GHC doesn't
+understand how to deal with them. The purpose of this section is to describe
+one particular nastiness of abstract type synonyms, by way of the occurs check:
+
+\begin{verbatim}
+A :: [ type T ]
+B :: [ import qualified A; type T = [A.T] ]
+\end{verbatim}
+
+At this point, it is illegal for \verb|A = B|, otherwise this type synonym would
+fail the occurs check. This seems like pretty bad news, since every instance
+of the occurs check in the type-checker could constitute a module inequality.
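
To spell out the failure: if \verb|A = B|, then \verb|A.T| and
\verb|B.T| are the same type, and the synonym unfolds without bound:

\begin{verbatim}
type T = [A.T]     -- in B, with A.T = T after linking
       = [T]
       = [[T]]     -- substituting again, ad infinitum
\end{verbatim}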
+
+\section{Open questions}\label{sec:open-questions}
+
+Here are open problems about the implementation which still require
+hashing out.
+
+\begin{itemize}
+
+ \item In Section~\ref{sec:simplifying-backpack}, we argued that we
+ could implement Backpack without needing a shaping pass. We're
+ pretty certain that this will work for typechecking and
+ compiling fully definite packages with no recursive linking, but
+ in Section~\ref{sec:typechecking-indefinite}, we described some
+ of the prevailing difficulties of supporting signature linking.
+ Renaming is not an insurmountable problem, but backwards flow of
+ shaping information can be, and it is unclear how best to
+ accommodate this. This is probably the most important problem
+ to overcome.
+
+ \item In Section~\ref{sec:installing-indefinite}, a few choices for how to
+ store source code were pitched; however, there is no consensus on which
+ one is best.
+
+ \item What is the impact of the multiplicity of PackageIds on
+ dependency solving in Cabal? Old questions arise of what to prefer
+ when multiple package-versions are available (Cabal originally
+ only needed to solve this between different versions of the same
+ package, preferring the oldest version); with signatures,
+ there are even more choices. Should there be a complex solver that
+ does all signature solving, or a preprocessing step that puts
+ things back into a form the original Cabal solver understands?
+ Authors may want
+ to suggest policy for what packages should actually link against
+ signatures (so a crypto library doesn't accidentally link
+ against a null cipher package).
+ +\end{itemize} + +\end{document} diff --git a/docs/backpack/commands-new-new.tex b/docs/backpack/commands-new-new.tex new file mode 100644 index 0000000000..1f2466e14c --- /dev/null +++ b/docs/backpack/commands-new-new.tex @@ -0,0 +1,891 @@ +%!TEX root = paper/paper.tex +\usepackage{amsmath} +\usepackage{amssymb} +\usepackage{amsthm} +\usepackage{xspace} +\usepackage{color} +\usepackage{xifthen} +\usepackage{graphicx} +\usepackage{amsbsy} +\usepackage{mathtools} +\usepackage{stmaryrd} +\usepackage{url} +\usepackage{alltt} +\usepackage{varwidth} +% \usepackage{hyperref} +\usepackage{datetime} +\usepackage{subfig} +\usepackage{array} +\usepackage{multirow} +\usepackage{xargs} +\usepackage{marvosym} % for MVAt +\usepackage{bm} % for blackboard bold semicolon + + +%% HYPERREF COLORS +\definecolor{darkred}{rgb}{.7,0,0} +\definecolor{darkgreen}{rgb}{0,.5,0} +\definecolor{darkblue}{rgb}{0,0,.5} +% \hypersetup{ +% linktoc=page, +% colorlinks=true, +% linkcolor=darkred, +% citecolor=darkgreen, +% urlcolor=darkblue, +% } + +% Coloring +\definecolor{hilite}{rgb}{0.7,0,0} +\newcommand{\hilite}[1]{\color{hilite}#1\color{black}} +\definecolor{shade}{rgb}{0.85,0.85,0.85} +\newcommand{\shade}[1]{\colorbox{shade}{\!\ensuremath{#1}\!}} + +% Misc +\newcommand\evalto{\hookrightarrow} +\newcommand\elabto{\rightsquigarrow} +\newcommand\elabtox[1]{\stackrel{#1}\rightsquigarrow} +\newcommand{\yields}{\uparrow} +\newcommand\too{\Rightarrow} +\newcommand{\nil}{\cdot} +\newcommand{\eps}{\epsilon} +\newcommand{\Ups}{\Upsilon} +\newcommand{\avoids}{\mathrel{\#}} + +\renewcommand{\vec}[1]{\overline{#1}} +\newcommand{\rname}[1]{\textsc{#1}} +\newcommand{\infrule}[3][]{% + \vspace{0.5ex} + \frac{\begin{array}{@{}c@{}}#2\end{array}}% + {\mbox{\ensuremath{#3}}}% + \ifthenelse{\isempty{#1}}% + {}% + % {\hspace{1ex}\rlap{(\rname{#1})}}% + {\hspace{1ex}(\rname{#1})}% + \vspace{0.5ex} +} +\newcommand{\infax}[2][]{\infrule[#1]{}{#2}} +\newcommand{\andalso}{\hspace{.5cm}} 
+\newcommand{\suchthat}{~\mathrm{s.t.}~} +\newenvironment{notes}% + {\vspace{-1.5em}\begin{itemize}\setlength\itemsep{0pt}\small}% + {\end{itemize}} +\newcommand{\macrodef}{\mathbin{\overset{\mathrm{def}}{=}}} +\newcommand{\macroiff}{\mathbin{\overset{\mathrm{def}}{\Leftrightarrow}}} + + +\newcommand{\ttt}[1]{\text{\tt #1}} +\newcommand{\ttul}{\texttt{\char 95}} +\newcommand{\ttcc}{\texttt{:\!:}} +\newcommand{\ttlb}{{\tt {\char '173}}} +\newcommand{\ttrb}{{\tt {\char '175}}} +\newcommand{\tsf}[1]{\textsf{#1}} + +% \newcommand{\secref}[1]{\S\ref{sec:#1}} +% \newcommand{\figref}[1]{Figure~\ref{fig:#1}} +\newcommand{\marginnote}[1]{\marginpar[$\ast$ {\small #1} $\ast$]% + {$\ast$ {\small #1} $\ast$}} +\newcommand{\hschange}{\marginnote{!Haskell}} +\newcommand{\TODO}{\emph{TODO}\marginnote{TODO}} +\newcommand{\parheader}[1]{\textbf{#1}\quad} + +\newcommand{\file}{\ensuremath{\mathit{file}}} +\newcommand{\mapnil}{~\mathord{\not\mapsto}} + +\newcommand{\Ckey}[1]{\textbf{\textsf{#1}}} +\newcommand{\Cent}[1]{\texttt{#1}} +% \newcommand{\Cmod}[1]{\texttt{[#1]}} +% \newcommand{\Csig}[1]{\texttt{[\ttcc{}#1]}} +\newcommand{\Cmod}[1]{=\texttt{[#1]}} +\newcommand{\Csig}[1]{~\ttcc{}~\texttt{[#1]}} +\newcommand{\Cpath}[1]{\ensuremath{\mathsf{#1}}} +\newcommand{\Cvar}[1]{\ensuremath{\mathsf{#1}}} +\newcommand{\Ccb}[1]{\text{\ttlb} {#1} \text{\ttrb}} +\newcommand{\Cpkg}[1]{\texttt{#1}} +\newcommand{\Cmv}[1]{\ensuremath{\langle #1 \rangle}} +\newcommand{\Cto}[2]{#1 \mapsto #2} +\newcommand{\Ctoo}[2]{\Cpath{#1} \mapsto \Cpath{#2}} +\newcommand{\Crm}[1]{#1 \mapnil} +\newcommand{\Crmm}[1]{\Cpath{#1} \mapnil} +\newcommand{\Cthin}[1]{\ensuremath{\langle \Ckey{only}~#1 \rangle}} +\newcommand{\Cthinn}[1]{\ensuremath{\langle \Ckey{only}~\Cpath{#1} \rangle}} +\newcommand{\Cinc}[1]{\Ckey{include}~{#1}} +\newcommand{\Cincc}[1]{\Ckey{include}~\Cpkg{#1}} +\newcommand{\Cshar}[2]{~\Ckey{where}~{#1} \equiv {#2}} +\newcommand{\Csharr}[2]{~\Ckey{where}~\Cpath{#1} \equiv \Cpath{#2}} 
+\newcommand{\Ctshar}[2]{~\Ckey{where}~{#1} \equiv {#2}} +\newcommand{\Ctsharr}[3]{~\Ckey{where}~\Cpath{#1}.\Cent{#3} \equiv \Cpath{#2}.\Cent{#3}} +\newcommand{\Cbinds}[1]{\left\{\!\begin{array}{l} #1 \end{array}\!\right\}} +\newcommand{\Cbindsp}[1]{\left(\!\begin{array}{l} #1 \end{array}\!\right)} +\newcommand{\Cpkgs}[1]{\[\begin{array}{l} #1\end{array}\]} +\newcommand{\Cpkgsl}[1]{\noindent\ensuremath{\begin{array}{@{}l} #1\end{array}}} +\newcommand{\Ccomment}[1]{\ttt{\emph{--~#1}}} +\newcommand{\Cimp}[1]{\Ckey{import}~\Cpkg{#1}} +\newcommand{\Cimpas}[2]{\Ckey{import}~\Cpkg{#1}~\Ckey{as}~\Cvar{#2}} + +\newcommand{\Ctbinds}[1]{\left\{\!\vrule width 0.6pt \begin{array}{l} #1 \end{array} \vrule width 0.6pt \!\right\}} +\newcommand{\Ctbindsx}{\left\{\!\vrule width 0.6pt \; \vrule width 0.6pt \!\right\}} +\newcommand{\Ctbindsxx}{\left\{\!\vrule width 0.6pt \begin{array}{l}\!\!\!\!\\\!\!\!\!\end{array} \vrule width 0.6pt \!\right\}} +\newcommand{\Ctbindsxxx}{\left\{\!\vrule width 0.6pt \begin{array}{l}\!\!\!\!\\\!\!\!\!\\\!\!\!\!\end{array} \vrule width 0.6pt \!\right\}} + + +\newcommand{\Cpkgdef}[2]{% + \ensuremath{ + \begin{array}{l} + \Ckey{package}~\Cpkg{#1}~\Ckey{where}\\ + \hspace{1em}\begin{array}{l} + #2 + \end{array} + \end{array}}} +\newcommand{\Cpkgdefonly}[3]{% + \ensuremath{ + \begin{array}{l} + \Ckey{package}~\Cpkg{#1}\Cvar{(#2)}~\Ckey{where}\\ + \hspace{1em}\begin{array}{l} + #3 + \end{array} + \end{array}}} +\newcommand{\Ccc}{\mathbin{\ttcc{}}} +\newcommand{\Cbinmod}[2]{\Cvar{#1} = \texttt{[#2]}} +\newcommand{\Cbinsig}[2]{\Cvar{#1} \Ccc \texttt{[#2]}} +\newcommand{\Cinconly}[2]{\Ckey{include}~\Cpkg{#1}\Cvar{(#2)}} +\newcommand{\Cimponly}[2]{\Ckey{import}~\Cpkg{#1}\Cvar{(#2)}} +\newcommand{\Cimpmv}[3]{\Ckey{import}~\Cpkg{#1}\langle\Cvar{#2}\mapsto\Cvar{#3}\rangle} + + + + + +\newcommand{\oxb}[1]{\llbracket #1 \rrbracket} +\newcommand{\coxb}[1]{\{\hspace{-.5ex}| #1 |\hspace{-.5ex}\}} +\newcommand{\coxbv}[1]{\coxb{\vec{#1}}} 
+\newcommand{\angb}[1]{\ensuremath{\boldsymbol\langle #1 \boldsymbol\rangle}\xspace} +\newcommand{\angbv}[1]{\angb{\vec{#1}}} +\newcommand{\aoxbl}{\ensuremath{\boldsymbol\langle\hspace{-.5ex}|}} +\newcommand{\aoxbr}{\ensuremath{|\hspace{-.5ex}\boldsymbol\rangle}\xspace} +\newcommand{\aoxb}[1]{\ensuremath{\aoxbl{#1}\aoxbr}} +\newcommand{\aoxbv}[1]{\aoxb{\vec{#1}}} +\newcommand{\poxb}[1]{\ensuremath{% + (\hspace{-.5ex}|% + #1% + |\hspace{-.5ex})}\xspace} +\newcommand{\stof}[1]{{#1}^{\star}} +% \newcommand{\stof}[1]{\ensuremath{\underline{#1}}} +\newcommand{\sh}[1]{\ensuremath{\tilde{#1}}} + + +% \newenvironment{code}[1][t]% +% {\ignorespaces\begin{varwidth}[#1]{\textwidth}\begin{alltt}}% +% {\end{alltt}\end{varwidth}\ignorespacesafterend} +% \newenvironment{codel}[1][t]% +% {\noindent\begin{varwidth}[#1]{\textwidth}\noindent\begin{alltt}}% +% {\end{alltt}\end{varwidth}\ignorespacesafterend} + + +%% hack for subfloats in subfig ------------- +\makeatletter +\newbox\sf@box +\newenvironment{SubFloat}[2][]% + {\def\sf@one{#1}% + \def\sf@two{#2}% + \setbox\sf@box\hbox + \bgroup}% + {\egroup + \ifx\@empty\sf@two\@empty\relax + \def\sf@two{\@empty} + \fi + \ifx\@empty\sf@one\@empty\relax + \subfloat[\sf@two]{\box\sf@box}% + \else + \subfloat[\sf@one][\sf@two]{\box\sf@box}% + \fi} +\makeatother +%% ------------------------------------------ + +%% hack for top-aligned tabular cells ------------- +\newsavebox\topalignbox +\newcolumntype{L}{% + >{\begin{lrbox}\topalignbox + \rule{0pt}{\ht\strutbox}} + l + <{\end{lrbox}% + \raisebox{\dimexpr-\height+\ht\strutbox\relax}% + {\usebox\topalignbox}}} +\newcolumntype{C}{% + >{\begin{lrbox}\topalignbox + \rule{0pt}{\ht\strutbox}} + c + <{\end{lrbox}% + \raisebox{\dimexpr-\height+\ht\strutbox\relax}% + {\usebox\topalignbox}}} +\newcolumntype{R}{% + >{\begin{lrbox}\topalignbox + \rule{0pt}{\ht\strutbox}} + r + <{\end{lrbox}% + \raisebox{\dimexpr-\height+\ht\strutbox\relax}% + {\usebox\topalignbox}}} +%% 
------------------------------------------------ + +\newcommand\syn[1]{\textsf{#1}} +\newcommand\bsyn[1]{\textsf{\bfseries #1}} +\newcommand\msyn[1]{\textsf{#1}} +\newcommand{\cc}{\mathop{::}} + +% \newcommand{\Eimp}[1]{\bsyn{import}~{#1}} +% \newcommand{\Eonly}[2]{#1~\bsyn{only}~{#2}} +% \newcommand{\Ehide}[1]{~\bsyn{hide}~{#1}} +% \newcommand{\Enew}[1]{\bsyn{new}~{#1}} +% \newcommand{\Elocal}[2]{\bsyn{local}~{#1}~\bsyn{in}~{#2}} +% \newcommand{\Smv}[3]{\Emv[]{#1}{#2}{#3}} +\newcommand{\Srm}[2]{#1 \mathord{\setminus} #2} + +\newcommand{\cpath}{\varrho} +\newcommand{\fpath}{\rho} + +\newcommand{\ie}{\emph{i.e.},\xspace} +\newcommand{\eg}{\emph{e.g.},~} +\newcommand{\etal}{\emph{et al.}} + +\renewcommand{\P}[1]{\Cpkg{#1}} +\newcommand{\X}[1]{\Cvar{#1}} +\newcommand{\E}{\mathcal{E}} +\newcommand{\C}{\mathcal{C}} +\newcommand{\M}{\mathcal{M}} +\newcommand{\B}{\mathcal{B}} +\newcommand{\R}{\mathcal{R}} +\newcommand{\K}{\mathcal{K}} +\renewcommand{\L}{\mathcal{L}} +\newcommand{\D}{\mathcal{D}} + +%%%% NEW + +\newdateformat{numericdate}{% +\THEYEAR.\twodigit{\THEMONTH}.\twodigit{\THEDAY} +} + +% EL DEFNS +\newcommand{\shal}[1]{\syn{shallow}(#1)} +\newcommand{\exports}[1]{\syn{exports}(#1)} +\newcommand{\Slocals}[1]{\syn{locals}(#1)} +\newcommand{\Slocalsi}[2]{\syn{locals}(#1; #2)} +\newcommand{\specs}[1]{\syn{specs}(#1)} +\newcommand{\ELmkespc}[2]{\syn{mkespc}(#1;#2)} +\newcommand{\Smkeenv}[1]{\syn{mkeenv}(#1)} +\newcommand{\Smklocaleenv}[2]{\syn{mklocaleenv}(#1;#2)} +\newcommand{\Smklocaleenvespcs}[1]{\syn{mklocaleenv}(#1)} +\newcommand{\Smkphnms}[1]{\syn{mkphnms}(#1)} +\newcommand{\Smkphnm}[1]{\syn{mkphnm}(#1)} +\newcommand{\Sfilterespc}[2]{\syn{filterespc}(#1;#2)} +\newcommand{\Sfilterespcs}[2]{\syn{filterespcs}(#1;#2)} +\newcommand{\Simps}[1]{\syn{imps}(#1)} + + + +% IL DEFNS +\newcommand{\dexp}{\mathit{dexp}} +\newcommand{\fexp}{\mathit{fexp}} +\newcommand{\tfexp}{\mathit{tfexp}} +\newcommand{\pexp}{\mathit{pexp}} +\newcommand{\dtyp}{\mathit{dtyp}} 
+\newcommand{\ftyp}{\mathit{ftyp}} +\newcommand{\hsmod}{\mathit{hsmod}} +\newcommand{\fenv}{\mathit{fenv}} +\newcommand{\ILmkmod}[6]{\syn{mkmod}(#1; #2; #3; #4; #5; #6)} +\newcommand{\ILmkstubs}[3]{\syn{mkstubs}(#1; #2; #3)} +\newcommand{\Smkstubs}[1]{\syn{mkstubs}(#1)} +\newcommand{\ILentnames}[1]{\syn{entnames}(#1)} +\newcommand{\ILmkfenv}[1]{\syn{mkfenv}(#1)} +\newcommand{\ILmkdtyp}[1]{\syn{mkdtyp}(#1)} +\newcommand{\ILmkknd}[1]{\syn{mkknd}(#1)} +\newcommand{\ILmkimpdecl}[2]{\syn{mkimpdecl}(#1;#2)} +\newcommand{\ILmkimpdecls}[2]{\syn{mkimpdecls}(#1;#2)} +\newcommand{\ILmkimpspec}[3]{\syn{mkimpspec}(#1;#2;#3)} +\newcommand{\ILmkentimp}[3]{\syn{mkentimp}(#1;#2;#3)} +\newcommand{\ILmkentimpp}[1]{\syn{mkentimp}(#1)} +\newcommand{\ILmkexp}[2]{\syn{mkexp}(#1;#2)} +\newcommand{\ILmkexpdecl}[2]{\syn{mkexpdecl}(#1;#2)} +\newcommand{\ILmkespc}[2]{\syn{mkespc}(#1;#2)} +\newcommand{\ILshal}[1]{\syn{shallow}(#1)} +\newcommand{\ILexports}[1]{\syn{exports}(#1)} +\newcommand{\ILdefns}[1]{\syn{defns}(#1)} +\newcommand{\ILdefnsi}[2]{\syn{defns}(#1;#2)} + +% CORE DEFNS +\newcommand{\Hentref}{\mathit{eref}} +\newcommand{\Hentimp}{\mathit{import}} +\newcommand{\Hentexp}{\mathit{export}} +\newcommand{\Himp}{\mathit{impdecl}} +\newcommand{\Himpspec}{\mathit{impspec}} +\newcommand{\Himps}{\mathit{impdecls}} +\newcommand{\Hexps}{\mathit{expdecl}} +\newcommand{\Hdef}{\mathit{def}} +\newcommand{\Hdefs}{\mathit{defs}} +\newcommand{\Hdecl}{\mathit{decl}} +\newcommand{\Hdecls}{\mathit{decls}} +\newcommand{\Heenv}{\mathit{eenv}} +\newcommand{\Haenv}{\mathit{aenv}} +% \newcommand{\HIL}[1]{{\scriptstyle\downarrow}#1} +\newcommand{\HIL}[1]{\check{#1}} + +\newcommand{\Hcmp}{\sqsubseteq} + +\newcommand{\uexp}{\mathit{uexp}} +\newcommand{\utyp}{\mathit{utyp}} +\newcommand{\typ}{\mathit{typ}} +\newcommand{\knd}{\mathit{knd}} +\newcommand{\kndstar}{\ttt{*}} +\newcommand{\kndarr}[2]{#1\ensuremath{\mathbin{\ttt{->}}}#2} +\newcommand{\kenv}{\mathit{kenv}} +\newcommand{\phnm}{\mathit{phnm}} 
+\newcommand{\spc}{\mathit{dspc}} +\newcommand{\spcs}{\mathit{dspcs}} +\newcommand{\espc}{\mathit{espc}} +\newcommand{\espcs}{\mathit{espcs}} +\newcommand{\ds}{\mathit{ds}} + +\newcommand{\shctx}{\sh{\Xi}_{\syn{ctx}}} +\newcommand{\shctxsigma}{\sh{\Sigma}_{\syn{ctx}}} + +\newcommand{\vdashsh}{\Vdash} + +% \newcommand{\vdashghc}{\vdash_{\!\!\mathrm{c}}^{\!\!\mathrm{\scriptscriptstyle EL}}} +% \newcommand{\vdashghcil}{\vdash_{\!\!\mathrm{c}}^{\!\!\mathrm{\scriptscriptstyle IL}}} +% \newcommand{\vdashshghc}{\vdashsh_{\!\!\mathrm{c}}^{\!\!\mathrm{\scriptscriptstyle EL}}} +\newcommand{\vdashghc}{\vdash_{\!\!\mathrm{c}}} +\newcommand{\vdashghcil}{\vdash_{\!\!\mathrm{c}}^{\!\!\mathrm{\scriptscriptstyle IL}}} +\newcommand{\vdashshghc}{\vdashsh_{\!\!\mathrm{c}}} + +% CORE STUFF +\newcommandx*{\JCModImp}[5][1=\sh\B, 2=\nu_0, usedefault=@]% + {#1;#2 \vdashshghc #3;#4 \elabto #5} +\newcommandx*{\JIlCModImp}[5][1=\fenv, 2=f_0, usedefault=@]% + {#1;#2 \vdashghcil #3;#4 \elabto #5} +\newcommandx*{\JCSigImp}[5][1=\sh\B, 2=\sh\tau, usedefault=@]% + {#1;#2 \vdashshghc #3;#4 \elabto #5} + +\newcommandx*{\JCImpDecl}[3][1=\sh\B, usedefault=@]% + {#1 \vdashshghc #2 \elabto #3} +\newcommandx*{\JCImp}[4][1=\sh\B, 2=p, usedefault=@]% + {#1;#2 \vdashshghc #3 \elabto #4} +\newcommandx*{\JIlCImpDecl}[3][1=\fenv, usedefault=@]% + {#1 \vdashghcil #2 \elabto #3} +\newcommandx*{\JIlCImp}[4][1=\fenv, 2=f, usedefault=@]% + {#1;#2 \vdashghcil #3 \elabto #4} + +\newcommandx*{\JCModExp}[4][1=\nu_0, 2=\Heenv, usedefault=@]% + {#1;#2 \vdashshghc #3 \elabto #4} +\newcommandx*{\JIlCModExp}[4][1=f_0, 2=\HIL\Heenv, usedefault=@]% + {#1;#2 \vdashghcil #3 \elabto #4} + +\newcommandx*{\JCModDef}[5][1=\Psi, 2=\nu_0, 3=\Heenv, usedefault=@]% + {#1; #2; #3 \vdashghcil #4 : #5} +\newcommandx*{\JIlCModDef}[5][1=\fenv, 2=f_0, 3=\HIL\Heenv, usedefault=@]% + {#1; #2; #3 \vdashghcil #4 : #5} +\newcommandx*{\JCSigDecl}[5][1=\Psi, 2=\sh\tau, 3=\Heenv, usedefault=@]% + {#1; #2; #3 \vdashghcil #4 : #5} + 
+\newcommandx*{\JCExp}[6][1=\sh\Psi, 2=\nu_0, 3=\Hdefs, 4=\Heenv, usedefault=@]% + {#1;#2;#3;#4 \vdashshghc #5 \elabto #6} +\newcommandx*{\JIlCExp}[4][1=f_0, 2=\HIL\Heenv, usedefault=@]% + {#1;#2 \vdashghcil #3 \elabto #4} + +\newcommandx*{\JCRefExp}[7][1=\sh\Psi, 2=\nu_0, 3=\Hdefs, 4=\Heenv, usedefault=@]% + {#1;#2;#3;#4 \vdashshghc #5 \elabto #6:#7} +\newcommandx*{\JIlCRefExp}[7][1=\fenv, 2=f_0, 3=\HIL\Hdefs, 4=\HIL\Heenv, usedefault=@]% + {#1;#2;#3;#4 \vdashghcil #5 \elabto #6:#7} + +\newcommandx*{\JCMod}[4][1=\Gamma, 2=\nu_0, usedefault=@]% + {#1; #2 \vdashghc #3 : #4} +\newcommandx*{\JIlCMod}[3][1=\fenv, usedefault=@]% + {#1 \vdashghcil #2 : #3} +\newcommandx*{\JCSig}[5][1=\Gamma, 2=\sh\tau, usedefault=@]% + {#1; #2 \vdashghc #3 \elabto #4;#5} +\newcommandx*{\JCShSig}[5][1=\Gamma, 2=\sh\tau, usedefault=@]% + {#1; #2 \vdashghc #3 \elabto #4;#5} +\newcommandx*{\JCModElab}[5][1=\Gamma, 2=\nu_0, usedefault=@]% + % {#1; #2 \vdashghc #3 : #4 \elabto #5} + {#1; #2 \vdashghc #3 : #4 \;\shade{\elabto #5}} + +\newcommandx*{\JCWfEenv}[2][1=\Haenv, usedefault=@]% + {#1 \vdashshghc #2~\syn{wf}} +\newcommandx*{\JCWfEenvMap}[2][1=\Haenv, usedefault=@]% + {#1 \vdashshghc #2~\syn{wf}} +\newcommandx*{\JIlCWfEenv}[2][1=\HIL\Haenv, usedefault=@]% + {#1 \vdashghcil #2~\syn{wf}} +\newcommandx*{\JIlCWfEenvMap}[2][1=\HIL\Haenv, usedefault=@]% + {#1 \vdashghcil #2~\syn{wf}} + +\newcommandx*{\JIlTfexp}[3][1=\fenv, 2=f_0, usedefault=@]% + {#1; #2 \vdash #3} + + + + % IL STUFF + +\newcommandx*{\JIlWf}[2][1=\fenv, usedefault=@]% + {#1 \vdash #2 ~\syn{wf}} +\newcommandx*{\JIlKnd}[4][1=\fenv, 2=\kenv, usedefault=@]% + {#1;#2 \vdashghcil #3 \mathrel{\cc} #4} +% \newcommandx*{\JIlSub}[4][1=\fenv, 2=f, usedefault=@]% +% {#1;#2 \vdash #3 \le #4} +\newcommandx*{\JIlSub}[2][usedefault=@]% + {\vdash #1 \le #2} +\newcommandx*{\JIlMerge}[3][usedefault=@]% + {\vdash #1 \oplus #2 \Rightarrow #3} + +\newcommandx*{\JIlDexp}[2][1=\fenv, usedefault=@]% + {#1 \vdash #2} 
+\newcommandx*{\JIlDexpTyp}[3][1=\fenv, usedefault=@]% + {#1 \vdash #2 : #3} + +\newcommandx*{\JIlWfFenv}[2][1=\nil, usedefault=@]% + {#1 \vdash #2 ~\syn{wf}} +\newcommandx*{\JIlWfFtyp}[2][1=\fenv, usedefault=@]% + {#1 \vdash #2 ~\syn{wf}} +\newcommandx*{\JIlWfSpc}[2][1=\fenv, usedefault=@]% + {#1 \vdash #2 ~\syn{wf}} +\newcommandx*{\JIlWfESpc}[2][1=\fenv, usedefault=@]% + {#1 \vdash #2 ~\syn{wf}} +\newcommandx*{\JIlWfSig}[2][1=\fenv, usedefault=@]% + {#1 \vdash #2 ~\syn{wf}} +\newcommandx*{\JIlWfFtypSpecs}[2][1=\fenv, usedefault=@]% + {#1 \vdash #2 ~\syn{specs-wf}} +\newcommandx*{\JIlWfFtypExps}[2][1=\fenv, usedefault=@]% + {#1 \vdash #2 ~\syn{exports-wf}} +\newcommandx*{\JIlWfFenvDeps}[2][1=\fenv, usedefault=@]% + {#1 \vdash #2 ~\syn{deps-wf}} + +% WF TYPE STUFF IN EL +\newcommandx*{\JPkgValid}[1]% + {\vdash #1 ~\syn{pkg-valid}} +\newcommandx*{\JWfPkgCtx}[1][1=\Delta, usedefault=@]% + {\vdash #1 ~\syn{wf}} +\newcommandx*{\JWfPhCtx}[2][1=\nil, usedefault=@]% + {#1 \vdash #2 ~\syn{wf}} +\newcommandx*{\JWfModTyp}[2][1=\Psi, usedefault=@]% + {#1 \vdash #2 ~\syn{wf}} +\newcommandx*{\JWfModTypPol}[3][1=\Psi, usedefault=@]% + {#1 \vdash #2^{#3} ~\syn{wf}} +\newcommandx*{\JWfLogSig}[2][1=\Psi, usedefault=@]% + {#1 \vdash #2 ~\syn{wf}} +\newcommandx*{\JWfSpc}[2][1=\Psi, usedefault=@]% + {#1 \vdash #2 ~\syn{wf}} +\newcommandx*{\JWfESpc}[2][1=\Psi, usedefault=@]% + {#1 \vdash #2 ~\syn{wf}} +\newcommandx*{\JWfSig}[2][1=\nil, usedefault=@]% + {#1 \vdash #2 ~\syn{wf}} +\newcommandx*{\JWfModTypSpecs}[2][1=\Psi, usedefault=@]% + {#1 \vdash #2 ~\syn{specs-wf}} +\newcommandx*{\JWfModTypPolSpecs}[3][1=\Psi, usedefault=@]% + {#1 \vdash #2^{#3} ~\syn{specs-wf}} +\newcommandx*{\JWfModTypExps}[2][1=\Psi, usedefault=@]% + {#1 \vdash #2 ~\syn{exports-wf}} +\newcommandx*{\JWfPhCtxDeps}[2][1=\Psi, usedefault=@]% + {#1 \vdash #2 ~\syn{deps-wf}} +\newcommandx*{\JWfPhCtxDepsOne}[4][1=\Psi, usedefault=@]% + {#1 \vdash \styp{#2}{#3}{#4} ~\syn{deps-wf}} + +% WF SHAPE STUFF IN EL 
+\newcommandx*{\JWfShPhCtx}[2][1=\nil, usedefault=@]% + {#1 \vdashsh #2 ~\syn{wf}} +\newcommandx*{\JWfModSh}[2][1=\sh\Psi, usedefault=@]% + {#1 \vdashsh #2 ~\syn{wf}} +\newcommandx*{\JWfModShPol}[3][1=\sh\Psi, usedefault=@]% + {#1 \vdashsh #2^{#3} ~\syn{wf}} +\newcommandx*{\JWfShLogSig}[2][1=\sh\Psi, usedefault=@]% + {#1 \vdashsh #2 ~\syn{wf}} +\newcommandx*{\JWfShSpc}[2][1=\sh\Psi, usedefault=@]% + {#1 \vdashsh #2 ~\syn{wf}} +\newcommandx*{\JWfShESpc}[2][1=\sh\Psi, usedefault=@]% + {#1 \vdashsh #2 ~\syn{wf}} +\newcommandx*{\JWfShSig}[2][1=\nil, usedefault=@]% + {#1 \vdashsh #2 ~\syn{wf}} +\newcommandx*{\JWfModShSpecs}[2][1=\sh\Psi, usedefault=@]% + {#1 \vdashsh #2 ~\syn{specs-wf}} +\newcommandx*{\JWfModShPolSpecs}[3][1=\sh\Psi, usedefault=@]% + {#1 \vdashsh #2^{#3} ~\syn{specs-wf}} +\newcommandx*{\JWfModShExps}[2][1=\sh\Psi, usedefault=@]% + {#1 \vdashsh #2 ~\syn{exports-wf}} +\newcommandx*{\JWfEenv}[4][1=\sh\Psi, 2=\nu_0, 3=\Hdefs, usedefault=@]% + {#1;#2;#3 \vdashshghc #4 ~\syn{wf}} + +\newcommandx*{\JCoreKnd}[4][1=\Psi, 2=\kenv, usedefault=@]% + {#1;#2 \vdashghc #3 \mathrel{\cc} #4} + +\newcommandx*{\JStampEq}[2]% + {\vdash #1 \equiv #2} +\newcommandx*{\JStampNeq}[2]% + {\vdash #1 \not\equiv #2} +\newcommandx*{\JUnif}[3]% + {\syn{unify}(#1 \doteq #2) \elabto #3} +\newcommandx*{\JUnifM}[2]% + {\syn{unify}(#1) \elabto #2} + +\newcommandx*{\JModTypWf}[1]% + {\vdash #1 ~\syn{wf}} + +\newcommandx*{\JModSub}[2]% + {\vdash #1 \le #2} +\newcommandx*{\JModSup}[2]% + {\vdash #1 \ge #2} +\newcommandx*{\JShModSub}[2]% + {\vdashsh #1 \le #2} + +\newcommandx*{\JModEq}[2]% + {\vdash #1 \equiv #2} +% \newcommandx*{\JCShModEq}[3][3=\C]% +% {\vdashsh #1 \equiv #2 \mathbin{|} #3} + +\newcommandx*{\JETyp}[4][1=\Gamma, 2=\shctxsigma, usedefault=@]% + {#1;#2 \vdash #3 : #4} +\newcommandx*{\JETypElab}[5][1=\Gamma, 2=\shctxsigma, usedefault=@]% + {\JETyp[#1][#2]{#3}{#4} \elabto #5} +\newcommandx*{\JESh}[3][1=\sh\Gamma, usedefault=@]% + {#1 \vdashsh #2 \Rightarrow #3} + 
+\newcommandx*{\JBTyp}[5][1=\Delta, 2=\Gamma, 3=\shctx, usedefault=@]% + {#1;#2;#3 \vdash #4 : #5} +\newcommandx*{\JBTypElab}[6][1=\Delta, 2=\Gamma, 3=\shctx, usedefault=@]% + % {\JBTyp[#1][#2][#3]{#4}{#5} \elabto #6} + {\JBTyp[#1][#2][#3]{#4}{#5} \;\shade{\elabto #6}} +\newcommandx*{\JBSh}[4][1=\Delta, 2=\sh\Gamma, usedefault=@]% + {#1;#2 \vdashsh #3 \Rightarrow #4} + +\newcommandx*{\JBVTyp}[4][1=\Delta, 2=\shctx, usedefault=@]% + {#1;#2 \vdash #3 : #4} +\newcommandx*{\JBVTypElab}[5][1=\Delta, 2=\shctx, usedefault=@]% + % {\JBVTyp[#1][#2]{#3}{#4} \elabto #5} + {\JBVTyp[#1][#2]{#3}{#4} \;\shade{\elabto #5}} +\newcommandx*{\JBVSh}[4][1=\Delta, usedefault=@]% + {#1 \vdashsh #2 \Rightarrow #3;\, #4} + +\newcommandx*{\JImp}[3][1=\Gamma, usedefault=@]% + {#1 \vdashimp #2 \elabto #3} +\newcommandx*{\JShImp}[3][1=\sh\Gamma, usedefault=@]% + {#1 \vdashshimp #2 \elabto #3} + +\newcommandx*{\JGhcMod}[4]% + {#1; #2 \vdashghc #3 : #4} +\newcommandx*{\JShGhcMod}[4]% + {#1; #2 \vdashshghc #3 : #4} + +\newcommandx*{\JGhcSig}[5]% + {#1; #2 \vdashghc #3 \elabto #4;#5} +\newcommandx*{\JShGhcSig}[5]% + {#1; #2 \vdashshghc #3 \elabto #4;#5} + +\newcommandx*{\JThin}[3][1=t, usedefault=@]% + {\vdash #2 \xrightarrow{~#1~} #3} +\newcommandx*{\JShThin}[3][1=t, usedefault=@]% + {\vdashsh #2 \xrightarrow{~#1~} #3} + +\newcommandx*{\JShMatch}[3][1=\nu, usedefault=@]% + {#1 \vdash #2 \sqsubseteq #3} + +\newcommandx*{\JShTrans}[4]% + {\vdash #1 \le_{#2} #3 \elabto #4} + +\newcommandx*{\JMerge}[3]% + {\vdash #1 + #2 \Rightarrow #3} +\newcommandx*{\JShMerge}[5]% + {\vdashsh #1 + #2 \Rightarrow #3;\, #4;\, #5} +\newcommandx*{\JShMergeNew}[4]% + {\vdashsh #1 + #2 \Rightarrow #3;\, #4} +\newcommandx*{\JShMergeSimple}[3]% + {\vdashsh #1 + #2 \Rightarrow #3} + +\newcommandx*{\JDTyp}[3][1=\Delta, usedefault=@]% + {#1 \vdash #2 : #3} +\newcommandx*{\JDTypElab}[4][1=\Delta, usedefault=@]% + % {#1 \vdash #2 : #3 \elabto #4} + {#1 \vdash #2 : #3 \;\shade{\elabto #4}} + +\newcommandx*{\JTTyp}[2][1=\Delta, 
usedefault=@]% + {#1 \vdash #2} + +\newcommandx*{\JSound}[3][1=\Psi_\syn{ctx}, usedefault=@]% + {#1 \vdash #2 \sim #3} + +\newcommandx*{\JSoundOne}[4][1=\Psi, 2=\fenv, usedefault=@]% + {\vdash #3 \sim #4} +% \newcommand{\Smodi}[4]{\ensuremath{\oxb{=#2 \cc #3 \imps #4}^{#1}}} +\newcommand{\Smodi}[3]{\ensuremath{\oxb{=#2 \cc #3}^{#1}}} +\newcommand{\Smod}[2]{\Smodi{+}{#1}{#2}} +\newcommand{\Ssig}[2]{\Smodi{-}{#1}{#2}} +\newcommand{\Sreq}[2]{\Smodi{?}{#1}{#2}} +\newcommand{\Shole}[2]{\Smodi{\circ}{#1}{#2}} + +\newcommand{\SSmodi}[2]{\ensuremath{\oxb{=#2}^{#1}}} +\newcommand{\SSmod}[1]{\SSmodi{+}{#1}} +\newcommand{\SSsig}[1]{\SSmodi{-}{#1}} +\newcommand{\SSreq}[1]{\SSmodi{?}{#1}} +\newcommand{\SShole}[1]{\SSmodi{\circ}{#1}} + +% \newcommand{\styp}[3]{\oxb{{#1}\cc{#2}}^{#3}} +\newcommand{\styp}[3]{{#1}{:}{#2}^{#3}} +\newcommand{\stm}[2]{\styp{#1}{#2}{\scriptscriptstyle+}} +\newcommand{\sts}[2]{\styp{#1}{#2}{\scriptscriptstyle-}} + +% \newcommand{\mtypsep}{[\!]} +\newcommand{\mtypsep}{\mbox{$\bm{;}$}} +\newcommand{\mtypsepsp}{\hspace{.3em}} +\newcommand{\msh}[3]{\aoxb{#1 ~\mtypsep~ #2 ~\mtypsep~ #3}} +\newcommand{\mtyp}[3]{ + \aoxb{\mtypsepsp #1 \mtypsepsp\mtypsep\mtypsepsp + #2 \mtypsepsp\mtypsep\mtypsepsp + #3 \mtypsepsp}} +\newcommand{\bigmtyp}[3]{\ensuremath{ + \left\langle\!\vrule \begin{array}{l} + #1 ~\mtypsep \\[0pt] + #2 ~\mtypsep \\ + #3 + \end{array} \vrule\!\right\rangle +}} + + +\newcommand{\mtypm}[2]{\mtyp{#1}{#2}^{\scriptstyle+}} +\newcommand{\mtyps}[2]{\mtyp{#1}{#2}^{\scriptstyle-}} +\newcommand{\bigmtypm}[2]{\bigmtyp{#1}{#2}^{\scriptstyle+}} +\newcommand{\bigmtyps}[2]{\bigmtyp{#1}{#2}^{\scriptstyle-}} + +\newcommand{\mref}{\ensuremath{\mathit{mref}}} +\newcommand{\selfpath}{\msyn{Local}} + +% \newcommand{\Ltyp}[3]{\oxb{#1 \mathbin{\scriptstyle\MVAt} #2}^{#3}} +% \newcommand{\Ltyp}[2]{\poxb{#1 \mathbin{\scriptstyle\MVAt} #2}} +\newcommand{\Ltyp}[2]{#1 {\scriptstyle\MVAt} #2} + +\newcommand{\Sshape}[1]{\ensuremath{\syn{shape}(#1)}} 
+\newcommand{\Srename}[2]{\ensuremath{\syn{rename}(#1;#2)}} +\newcommand{\Scons}[2]{\ensuremath{\syn{cons}(#1;#2)}} +\newcommand{\Smkreq}[1]{\ensuremath{\syn{hide}(#1)}} +\newcommand{\Sfv}[1]{\ensuremath{\syn{fv}(#1)}} +\newcommand{\Sdom}[1]{\ensuremath{\syn{dom}(#1)}} +\newcommand{\Srng}[1]{\ensuremath{\syn{rng}(#1)}} +\newcommand{\Sdomp}[2]{\ensuremath{\syn{dom}_{#1}(#2)}} +\newcommand{\Sclos}[1]{\ensuremath{\syn{clos}(#1)}} +\newcommand{\Scloss}[2]{\ensuremath{\syn{clos}_{#1}(#2)}} +\newcommand{\Snorm}[1]{\ensuremath{\syn{norm}(#1)}} +\newcommand{\Sident}[1]{\ensuremath{\syn{ident}(#1)}} +\newcommand{\Snec}[2]{\ensuremath{\syn{nec}(#1; #2)}} +\newcommand{\Sprovs}[1]{\ensuremath{\syn{provs}(#1)}} +\newcommand{\Smkstamp}[2]{\ensuremath{\syn{mkident}(#1; #2)}} +\newcommand{\Sname}[1]{\ensuremath{\syn{name}(#1)}} +\newcommand{\Snames}[1]{\ensuremath{\syn{names}(#1)}} +\newcommand{\Sallnames}[1]{\ensuremath{\syn{allnames}(#1)}} +\newcommand{\Shassubs}[1]{\ensuremath{\syn{hasSubs}(#1)}} +\newcommand{\Snooverlap}[1]{\ensuremath{\syn{nooverlap}(#1)}} +\newcommand{\Sreduce}[2]{\ensuremath{\syn{apply}(#1; #2)}} +\newcommand{\Smkfenv}[1]{\ensuremath{\syn{mkfenv}(#1)}} +\newcommand{\Svalidspc}[2]{\ensuremath{\syn{validspc}(#1; #2)}} +\newcommand{\Srepath}[2]{\ensuremath{\syn{repath}(#1; #2)}} +\newcommand{\Smksigenv}[2]{\ensuremath{\syn{mksigenv}(#1; #2)}} +\newcommand{\Smksigshenv}[2]{\ensuremath{\syn{mksigshenv}(#1; #2)}} +\newcommand{\Squalify}[2]{\ensuremath{\syn{qualify}(#1; #2)}} +\newcommandx*{\Sdepends}[2][1=\Psi, usedefault=@]% + {\ensuremath{\syn{depends}_{#1}(#2)}} +\newcommandx*{\Sdependss}[3][1=\Psi, 2=N, usedefault=@]% + {\ensuremath{\syn{depends}_{#1;#2}(#3)}} +\newcommandx*{\Sdependsss}[4][1=\Psi, 2=V, 3=\theta, usedefault=@]% + {\ensuremath{\syn{depends}_{#1;#2;#3}(#4)}} +\newcommand{\Snormsubst}[2]{\ensuremath{\syn{norm}(#1; #2)}} + +% \newcommand{\Smergeable}[2]{\ensuremath{\syn{mergeable}(#1; #2)}} +\newcommand{\mdef}{\mathrel{\bot}} 
+\newcommand{\Smergeable}[2]{\ensuremath{#1 \mdef #2}} + +\newcommand{\Sstamp}[1]{\ensuremath{\syn{stamp}(#1)}} +\newcommand{\Stype}[1]{\ensuremath{\syn{type}(#1)}} + +\newcommand{\Strue}{\ensuremath{\syn{true}}} +\newcommand{\Sfalse}{\ensuremath{\syn{false}}} + +\newcommandx*{\refsstar}[2][1=\nu_0, usedefault=@]% + {\ensuremath{\syn{refs}^{\star}}_{#1}(#2)} + +\renewcommand{\merge}{\boxplus} +\newcommand{\meet}{\sqcap} + +\newcommand{\Shaslocaleenv}[3]{\ensuremath{\syn{haslocaleenv}(#1;#2;#3)}} +\newcommand{\MTvalidnewmod}[3]{\ensuremath{\syn{validnewmod}(#1;#2;#3)}} +\newcommand{\Sdisjoint}[1]{\ensuremath{\syn{disjoint}(#1)}} +\newcommand{\Sconsistent}[1]{\ensuremath{\syn{consistent}(#1)}} +\newcommand{\Slocmatch}[2]{\ensuremath{\syn{locmatch}(#1;#2)}} +\newcommand{\Sctxmatch}[2]{\ensuremath{\syn{ctxmatch}(#1;#2)}} +\newcommand{\Snolocmatch}[2]{\ensuremath{\syn{nolocmatch}(#1;#2)}} +\newcommand{\Snoctxmatch}[2]{\ensuremath{\syn{noctxmatch}(#1;#2)}} +\newcommand{\Sislocal}[2]{\ensuremath{\syn{islocal}(#1;#2)}} +\newcommand{\Slocalespcs}[2]{\ensuremath{\syn{localespcs}(#1;#2)}} + +\newcommand{\Cprod}[1]{\syn{productive}(#1)} +\newcommand{\Cnil}{\nil} +\newcommand{\id}{\syn{id}} + +\newcommand{\nui}{\nu_{\syn{i}}} +\newcommand{\taui}{\tau_{\syn{i}}} +\newcommand{\Psii}{\Psi_{\syn{i}}} + +\newcommand{\vis}{\ensuremath{\mathsf{\scriptstyle V}}} +\newcommand{\hid}{\ensuremath{\mathsf{\scriptstyle H}}} + +\newcommand{\taum}[1]{\ensuremath{\tau_{#1}^{m_{#1}}}} + +\newcommand{\sigmamod}{\sigma_{\syn{m}}} +\newcommand{\sigmaprov}{\sigma_{\syn{p}}} + +\newcommand{\Svalidsubst}[2]{\ensuremath{\syn{validsubst}(#1;#2)}} +\newcommand{\Salias}[1]{\ensuremath{\syn{alias}(#1)}} +\newcommand{\Saliases}[1]{\ensuremath{\syn{aliases}(#1)}} +\newcommand{\Simp}[1]{\ensuremath{\syn{imp}(#1)}} +\newcommand{\Styp}[1]{\ensuremath{\syn{typ}(#1)}} +\newcommand{\Spol}[1]{\ensuremath{\syn{pol}(#1)}} + +\newcommand{\stoff}{\stof{(-)}} +\newcommand{\stheta}{\stof\theta} + + +%%%%%%% FOR THE 
PAPER! +\newcommand{\secref}[1]{Section~\ref{sec:#1}} +\newcommand{\figref}[1]{Figure~\ref{fig:#1}} + +% typesetting for module/path names +\newcommand{\mname}[1]{\textsf{#1}} +\newcommand{\m}[1]{\mname{#1}} + +% typesetting for package names +\newcommand{\pname}[1]{\textsf{#1}} + +\newcommand{\kpm}[2]{\angb{\pname{#1}.#2}} + +% for core entities +\newcommand{\code}[1]{\texttt{#1}} +\newcommand{\core}[1]{\texttt{#1}} + +\newcommand{\req}{\bsyn{req}} +\newcommand{\hiding}[1]{\req~\m{#1}} + +\newcommand{\Emod}[1]{\ensuremath{[#1]}} +\newcommand{\Esig}[1]{\ensuremath{[\cc#1]}} +\newcommand{\Epkg}[2]{\bsyn{package}~\pname{#1}~\bsyn{where}~{#2}} +% \newcommand{\Epkgt}[3]{\bsyn{package}~{#1}~\bsyn{only}~{#2}~\bsyn{where}~{#3}} +\newcommand{\Epkgt}[3]{\bsyn{package}~\pname{#1}~{#2}~\bsyn{where}~{#3}} +\newcommand{\Einc}[1]{\bsyn{include}~\pname{#1}} +% \newcommand{\Einct}[2]{\bsyn{include}~{#1}~\bsyn{only}~{#2}} +% \newcommand{\Einctr}[3]{\bsyn{include}~{#1}~\bsyn{only}~{#2}~{#3}} +\newcommand{\Einct}[2]{\bsyn{include}~\pname{#1}~(#2)} +\newcommand{\Eincr}[2]{\bsyn{include}~\pname{#1}~\angb{#2}} +\newcommand{\Einctr}[3]{\bsyn{include}~\pname{#1}~(#2)~\angb{#3}} +\newcommand{\Emv}[2]{#1 \mapsto #2} +\newcommand{\Emvp}[2]{\m{#1} \mapsto \m{#2}} +\newcommand{\Etr}[3][~]{{#2}{#1}\langle #3 \rangle} +\newcommand{\Erm}[3][~]{{#2}{#1}\langle #3 \mapnil \rangle} +\newcommand{\Ethin}[1]{(#1)} +\newcommand{\Ethinn}[2]{(#1; #2)} + + +% \newcommand{\Pdef}[2]{\ensuremath{\begin{array}{l} \Phead{#1} #2\end{array}}} +% \newcommand{\Phead}[1]{\bsyn{package}~\pname{#1}~\bsyn{where} \\} +% \newcommand{\Pbndd}[2]{\hspace{1em}{#1} = {#2} \\} +% \newcommand{\Pbnd}[2]{\hspace{1em}\mname{#1} = {#2} \\} +% \newcommand{\Pref}[2]{\hspace{1em}\mname{#1} = \mname{#2} \\} +% \newcommand{\Pmod}[2]{\hspace{1em}\mname{#1} = [\code{#2}] \\} +% \newcommand{\Psig}[2]{\hspace{1em}\mname{#1} \cc [\code{#2}] \\} +\newcommand{\Pdef}[2]{\ensuremath{ + \begin{array}{@{\hspace{1em}}L@{\;\;}c@{\;\;}l} + 
\multicolumn{3}{@{}l}{\Phead{#1}} \\ + #2 + \end{array} +}} +\newcommand{\Pdeft}[3]{\ensuremath{ + \begin{array}{@{\hspace{1em}}L@{\;\;}c@{\;\;}l} + \multicolumn{3}{@{}l}{\Pheadt{#1}{#2}} \\ + #3 + \end{array} +}} +\newcommand{\Phead}[1]{\bsyn{package}~\pname{#1}~\bsyn{where}} +\newcommand{\Pheadt}[2]{\bsyn{package}~\pname{#1}~(#2)~\bsyn{where}} +\newcommand{\Pbnd}[2]{#1 &=& #2 \\} +\newcommand{\Pref}[2]{\mname{#1} &=& \mname{#2} \\} +\newcommand{\Pmod}[2]{\mname{#1} &=& [\code{#2}] \\} +\newcommand{\Pmodd}[2]{\mname{#1} &=& #2 \\} +\newcommand{\Psig}[2]{\mname{#1} &\cc& [\code{#2}] \\} +\newcommand{\Psigg}[2]{\mname{#1} &\cc& #2 \\} +\newcommand{\Pmulti}[1]{\multicolumn{3}{@{\hspace{1em}}l} {#1} \\} +\newcommand{\Pinc}[1]{\Pmulti{\Einc{#1}}} +\newcommand{\Pinct}[2]{\Pmulti{\Einct{#1}{#2}}} +\newcommand{\Pincr}[2]{\Pmulti{\Eincr{#1}{#2}}} +\newcommand{\Pinctr}[3]{\Pmulti{\Einctr{#1}{#2}{#3}}} +\newcommand{\Pmodbig}[2]{\mname{#1} &=& \left[ + \begin{codeblock} + #2 + \end{codeblock} +\right] \\} +\newcommand{\Psigbig}[2]{\mname{#1} &\cc& \left[ + \begin{codeblock} + #2 + \end{codeblock} +\right] \\} + +\newcommand{\Mimp}[1]{\msyn{import}~\mname{#1}} +\newcommand{\Mimpq}[1]{\msyn{import}~\msyn{qualified}~\mname{#1}} +\newcommand{\Mimpas}[2]{\msyn{import}~\mname{#1}~\msyn{as}~\mname{#2}} +\newcommand{\Mimpqas}[2]{\msyn{import}~\msyn{qualified}~\mname{#1}~\msyn{as}~\mname{#2}} +\newcommand{\Mexp}[1]{\msyn{export}~(#1)} + +\newcommand{\illtyped}{\hfill ($\times$) \; ill-typed} + +\newenvironment{example}[1][LL]% + {\ignorespaces \begin{flushleft}\begin{tabular}{@{\hspace{1em}}#1} }% + {\end{tabular}\end{flushleft} \ignorespacesafterend} + +\newenvironment{counterexample}[1][LL]% + {\ignorespaces \begin{flushleft}\begin{tabular}{@{\hspace{1em}}#1} }% + {& \text{\illtyped} \end{tabular}\end{flushleft} \ignorespacesafterend} + +\newenvironment{codeblock}% + {\begin{varwidth}{\textwidth}\begin{alltt}}% + {\end{alltt}\end{varwidth}} + 
+\newcommand{\fighead}{\hrule\vspace{1.5ex}} +\newcommand{\figfoot}{\vspace{1ex}\hrule} +\newenvironment{myfig}{\fighead\small}{\figfoot} + +\newcommand{\Mhead}[2]{\syn{module}~{#1}~\syn{(}{#2}\syn{)}~\syn{where}} +\newcommand{\Mdef}[3]{\ensuremath{ + \begin{array}{@{\hspace{1em}}L} + \multicolumn{1}{@{}L}{\Mhead{#1}{\core{#2}}} \\ + #3 + \end{array} +}} + +\newcommand{\HMstof}[1]{\ensuremath{#1}} +% \newcommand{\HMstof}[1]{\ensuremath{\lfloor #1 \rfloor}} +% \newcommand{\HMstof}[1]{\ensuremath{\underline{#1}}} +% \newcommand{\HMstof}[1]{{#1}^{\star}} +\newcommand{\HMhead}[2]{\syn{module}~\(\HMstof{#1}\)~\syn{(}{#2}\syn{)}~\syn{where}} +\newcommand{\HMdef}[3]{\ensuremath{ + \begin{array}{@{\hspace{1em}}L} + \multicolumn{1}{@{}L}{\HMhead{#1}{\core{#2}}} \\ + #3 + \end{array} +}} +\newcommand{\HMimpas}[3]{% + \msyn{import}~\ensuremath{\HMstof{#1}}~% + \msyn{as}~\mname{#2}~\msyn{(}\core{#3}\msyn{)}} +\newcommand{\HMimpqas}[3]{% + \msyn{import}~\msyn{qualified}~\ensuremath{\HMstof{#1}}~% + \msyn{as}~\mname{#2}~\msyn{(}\core{#3}\msyn{)}} + +\newcommand{\stackedenv}[2][c]{\ensuremath{ + \begin{array}{#1} + #2 + \end{array} +}} + +% \renewcommand{\nil}{\mathsf{nil}} +\renewcommand{\nil}{\mathrel\emptyset} + +% \newcommand{\ee}{\mathit{ee}} +\newcommand{\ee}{\mathit{dent}} + +\renewcommand{\gets}{\mathbin{\coloneqq}}
\ No newline at end of file diff --git a/docs/backpack/commands-rebindings.tex b/docs/backpack/commands-rebindings.tex new file mode 100644 index 0000000000..96ad2bb2cc --- /dev/null +++ b/docs/backpack/commands-rebindings.tex @@ -0,0 +1,57 @@ + + +%% hide the full syntax of shapes/types for the paper +\newcommand{\fullmsh}[3]{\aoxb{#1 ~\mtypsep~ #2 ~\mtypsep~ #3}} +\newcommand{\fullmtyp}[3]{ + \aoxb{\mtypsepsp #1 \mtypsepsp\mtypsep\mtypsepsp + #2 \mtypsepsp\mtypsep\mtypsepsp + #3 \mtypsepsp}} +\newcommand{\fullbigmtyp}[3]{\ensuremath{ + \left\langle\!\vrule \begin{array}{l} + #1 ~\mtypsep \\[0pt] + #2 ~\mtypsep \\ + #3 + \end{array} \vrule\!\right\rangle +}} +\renewcommand{\msh}[2]{\aoxb{#1 \mtypsepsp\mtypsep\mtypsepsp #2}} +\renewcommand{\mtyp}[2]{ + \aoxb{#1 ~\mtypsep~ #2}} +\newcommand{\mtypstretch}[2]{ + \left\langle\!\vrule + \mtypsepsp #1 \mtypsepsp\mtypsep\mtypsepsp #2 \mtypsepsp + \vrule\!\right\rangle +} +\renewcommand{\bigmtyp}[2]{\ensuremath{ + \left\langle\!\vrule \begin{array}{l} + #1 ~\mtypsep \\[0pt] #2 + \end{array} \vrule\!\right\rangle +}} + + + +%% change syntax of signatures +\renewcommand{\Esig}[1]{\ensuremath{\,[#1]}} + +\renewcommandx*{\JBVSh}[3][1=\Delta, usedefault=@]% + {#1 \vdashsh #2 \Rightarrow #3} + + +% JUDGMENTS +\renewcommandx*{\JBTypElab}[6][1=\Delta, 2=\Gamma, 3=\shctx, usedefault=@]% + % {\JBTyp[#1][#2][#3]{#4}{#5} \elabto #6} + {\JBTyp[#1][#2][#3]{#4}{#5} \;\shade{\elabto #6}} +\renewcommandx*{\JBVTypElab}[5][1=\Delta, 2=\shctx, usedefault=@]% + % {\JBVTyp[#1][#2]{#3}{#4} \elabto #5} + {\JBVTyp[#1][#2]{#3}{#4} \;\shade{\elabto #5}} +\renewcommandx*{\JDTypElab}[4][1=\Delta, usedefault=@]% + % {#1 \vdash #2 : #3 \elabto #4} + {#1 \vdash #2 : #3 \;\shade{\elabto #4}} +\renewcommandx*{\JCModElab}[5][1=\Gamma, 2=\nu_0, usedefault=@]% + % {#1; #2 \vdashghc #3 : #4 \elabto #5} + {#1; #2 \vdashghc #3 : #4 \;\shade{\elabto #5}} + + +%%% Local Variables: +%%% mode: latex +%%% TeX-master: "paper" +%%% End: diff --git 
a/docs/backpack/diagrams.pdf b/docs/backpack/diagrams.pdf Binary files differnew file mode 100644 index 0000000000..a50916b234 --- /dev/null +++ b/docs/backpack/diagrams.pdf diff --git a/docs/backpack/diagrams.xoj b/docs/backpack/diagrams.xoj Binary files differnew file mode 100644 index 0000000000..acec8d02de --- /dev/null +++ b/docs/backpack/diagrams.xoj diff --git a/docs/backpack/pkgdb.png b/docs/backpack/pkgdb.png Binary files differnew file mode 100644 index 0000000000..9779444b42 --- /dev/null +++ b/docs/backpack/pkgdb.png diff --git a/docs/comm/exts/ndp.html b/docs/comm/exts/ndp.html deleted file mode 100644 index 2c79d728d5..0000000000 --- a/docs/comm/exts/ndp.html +++ /dev/null @@ -1,360 +0,0 @@ -<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN"> -<html> - <head> - <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1"> - <title>The GHC Commentary - Parallel Arrays</title> - </head> - - <body BGCOLOR="FFFFFF"> - <h1>The GHC Commentary - Parallel Arrays</h1> - <p> - This section describes an experimental extension for high-performance - arrays, which comprises special syntax for array types and array - comprehensions, a set of optimising program transformations, and a set - of special-purpose libraries. The extension is currently only partially - implemented, but its development will be tracked here. - <p> - Parallel arrays originally got their name from the aim to provide an - architecture-independent programming model for a range of parallel - computers. However, since experiments showed that the approach is also - worthwhile for sequential array code, the emphasis has shifted to their - parallel evaluation semantics: As soon as any element in a parallel - array is demanded, all the other elements are evaluated, too. 
This - makes parallel arrays more strict than <a - href="http://haskell.org/onlinelibrary/array.html">standard Haskell 98 - arrays</a>, but also opens the door for a loop-based implementation - strategy that leads to significantly more efficient code. - <p> - The programming model, as well as the use of the <em>flattening - transformation</em>, which is central to the approach, has its origin in - the programming language <a - href="http://www.cs.cmu.edu/~scandal/nesl.html">Nesl</a>. - - <h2>More Sugar: Special Syntax for Array Comprehensions</h2> - <p> - The option <code>-XParr</code>, which is a dynamic hsc option that can - be reversed with <code>-XNoParr</code>, enables special syntax for - parallel arrays; the syntax is not essential to using parallel arrays, but - makes for significantly more concise programs. The switch works by - making the lexical analyser (located in <a - href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/parser/Lex.lhs"><code>Lex.lhs</code></a>) - recognise the tokens <code>[:</code> and <code>:]</code>. Given that - the additional productions in the parser (located in <a - href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/parser/Parser.y"><code>Parser.y</code></a>) - cannot be triggered without the lexer generating the necessary tokens, - there is no need to alter the behaviour of the parser. - <p> - The following additional syntax is accepted (the non-terminals are those - from the <a href="http://haskell.org/onlinereport/">Haskell 98 language - definition</a>): - <p> - <blockquote><pre> -atype -> '[:' type ':]' (parallel array type) - -aexp -> '[:' exp1 ',' ... ',' expk ':]' (explicit array, k >= 0) - | '[:' exp1 [',' exp2] '..' exp3 ':]' (arithmetic array sequence) - | '[:' exp '|' quals1 '|' ... '|' qualsn ':]' (array comprehension, n >= 1) - -quals -> qual1 ',' ... ',' qualn (qualifier list, n >= 1) - -apat -> '[:' pat1 ',' ... 
',' patk ':]' (array pattern, k >= 0) -</pre> - </blockquote> - <p> - Moreover, the extended comprehension syntax that allows for <em>parallel - qualifiers</em> (i.e., qualifiers separated by "<code>|</code>") is also - supported in list comprehensions. In general, the similarity to the - special syntax for lists is obvious. The two main differences are that - (a) arithmetic array sequences are always finite and (b) - <code>[::]</code> is not treated as a constructor in expressions and - patterns, but rather as a special case of the explicit array syntax. - The former is a simple consequence of the parallel evaluation semantics - of parallel arrays and the latter is due to arrays not being a - constructor-based data type. - <p> - As a naming convention, types and functions that are concerned with - parallel arrays usually contain the string <code>parr</code> or - <code>PArr</code> (often as a prefix), and where corresponding types or - functions for handling lists exist, the name is identical, except for - containing the substring <code>parr</code> instead of <code>list</code> - (possibly in caps). - <p> - The following implications are worth noting explicitly: - <ul> - <li>The value and pattern <code>[::]</code> is an empty explicit - parallel array (i.e., something of the form <code>ExplicitPArr ty - []</code> in the AST). This is in contrast to lists, which use the - nil-constructor instead. In the case of parallel arrays, using a - constructor would be rather awkward, as it is not a constructor-based - type. (This becomes rather clear in the desugarer.) - <li>As a consequence, array patterns have the general form <code>[:p1, - p2, ..., pn:]</code>, where <code>n</code> >= 0. Thus, two array - patterns overlap iff they have the same length -- an important property - for the pattern matching compiler. 
- </ul> - - <h2>Prelude Support for Parallel Arrays</h2> - <p> - The Prelude module <a - href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/lib/std/PrelPArr.lhs"><code>PrelPArr</code></a> - defines the standard operations and their types on parallel arrays and - provides a basic implementation based on boxed arrays. The interface of - <code>PrelPArr</code> is modelled after H98's <code>PrelList</code>, but - leaves out all functions that require infinite structures and adds - frequently needed array operations, such as permutations. Parallel - arrays are quite unique in that they use an entirely different - representation as soon as the flattening transformation is activated, - which is described further below. In particular, <code>PrelPArr</code> - defines the type <code>[::]</code> and operations to create, process, - and inspect parallel arrays. The type as well as the names of some of - the operations are also hardwired in <a - href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/prelude/TysWiredIn.lhs"><code>TysWiredIn</code></a> - (see the definition of <code>parrTyCon</code> in this module) and <a - href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/prelude/PrelNames.lhs"><code>PrelNames</code></a>. - This is again very much like the case of lists, where the type is - defined in <a - href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/lib/std/PrelBase.lhs"><code>PrelBase</code></a> - and similarly wired in; however, for lists the entirely - constructor-based definition is exposed to user programs, which is not - the case for parallel arrays. - - <h2>Desugaring Parallel Arrays</h2> - <p> - The parallel array extension requires the desugarer to replace all - occurrences of (1) explicit parallel arrays, (2) array patterns, and (3) - array comprehensions by a suitable combination of invocations of - operations defined in <code>PrelPArr</code>. 
- - <h4>Explicit Parallel Arrays</h4> - <p> - These are by far the simplest to remove. We simply replace every - occurrence of <code>[:<i>e<sub>1</sub></i>, ..., - <i>e<sub>n</sub></i>:]</code> by - <blockquote> - <code> - toP [<i>e<sub>1</sub></i>, ..., <i>e<sub>n</sub></i>] - </code> - </blockquote> - <p> - i.e., we build a list of the array elements, which is then converted - into a parallel array. - - <h4>Parallel Array Patterns</h4> - <p> - Array patterns are much more tricky, as element positions may contain - further patterns and the <a - href="../the-beast/desugar.html#patmat">pattern matching compiler</a> - requires us to flatten all those out. But before we turn to the gory - details, here is the basic idea: A flat array pattern matches exactly - iff its length corresponds to the length of the matched array. Hence, - if we have a set of flat array patterns matching an array value - <code>a</code>, it suffices to generate a Core <code>case</code> - expression that scrutinises <code>lengthP a</code> and has one - alternative for every length of array occurring in one of the patterns. - Moreover, there needs to be a default case catching all other array - lengths. In each alternative, array indexing (i.e., the function - <code>!:</code>) is used to bind array elements to the corresponding - pattern variables. This sounds easy enough and is essentially what the - parallel array equation of the function <a - href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/deSugar/DsUtils.lhs"><code>DsUtils</code></a><code>.mkCoAlgCaseMatchResult</code> - does. - <p> - Unfortunately, however, the pattern matching compiler expects that it - can turn (almost) any pattern into variable patterns, literals, or - constructor applications by way of the function <a - href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/deSugar/Match.lhs"><code>Match</code></a><code>.tidy1</code>. 
- And to make matters worse, some weird machinery in the module <a - href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/deSugar/Check.lhs"><code>Check</code></a> - insists on being able to reverse the process (essentially to pretty - print patterns in warnings for incomplete or overlapping patterns). - <p> - The solution to this is an (unlimited) set of <em>fake</em> constructors - for parallel arrays, courtesy of <a - href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/prelude/TysWiredIn.lhs"><code>TysWiredIn</code></a><code>.parrFakeCon</code>. - In other words, any pattern of the form <code>[:<i>p<sub>1</sub></i>, - ..., <i>p<sub>n</sub></i>:]</code> is transformed into - <blockquote> - <code> - MkPArr<i>n</i> <i>p<sub>1</sub></i> ... <i>p<sub>n</sub></i> - </code> - </blockquote> - <p> - by <code>Match.tidy1</code>, then run through the rest of the pattern - matching compiler, and finally picked up by - <code>DsUtils.mkCoAlgCaseMatchResult</code>, which converts it into a - <code>case</code> expression as outlined above. - <p> - As an example, consider the source expression - <blockquote><pre> -case v of - [:x1:] -> e1 - [:x2, x3, x4:] -> e2 - _ -> e3</pre> - </blockquote> - <p> - <code>Match.tidy1</code> converts it into a form that is equivalent to - <blockquote><pre> -case v of { - MkPArr1 x1 -> e1; - MkPArr3 x2 x3 x4 -> e2; - _ -> e3; -}</pre> - </blockquote> - <p> - which <code>DsUtils.mkCoAlgCaseMatchResult</code> turns into the - following Core code: - <blockquote><pre> - case lengthP v of - I# i# -> - case i# of l { - DFT -> e3 - 1 -> let x1 = v!:0 in e1 - 3 -> let x2 = v!:0; x3 = v!:1; x4 = v!:2 in e2 - }</pre> - </blockquote> - - <h4>Parallel Array Comprehensions</h4> - <p> - The most challenging construct of the three is array comprehensions. 
- In principle, it would be possible to transform them in essentially the - same way as list comprehensions, but this would lead to abysmally slow - code as desugaring of list comprehensions generates code that is - optimised for sequential, constructor-based structures. In contrast, - array comprehensions need to be transformed into code that solely relies - on collective operations and avoids the creation of many small - intermediate arrays. - <p> - The transformation is implemented by the function <a - href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/deSugar/DsListComp.lhs"><code>DsListComp</code></a><code>.dsPArrComp</code>. - In the following, we denote this transformation function by the form - <code><<<i>e</i>>> pa ea</code>, where <code><i>e</i></code> - is the comprehension to be compiled and the arguments <code>pa</code> - and <code>ea</code> denote a pattern and the currently processed array - expression, respectively. The invariant constraining these two - arguments is that all elements in the array produced by <code>ea</code> - will <em>successfully</em> match against <code>pa</code>. 
- <p> - Given a source-level comprehension <code>[:e | qss:]</code>, we compile - it with <code><<[:e | qss:]>> () [:():]</code> using the - rules - <blockquote><pre> -<<[:e' | :]>> pa ea = mapP (\pa -> e') ea -<<[:e' | b , qs:]>> pa ea = <<[:e' | qs:]>> pa (filterP (\pa -> b) ea) -<<[:e' | p <- e, qs:]>> pa ea = - let ef = filterP (\x -> case x of {p -> True; _ -> False}) e - in - <<[:e' | qs:]>> (pa, p) (crossP ea ef) -<<[:e' | let ds, qs:]>> pa ea = - <<[:e' | qs:]>> (pa, (x_1, ..., x_n)) - (mapP (\v@pa -> (v, let ds in (x_1, ..., x_n))) ea) -where - {x_1, ..., x_n} = DV (ds) -- Defined Variables -<<[:e' | qs | qss:]>> pa ea = - <<[:e' | qss:]>> (pa, (x_1, ..., x_n)) - (zipP ea <<[:(x_1, ..., x_n) | qs:]>>) -where - {x_1, ..., x_n} = DV (qs)</pre> - </blockquote> - <p> - We assume the denotation of <code>crossP</code> to be given by - <blockquote><pre> -crossP :: [:a:] -> [:b:] -> [:(a, b):] -crossP a1 a2 = let - len1 = lengthP a1 - len2 = lengthP a2 - x1 = concatP $ mapP (replicateP len2) a1 - x2 = concatP $ replicateP len1 a2 - in - zipP x1 x2</pre> - </blockquote> - <p> - For a more efficient implementation of <code>crossP</code>, see - <a - href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/lib/std/PrelPArr.lhs"><code>PrelPArr</code></a>. - <p> - Moreover, the following optimisations are important: - <ul> - <li>In the <code>p <- e</code> rule, if <code>pa == ()</code>, drop - it and simplify the <code>crossP ea ef</code> to <code>ef</code>. - <li>We assume that fusion will optimise sequences of array processing - combinators. - <li>FIXME: Do we want to have the following function? - <blockquote><pre> -mapFilterP :: (a -> Maybe b) -> [:a:] -> [:b:]</pre> - </blockquote> - <p> - Even with fusion, <code>(mapP (\p -> e) . filterP (\p -> - b))</code> may still result in redundant pattern matching - operations. (Let's wait with this until we have seen what the - Simplifier does to the generated code.) 
- </ul> - - <h2>Doing Away With Nested Arrays: The Flattening Transformation</h2> - <p> - On the quest towards an entirely unboxed representation of parallel - arrays, the flattening transformation is the essential ingredient. GHC - uses a <a - href="http://www.cse.unsw.edu.au/~chak/papers/CK00.html">substantially - improved version</a> of the transformation whose original form was - described by Blelloch & Sabot. The flattening transformation - replaces values of type <code>[:a:]</code> as well as functions - operating on these values by alternative, more efficient data structures - and functions. - <p> - The flattening machinery is activated by the option - <code>-fflatten</code>, which is a static hsc option. It consists of - two steps: function vectorisation and array specialisation. Only the - first of those is implemented so far. If selected, the transformation - is applied to a module in Core form immediately after the <a - href="../the-beast/desugar.html">desugarer,</a> before the <a - href="../the-beast/simplifier.html">Mighty Simplifier</a> gets to do its - job. After vectorisation, the Core program can be dumped using the - option <code>-ddump-vect</code>. There is a good reason for us to perform - flattening immediately after the desugarer. In <a - href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/main/HscMain.lhs"><code>HscMain</code></a><code>.hscRecomp</code> - the so-called <em>persistent compiler state</em> is maintained, which - contains all the information about imported interface files needed to - look up the details of imported names (e.g., during renaming and type - checking). However, all this information is zapped before the - simplifier is invoked (supposedly to reduce the space-consumption of - GHC). As flattening has to get at all kinds of identifiers from Prelude - modules, we need to do it before the relevant information in the - persistent compiler state is gone.
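The core idea of the flattening representation can be sketched in a few lines. This is a toy model with invented names (real flattened arrays are unboxed and carry richer indexing structure): a nested array becomes a flat data array plus a segment descriptor recording the length of each subarray.

```haskell
-- Toy model of the flattened representation of nested parallel arrays:
-- [:[:a:]:] becomes a flat data array plus a segment descriptor.
data SegArr a = SegArr
  { segLens  :: [Int]  -- segment descriptor: length of each subarray
  , flatData :: [a]    -- all elements, concatenated
  } deriving (Eq, Show)

flattenArr :: [[a]] -> SegArr a
flattenArr xss = SegArr (map length xss) (concat xss)

unflattenArr :: SegArr a -> [[a]]
unflattenArr (SegArr lens xs) = go lens xs
  where
    go []     _  = []
    go (n:ns) ys = let (seg, rest) = splitAt n ys
                   in  seg : go ns rest
```

The payoff of this shape is that a collective operation on the nested array can run over `flatData` in one pass, with the segment descriptor consulted only where segment boundaries matter.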
- - <p> - As flattening generally requires all libraries to be compiled for - flattening (just like profiling does), there is a <em>compiler way</em> - <code>"ndp"</code>, which can be selected using the way option code - <code>-ndp</code>. This option will automagically select - <code>-XParr</code> and <code>-fflatten</code>. - - <h4><code>FlattenMonad</code></h4> - <p> - The module <a - href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/ndpFlatten/FlattenMonad.lhs"><code>FlattenMonad</code></a> - implements the monad <code>Flatten</code> that is used during - vectorisation to keep track of various sets of bound variables and a - variable substitution map; moreover, it provides a supply of new uniques - and allows us to look up names in the persistent compiler state (i.e., - imported identifiers). - <p> - In order to be able to look up imported identifiers in the persistent - compiler state, it is important that these identifiers are included in - the free variable lists computed by the renamer. More precisely, all - names needed by flattening are included in the names produced by - <code>RnEnv.getImplicitModuleFVs</code>. To avoid putting - flattening-dependent lists of names into the renamer, the module - <code>FlattenInfo</code> exports <code>namesNeededForFlattening</code>. - - [FIXME: It might be worthwhile to document in the non-Flattening part of - the Commentary that the persistent compiler state is zapped after - desugaring and how the free variables determined by the renamer imply - which names are imported.]
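A rough sketch of the bookkeeping the `Flatten` monad performs — names and representation invented for illustration; the real monad also threads the persistent compiler state and proper `Unique`s rather than plain `Int`s and `String`s:

```haskell
import qualified Data.Map as M

-- Invented model of the Flatten monad's state: a unique supply plus a
-- variable substitution, threaded explicitly for clarity.
type FlattenState = (Int, M.Map String String)

-- Draw a fresh unique from the supply.
newUnique :: FlattenState -> (Int, FlattenState)
newUnique (u, subst) = (u, (u + 1, subst))

-- Record that occurrences of 'old' should be rewritten to 'new'.
bindSubst :: String -> String -> FlattenState -> FlattenState
bindSubst old new (u, subst) = (u, M.insert old new subst)

-- Apply the substitution to a variable (identity if unbound).
applySubst :: String -> FlattenState -> String
applySubst v (_, subst) = M.findWithDefault v v subst
```

Wrapping this state in a state-monad interface is what turns the explicit threading above into the `Flatten` monad proper.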
- - <p><small> -<!-- hhmts start --> -Last modified: Tue Feb 12 01:44:21 EST 2002 -<!-- hhmts end --> - </small> - </body> -</html> diff --git a/docs/comm/exts/th.html b/docs/comm/exts/th.html deleted file mode 100644 index 539245db74..0000000000 --- a/docs/comm/exts/th.html +++ /dev/null @@ -1,197 +0,0 @@ -<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN"> -<html> - <head> - <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1"> - <title>The GHC Commentary - Template Haskell</title> - </head> - - <body BGCOLOR="FFFFFF"> - <h1>The GHC Commentary - Template Haskell</h1> - <p> - The Template Haskell (TH) extension to GHC adds a meta-programming - facility in which all meta-level code is executed at compile time. The - design of this extension is detailed in "Template Meta-programming for - Haskell", Tim Sheard and Simon Peyton Jones, <a - href="http://portal.acm.org/toc.cfm?id=581690&type=proceeding&coll=portal&dl=ACM&part=series&WantType=proceedings&idx=unknown&title=unknown">ACM - SIGPLAN 2002 Haskell Workshop,</a> 2002. However, some of the details - changed after the paper was published. - </p> - - <h2>Meta Sugar</h2> - <p> - The extra syntax of TH (quasi-quote brackets, splices, and reification) - is handled in the module <a - href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/deSugar/DsMeta.hs"><code>DsMeta</code></a>. - In particular, the function <code>dsBracket</code> desugars the four - types of quasi-quote brackets (<code>[|...|]</code>, - <code>[p|...|]</code>, <code>[d|...|]</code>, and <code>[t|...|]</code>) - and <code>dsReify</code> desugars the three types of reification - operations (<code>reifyType</code>, <code>reifyDecl</code>, and - <code>reifyFixity</code>). 
- </p> - - <h3>Desugaring of Quasi-Quote Brackets</h3> - <p> - A term in quasi-quote brackets needs to be translated into Core code - that, when executed, yields a <em>representation</em> of that term in - the form of the abstract syntax trees defined in <a - href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/libraries/template-haskell/Language/Haskell/TH/Syntax.hs"><code>Language.Haskell.TH.Syntax</code></a>. - Within <code>DsMeta</code>, this is achieved by four functions - corresponding to the four types of quasi-quote brackets: - <code>repE</code> (for <code>[|...|]</code>), <code>repP</code> (for - <code>[p|...|]</code>), <code>repTy</code> (for <code>[t|...|]</code>), - and <code>repTopDs</code> (for <code>[d|...|]</code>). All four of - these functions receive as an argument the GHC-internal Haskell AST of - the syntactic form that they quote (i.e., arguments of type <a - href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/hsSyn/HsExpr.lhs"><code>HsExpr</code></a><code>.HsExpr - Name</code>, <a href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/hsSyn/HsPat.lhs"><code>HsPat</code></a><code>.HsPat Name</code>, - <a - href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/hsSyn/HsTypes.lhs"><code>HsType</code></a><code>.HsType - Name</code>, and <a - href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/hsSyn/HsDecls.lhs"><code>HsDecls</code></a><code>.HsGroup - Name</code>, respectively). - </p> - <p> - To increase the static type safety in <code>DsMeta</code>, the functions - constructing representations do not just return plain values of type <a - href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/coreSyn/CoreSyn.lhs"><code>CoreSyn</code></a> - <code>.CoreExpr</code>; instead, <code>DsMeta</code> introduces a - parametrised type <code>Core</code> whose dummy type parameter indicates - the source-level type of the value computed by the corresponding Core - expression. 
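The phantom-type trick just described can be illustrated with a toy model. The `CoreExpr` below is a cut-down stand-in for GHC's real Core AST, and the smart constructors are invented for the example:

```haskell
-- Cut-down stand-in for GHC's Core AST.
data CoreExpr = CVar String | CApp CoreExpr CoreExpr
  deriving (Eq, Show)

-- The phantom parameter 'a' records the source-level type of the value
-- the wrapped Core expression computes; no value of type 'a' exists.
newtype Core a = MkC CoreExpr

coreVar :: String -> Core a
coreVar = MkC . CVar

-- The phantom index lets the type checker reject ill-typed combinations:
-- only a Core (a -> b) can be applied to a Core a.
appE :: Core (a -> b) -> Core a -> Core b
appE (MkC f) (MkC x) = MkC (CApp f x)

unC :: Core a -> CoreExpr
unC (MkC e) = e
```

At runtime the wrapper is free (it is a `newtype`); the index exists only to constrain how fragments are combined, which is exactly the service the dummy parameter provides in `DsMeta`.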
All construction of Core fragments in <code>DsMeta</code> - is performed by smart constructors whose type signatures use the dummy - type parameter to constrain the contexts in which they are applicable. - For example, a function that builds a Core expression that evaluates to - a TH type representation, which has type - <code>Language.Haskell.TH.Syntax.Type</code>, would return a value of - type - </p> - <blockquote> - <pre> -Core Language.Haskell.TH.Syntax.Type</pre> - </blockquote> - - <h3>Desugaring of Reification Operators</h3> - <p> - The TH paper introduces four reification operators: - <code>reifyType</code>, <code>reifyDecl</code>, - <code>reifyFixity</code>, and <code>reifyLocn</code>. Of these, - currently (= 9 Nov 2002), only the first two are implemented. - </p> - <p> - The operator <code>reifyType</code> receives the name of a function or - data constructor as its argument and yields a representation of this - entity's type in the form of a value of type - <code>TH.Syntax.Type</code>. Similarly, <code>reifyDecl</code> receives - the name of a type and yields a representation of the type's declaration - as a value of type <code>TH.Syntax.Decl</code>. The name of the reified - entity is mapped to the GHC-internal representation of the entity by - using the function <code>lookupOcc</code> on the name. - </p> - - <h3>Representing Binding Forms</h3> - <p> - Care needs to be taken when constructing TH representations of Haskell - terms that include binding forms, such as lambda abstractions or let - bindings. To avoid name clashes, fresh names need to be generated for - all defined identifiers. This is achieved via the routine - <code>DsMeta.mkGenSym</code>, which, given a <code>Name</code>, produces - a <code>Name</code> / <code>Id</code> pair (of type - <code>GenSymBind</code>) that associates the given <code>Name</code> - with a Core identifier that at runtime will be bound to a string that - contains the fresh name.
Notice the two-level nature of this - arrangement. It is necessary, as the Core code that constructs the - Haskell term representation may be executed multiple times at runtime - and it must be ensured that different names are generated in each run. - </p> - <p> - Such fresh bindings need to be entered into the meta environment (of - type <a - href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/deSugar/DsMonad.lhs"><code>DsMonad</code></a><code>.DsMetaEnv</code>), - which is part of the state (of type <code>DsMonad.DsEnv</code>) - maintained in the desugarer monad (of type <code>DsMonad.DsM</code>). - This is done using the function <code>DsMeta.addBinds</code>, which - extends the current environment by a list of <code>GenSymBind</code>s - and executes a subcomputation in this extended environment. Names can - be looked up in the meta environment by way of the functions - <code>DsMeta.lookupOcc</code> and <code>DsMeta.lookupBinder</code>; more - details about the difference between these two functions can be found in - the next subsection. - </p> - <p> - NB: <code>DsMeta</code> uses <code>mkGenSym</code> only when - representing terms that may be embedded into a context where names can - be shadowed. For example, a lambda abstraction embedded into an - expression can potentially shadow names defined in the context it is - being embedded into. In contrast, this can never be the case for - top-level declarations, such as data type declarations; hence, the type - variables that a parametric data type declaration abstracts over are not - being gensym'ed. As a result, variables in defining positions are - handled differently depending on the syntactic construct in which they - appear. - </p> - - <h3>Binders Versus Occurrences</h3> - <p> - Name lookups in the meta environment of the desugarer use two functions - with slightly different behaviour, namely <code>DsMeta.lookupOcc</code> - and <code>lookupBinder</code>.
The module <code>DsMeta</code> contains - the following explanation as to the difference between these functions: - </p> - <blockquote> - <pre> -When we desugar [d| data T = MkT |] -we want to get - Data "T" [] [Con "MkT" []] [] -and *not* - Data "Foo:T" [] [Con "Foo:MkT" []] [] -That is, the new data decl should fit into whatever new module it is -asked to fit in. We do *not* clone, though; no need for this: - Data "T79" .... - -But if we see this: - data T = MkT - foo = reifyDecl T - -then we must desugar to - foo = Data "Foo:T" [] [Con "Foo:MkT" []] [] - -So in repTopDs we bring the binders into scope with mkGenSyms and addBinds, -but in dsReify we do not. And we use lookupOcc, rather than lookupBinder -in repTyClD and repC.</pre> - </blockquote> - <p> - This implies that <code>lookupOcc</code>, when it does not find the name - in the meta environment, uses the function <code>DsMeta.globalVar</code> - to construct the <em>original name</em> of the entity (cf. the TH paper - for more details regarding original names). This name uniquely - identifies the entity in the whole program and is in scope - <em>independent</em> of whether the user name of the same entity is in - scope or not (i.e., it may be defined in a different module without - being explicitly imported) and has the form <module>:<name>. - <strong>NB:</strong> Incidentally, the current implementation of this - mechanism facilitates breaking any abstraction barrier. - </p> - - <h3>Known-key Names for Template Haskell</h3> - <p> - During the construction of representations, the desugarer needs to use a - large number of functions defined in the library - <code>Language.Haskell.TH.Syntax</code>.
The names of these functions - need to be made available to the compiler in the way outlined in <a - href="../the-beast/prelude.html">Primitives and the Prelude.</a> - Unfortunately, any change to <a - href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/prelude/PrelNames.lhs"><code>PrelNames</code></a> - triggers a significant amount of recompilation. Hence, the names needed - for TH are defined in <code>DsMeta</code> instead (at the end of the - module). All library functions needed by TH are contained in the name - set <code>DsMeta.templateHaskellNames</code>. - </p> - - <p><small> -<!-- hhmts start --> -Last modified: Wed Nov 13 18:01:48 EST 2002 -<!-- hhmts end --> - </small> - </body> -</html> diff --git a/docs/comm/feedback.html b/docs/comm/feedback.html deleted file mode 100644 index 1da8b10f29..0000000000 --- a/docs/comm/feedback.html +++ /dev/null @@ -1,34 +0,0 @@ -<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN"> -<html> - <head> - <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1"> - <title>The GHC Commentary - Feedback</title> - </head> - - <body BGCOLOR="FFFFFF"> - <h1>Feedback</h1> - <p> - <a href="mailto:chak@cse.unsw.edu.au">I</a> welcome any feedback on the - material and in particular would appreciate comments on which parts of - the document are incomprehensible or lack explanation -- e.g., due to - the use of GHC speak that is explained nowhere (words like infotable or - so). Moreover, I would be interested to know which areas of GHC you - would like to see covered here. - <p> - For the moment it is probably best if feedback is directed to - <p> - <blockquote> - <a - href="mailto:chak@cse.unsw.edu.au"><code>chak@cse.unsw.edu.au</code></a> - </blockquote> - <p> - However, if there is sufficient interest, we might consider setting up a - mailing list.
- - <p><small> -<!-- hhmts start --> -Last modified: Wed Aug 8 00:11:42 EST 2001 -<!-- hhmts end --> - </small> - </body> -</html> diff --git a/docs/comm/genesis/genesis.html b/docs/comm/genesis/genesis.html deleted file mode 100644 index 2ccdf5353a..0000000000 --- a/docs/comm/genesis/genesis.html +++ /dev/null @@ -1,82 +0,0 @@ -<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN"> -<html> - <head> - <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1"> - <title>The GHC Commentary - Outline of the Genesis</title> - </head> - - <body BGCOLOR="FFFFFF"> - <h1>The GHC Commentary - Outline of the Genesis</h1> - <p> - Building GHC happens in two stages: First you have to prepare the tree - with <code>make boot</code>; and second, you build the compiler and - associated libraries with <code>make all</code>. The <code>boot</code> - stage builds some tools used during the main build process, generates - parsers and other pre-computed source, and finally computes dependency - information. There is considerable detail on the build process in GHC's - <a - href="http://ghc.haskell.org/trac/ghc/wiki/Building">Building Guide.</a> - - <h4>Debugging the Beast</h4> - <p> - If you are hacking the compiler or like to play with unstable - development versions, chances are that the compiler someday just crashes - on you. Then, it is a good idea to load the <code>core</code> into - <code>gdb</code> as usual, but unfortunately there is usually not too - much useful information. - <p> - The next step, then, is somewhat tedious. You should build a compiler - producing programs with a runtime system that has debugging turned on - and use that to build the crashing compiler. 
There are many sanity - checks in the RTS, which may detect inconsistencies before they lead to a - crash, and you may include more debugging information, which helps - <code>gdb</code>. For an RTS with debugging turned on, add the following - to <code>build.mk</code> (see also the comment in - <a - href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/mk/config.mk.in"><code>config.mk.in</code></a> that you find when searching for - <code>GhcRtsHcOpts</code>): -<blockquote><pre> -GhcRtsHcOpts+=-optc-DDEBUG -GhcRtsCcOpts+=-g -EXTRA_LD_OPTS=-lbfd -liberty</pre></blockquote> - <p> - Then go into <code>fptools/ghc/rts</code> and <code>make clean boot && - make all</code>. With the resulting runtime system, you have to re-link - the compiler. Go into <code>fptools/ghc/compiler</code>, delete the - file <code>hsc</code> (up to version 4.08) or - <code>ghc-<version></code>, and execute <code>make all</code>. - <p> - The <code>EXTRA_LD_OPTS</code> are necessary as some of the debugging - code uses the BFD library, which in turn requires <code>liberty</code>. - I would also recommend (in 4.11 and from 5.0 upwards) adding these linker - options to the files <code>package.conf</code> and - <code>package.conf.inplace</code> in the directory - <code>fptools/ghc/driver/</code> to the <code>extra_ld_opts</code> entry - of the package <code>RTS</code>. Otherwise, you have to supply them - whenever you compile and link a program with a compiler that uses the - debugging RTS for the programs it produces. - <p> - To run GHC up to version 4.08 in <code>gdb</code>, first invoke the - compiler as usual, but pass it the option <code>-v</code>. This will - show you the exact invocation of the compiler proper <code>hsc</code>. - Run <code>hsc</code> with these options in <code>gdb</code>. The - development version 4.11 and stable releases from 5.0 on no longer - use the Perl driver; so you can run them directly with gdb.
- <p> - <strong>Debugging a compiler during building from HC files.</strong> - If you are bootstrapping the compiler on a new platform from HC files and - it crashes somewhere during the build (e.g., when compiling the - libraries), do as explained above, but you may have to re-configure the - build system with <code>--enable-hc-boot</code> before re-making the - code in <code>fptools/ghc/driver/</code>. - If you do this with a compiler up to version 4.08, run the build process - with <code>make EXTRA_HC_OPTS=-v</code> to get the exact arguments with - which you have to invoke <code>hsc</code> in <code>gdb</code>. - - <p><small> -<!-- hhmts start --> -Last modified: Sun Apr 24 22:16:30 CEST 2005 -<!-- hhmts end --> - </small> - </body> -</html> diff --git a/docs/comm/genesis/makefiles.html b/docs/comm/genesis/makefiles.html deleted file mode 100644 index 7f01fb53ac..0000000000 --- a/docs/comm/genesis/makefiles.html +++ /dev/null @@ -1,51 +0,0 @@ -<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN"> -<html> - <head> - <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1"> - <title>The GHC Commentary - Mindboggling Makefiles</title> - </head> - - <body BGCOLOR="FFFFFF"> - <h1>The GHC Commentary - Mindboggling Makefiles</h1> - <p> - The size and structure of GHC's makefiles make it quite easy to scream - out loud - in pain - during the process of tracking down problems in the - make system or when attempting to alter it.
GHC's <a - href="http://ghc.haskell.org/trac/ghc/wiki/Building">Building - Guide</a> has valuable information on <a - href="http://ghc.haskell.org/trac/ghc/wiki/Building/BuildSystem">the - makefile architecture.</a> - - <h4>A maze of twisty little passages, all alike</h4> - <p> - The <code>fptools/</code> toplevel and the various project directories - contain not only a <code>Makefile</code> each, but there are - subdirectories of name <code>mk/</code> at various levels that contain - rules, targets, and so on specific to a project - or, in the case of the - toplevel, the default rules for the whole system. Each <code>mk/</code> - directory contains a file <code>boilerplate.mk</code> that ties the - various other makefiles together. Files called <code>target.mk</code>, - <code>paths.mk</code>, and <code>suffix.mk</code> contain make targets, - definitions of variables containing paths, and suffix rules, - respectively. - <p> - One particularly nasty trick used in this hierarchy of makefiles is the - way in which the variable <code>$(TOP)</code> is used. AFAIK, - <code>$(TOP)</code> always points to a directory containing an - <code>mk/</code> subdirectory; however, it does not necessarily point to the - toplevel <code>fptools/</code> directory. For example, within the GHC - subtree, <code>$(TOP)</code> points to <code>fptools/ghc/</code>. - However, some of the makefiles in <code>fptools/ghc/mk/</code> will then - <em>temporarily</em> redefine <code>$(TOP)</code> to point a level - higher (i.e., to <code>fptools/</code>) while they are including the - toplevel boilerplate. After that <code>$(TOP)</code> is redefined to - whatever value it had before including makefiles from higher up in the - hierarchy.
- - <p><small> -<!-- hhmts start --> -Last modified: Wed Aug 22 16:46:33 GMT Daylight Time 2001 -<!-- hhmts end --> - </small> - </body> -</html> diff --git a/docs/comm/genesis/modules.html b/docs/comm/genesis/modules.html deleted file mode 100644 index 10cd7a8490..0000000000 --- a/docs/comm/genesis/modules.html +++ /dev/null @@ -1,164 +0,0 @@ -<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN"> -<html> - <head> - <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1"> - <title>The GHC Commentary - The Marvellous Module Structure of GHC </title> - </head> - - <body BGCOLOR="FFFFFF"> - <h1>The GHC Commentary - The Marvellous Module Structure of GHC </h1> - <p> - -GHC is built out of about 245 Haskell modules. It can be quite tricky -to figure out what the module dependency graph looks like. It can be -important, too, because loops in the module dependency graph need to -be broken carefully using <tt>.hi-boot</tt> interface files. -<p> -This section of the commentary documents the subtlest part of -the module dependency graph, namely the part near the bottom. -<ul> -<li> The list is given in compilation order: that is, -modules near the top are more primitive, and are compiled earlier. -<li> Each module is listed together with its most critical -dependencies in parentheses; that is, the dependencies that prevent it being -compiled earlier. -<li> Modules in the same bullet don't depend on each other. -<li> Loops are documented by a dependency such as "<tt>loop Type.Type</tt>". -This means that the module imports <tt>Type.Type</tt>, but module <tt>Type</tt> -has not yet been compiled, so the import comes from <tt>Type.hi-boot</tt>.
-</ul> - -Compilation order is as follows: -<ul> -<li> -<strong>First comes a layer of modules that have few interdependencies, -and which implement very basic data types</strong>: -<tt> <ul> -<li> Util -<li> OccName -<li> Pretty -<li> Outputable -<li> StringBuffer -<li> ListSetOps -<li> Maybes -<li> etc -</ul> </tt> - -<p> -<li> <strong> Now comes the main subtle layer, involving types, classes, type constructors -identifiers, expressions, rules, and their operations.</strong> - -<tt> -<ul> -<li> Name <br> PrimRep -<p><li> - PrelNames -<p><li> - Var (Name, loop IdInfo.IdInfo, - loop Type.Type, loop Type.Kind) -<p><li> - VarEnv, VarSet, ThinAir -<p><li> - Class (loop TyCon.TyCon, loop Type.Type) -<p><li> - TyCon (loop Type.Type, loop Type.Kind, loop DataCon.DataCon, loop Generics.GenInfo) -<p><li> - TypeRep (loop DataCon.DataCon, loop Subst.substTyWith) -<p><li> - Type (loop PprType.pprType, loop Subst.substTyWith) -<p><li> - FieldLabel(Type) <br> - TysPrim(Type) <br> -<p><li> - Literal (TysPrim, PprType) <br> - DataCon (loop PprType, loop Subst.substTyWith, FieldLabel.FieldLabel) -<p><li> - TysWiredIn (loop MkId.mkDataConIds) -<p><li> - TcType( lots of TysWiredIn stuff) -<p><li> - PprType( lots of TcType stuff ) -<p><li> - PrimOp (PprType, TysWiredIn) -<p><li> - CoreSyn [does not import Id] -<p><li> - IdInfo (CoreSyn.Unfolding, CoreSyn.CoreRules) -<p><li> - Id (lots from IdInfo) -<p><li> - CoreFVs <br> - PprCore -<p><li> - CoreUtils (PprCore.pprCoreExpr, CoreFVs.exprFreeVars, - CoreSyn.isEvaldUnfolding CoreSyn.maybeUnfoldingTemplate) -<p><li> - CoreLint( CoreUtils ) <br> - OccurAnal (CoreUtils.exprIsTrivial) <br> - CoreTidy (CoreUtils.exprArity ) <br> -<p><li> - CoreUnfold (OccurAnal.occurAnalyseGlobalExpr) -<p><li> - Subst (CoreUnfold.Unfolding, CoreFVs) <br> - Generics (CoreUnfold.mkTopUnfolding) <br> - Rules (CoreUnfold.Unfolding, PprCore.pprTidyIdRules) -<p><li> - MkId (CoreUnfold.mkUnfolding, Subst, Rules.addRule) -<p><li> - PrelInfo (MkId) <br> - HscTypes( 
Rules.RuleBase ) -</ul></tt> - -<p><li> <strong>That is the end of the infrastructure. Now we get the - main layer of modules that perform useful work.</strong> - -<tt><ul> -<p><li> - CoreTidy (HscTypes.PersistentCompilerState) -</ul></tt> -</ul> - -HsSyn stuff -<ul> -<li> HsPat.hs-boot -<li> HsExpr.hs-boot (loop HsPat.LPat) -<li> HsTypes (loop HsExpr.HsSplice) -<li> HsBinds (HsTypes.LHsType, loop HsPat.LPat, HsExpr.pprFunBind and others) - HsLit (HsTypes.SyntaxName) -<li> HsPat (HsBinds, HsLit) - HsDecls (HsBinds) -<li> HsExpr (HsDecls, HsPat) -</ul> - - - -<h2>Library stuff: base package</h2> - -<ul> -<li> GHC.Base -<li> Data.Tuple (GHC.Base), GHC.Ptr (GHC.Base) -<li> GHC.Enum (Data.Tuple) -<li> GHC.Show (GHC.Enum) -<li> GHC.Num (GHC.Show) -<li> GHC.ST (GHC.Num), GHC.Real (GHC.Num) -<li> GHC.Arr (GHC.ST) GHC.STRef (GHC.ST) -<li> GHC.IOBase (GHC.Arr) -<li> Data.Bits (GHC.Real) -<li> Data.HashTable (Data.Bits, Control.Monad) -<li> Data.Typeable (GHC.IOBase, Data.HashTable) -<li> GHC.Weak (Data.Typeable, GHC.IOBase) -</ul> - - - <p><small> -<!-- hhmts start --> -Last modified: Wed Aug 22 16:46:33 GMT Daylight Time 2001 -<!-- hhmts end --> - </small> - </body> -</html> - - - - - diff --git a/docs/comm/index.html b/docs/comm/index.html deleted file mode 100644 index 64b9d81ff1..0000000000 --- a/docs/comm/index.html +++ /dev/null @@ -1,121 +0,0 @@ -<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN"> -<html> - <head> - <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1"> - <title>The GHC Commentary - The Beast Explained</title> - </head> - - <body BGCOLOR="FFFFFF"> - <h1>The Glasgow Haskell Compiler (GHC) Commentary [v0.17]</h1> - <p> - <!-- Contributors: Whoever makes substantial additions or changes to the - document, please add your name and keep the order alphabetic. Moreover, - please bump the version number for any substantial modification that you - check into CVS. - --> - <strong>Manuel M. T. 
Chakravarty</strong><br> - <strong>Sigbjorn Finne</strong><br> - <strong>Simon Marlow</strong><br> - <strong>Simon Peyton Jones</strong><br> - <strong>Julian Seward</strong><br> - <strong>Reuben Thomas</strong><br> - <br> - <p> - This document started as a collection of notes describing what <a - href="mailto:chak@cse.unsw.edu.au">I</a> learnt when poking around in - the <a href="http://haskell.org/ghc/">GHC</a> sources. During the - <i>Haskell Implementers Workshop</i> in January 2001, it was decided to - put the commentary into - <a href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/">GHC's CVS - repository</a> - to allow the whole developer community to add their wizardly insight to - the document. - <p> - <strong>The document is still far from being complete - help it - grow!</strong> - - <h2>Before the Show Begins</h2> - <p> - <ul> - <li><a href="feedback.html">Feedback</a> - <li><a href="others.html">Other Sources of Wisdom</a> - </ul> - - <h2>Genesis</h2> - <p> - <ul> - <li><a href="genesis/genesis.html">Outline of the Genesis</a> - <li><a href="genesis/makefiles.html">Mindboggling Makefiles</a> - <li><a href="genesis/modules.html">GHC's Marvellous Module Structure</a> - </ul> - - <h2>The Beast Dissected</h2> - <p> - <ul> - <li><a href="the-beast/coding-style.html">Coding style used in - the compiler</a> - <li><a href="the-beast/driver.html">The Glorious Driver</a> - <li><a href="the-beast/prelude.html">Primitives and the Prelude</a> - <li><a href="the-beast/syntax.html">Just Syntax</a> - <li><a href="the-beast/basicTypes.html">The Basics</a> - <li><a href="the-beast/modules.html">Modules, ModuleNames and - Packages</a> - <li><a href="the-beast/names.html">The truth about names: Names and OccNames</a> - <li><a href="the-beast/vars.html">The Real Story about Variables, Ids, - TyVars, and the like</a> - <li><a href="the-beast/data-types.html">Data types and constructors</a> - <li><a href="the-beast/renamer.html">The Glorious Renamer</a> - <li><a 
href="the-beast/types.html">Hybrid Types</a> - <li><a href="the-beast/typecheck.html">Checking Types</a> - <li><a href="the-beast/desugar.html">Sugar Free: From Haskell To Core</a> - <li><a href="the-beast/simplifier.html">The Mighty Simplifier</a> - <li><a href="the-beast/mangler.html">The Evil Mangler</a> - <li><a href="the-beast/alien.html">Alien Functions</a> - <li><a href="the-beast/stg.html">You Got Control: The STG-language</a> - <li><a href="the-beast/ncg.html">The Native Code Generator</a> - <li><a href="the-beast/ghci.html">GHCi</a> - <li><a href="the-beast/fexport.html">Implementation of - <code>foreign export</code></a> - <li><a href="the-beast/main.html">Compiling and running the Main module</a> - </ul> - - <h2>RTS & Libraries</h2> - <p> - <ul> - <li><a href="http://ghc.haskell.org/trac/ghc/wiki/Commentary/Rts/Conventions">Coding Style Guidelines</a> - <li><a href="rts-libs/stgc.html">Spineless Tagless C</a> - <li><a href="rts-libs/primitives.html">Primitives</a> - <li><a href="rts-libs/prelfound.html">Prelude Foundations</a> - <li><a href="rts-libs/prelude.html">Cunning Prelude Code</a> - <li><a href="rts-libs/foreignptr.html">On why we have <tt>ForeignPtr</tt></a> - <li><a href="rts-libs/non-blocking.html">Non-blocking I/O for Win32</a> - <li><a href="rts-libs/multi-thread.html">Supporting multi-threaded interoperation</a> - </ul> - - <h2>Extensions, or Making a Complicated System More Complicated</h2> - <p> - <ul> - <li><a href="exts/th.html">Template Haskell</a> - <li><a href="exts/ndp.html">Parallel Arrays</a> - </ul> - - <h2>The Source</h2> - <p> - The online master copy of the Commentary is at - <blockquote> - <a href="http://www.cse.unsw.edu.au/~chak/haskell/ghc/comm/">http://www.cse.unsw.edu.au/~chak/haskell/ghc/comm/</a> - </blockquote> - <p> - This online version is updated - <a - href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/docs/comm/">from - CVS</a> - daily.
- - <p><small> -<!-- hhmts start --> -Last modified: Thu May 12 19:03:42 EST 2005 -<!-- hhmts end --> - </small> - </body> -</html> diff --git a/docs/comm/others.html b/docs/comm/others.html deleted file mode 100644 index 52d87e9419..0000000000 --- a/docs/comm/others.html +++ /dev/null @@ -1,60 +0,0 @@ -<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN"> -<html> - <head> - <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1"> - <title>The GHC Commentary - Other Sources of Wisdom</title> - </head> - - <body BGCOLOR="FFFFFF"> - <h1>Other Sources of Wisdom</h1> - <p> - Believe it or not, there are other people besides you who are - masochistic enough to study the innards of the beast. Some of them have - been kind (or cruel?) enough to share their insights with us. Here is a - probably incomplete list: - <p> - <ul> - - <li>The <a - href="http://www.cee.hw.ac.uk/~dsg/gph/docs/StgSurvival.ps.gz">STG - Survival Sheet</a> has -- according to its header -- been written by - `a poor wee soul',<sup><a href="#footnote1">1</a></sup> which - probably has been pushed into the torments of madness by the very - act of contemplating the inner workings of the STG runtime system. - This document discusses GHC's runtime system with a focus on - support for parallel processing (aka GUM). - - <li>Instructions on <a - href="http://www-users.cs.york.ac.uk/~olaf/PUBLICATIONS/extendGHC.html">Adding - an Optimisation Pass to the Glasgow Haskell Compiler</a> - have been compiled by <a - href="http://www-users.cs.york.ac.uk/~olaf/">Olaf Chitil</a>. - Unfortunately, this document is already a little aged. - - <li><a href="http://www.cs.pdx.edu/~apt/">Andrew Tolmach</a> has defined - <a href="http://www.haskell.org/ghc/docs/papers/core.ps.gz">an external - representation of - GHC's <em>Core</em> language</a> and also implemented a GHC pass - that emits the intermediate form into <code>.hcr</code> files.
The - option <code>-fext-core</code> triggers GHC to emit Core code after - optimisation; in addition, <code>-fno-code</code> is often used to - stop compilation after Core has been emitted. - - <!-- Add references to other background texts listed on the GHC docu - page - --> - - </ul> - - <p><hr><p> - <sup><a name="footnote1">1</a></sup>Usually reliable sources have it that - the poor soul in question is no one less than GUM hardcore hacker <a - href="http://www.cee.hw.ac.uk/~hwloidl/">Hans-Wolfgang Loidl</a>. - - <p><small> -<!-- hhmts start --> -Last modified: Tue Nov 13 10:56:57 EST 2001 -<!-- hhmts end --> - </small> - </body> -</html> diff --git a/docs/comm/rts-libs/foreignptr.html b/docs/comm/rts-libs/foreignptr.html deleted file mode 100644 index febe9fe422..0000000000 --- a/docs/comm/rts-libs/foreignptr.html +++ /dev/null @@ -1,68 +0,0 @@ -<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN"> -<html> - <head> - <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1"> - <title>The GHC Commentary - why we have <tt>ForeignPtr</tt></title> - </head> - - <body BGCOLOR="FFFFFF"> - - <h1>On why we have <tt>ForeignPtr</tt></h1> - - <p>Unfortunately it isn't possible to add a finalizer to a normal - <tt>Ptr a</tt>. We already have a generic finalization mechanism: - see the Weak module in package lang. But the only reliable way to - use finalizers is to attach one to an atomic heap object - that - way the compiler's optimiser can't interfere with the lifetime of - the object. - - <p>The <tt>Ptr</tt> type is really just a boxed address - it's - defined like - - <pre> -data Ptr a = Ptr Addr# -</pre> - - <p>where <tt>Addr#</tt> is an unboxed native address (just a 32- - or 64- bit word). Putting a finalizer on a <tt>Ptr</tt> is - dangerous, because the compiler's optimiser might remove the box - altogether. 
- - <p><tt>ForeignPtr</tt> is defined like this - - <pre> -data ForeignPtr a = ForeignPtr ForeignObj# -</pre> - - <p>where <tt>ForeignObj#</tt> is a *boxed* address, it corresponds - to a real heap object. The heap object is primitive from the - point of view of the compiler - it can't be optimised away. So it - works to attach a finalizer to the <tt>ForeignObj#</tt> (but not - to the <tt>ForeignPtr</tt>!). - - <p>There are several primitive objects to which we can attach - finalizers: <tt>MVar#</tt>, <tt>MutVar#</tt>, <tt>ByteArray#</tt>, - etc. We have special functions for some of these: eg. - <tt>MVar.addMVarFinalizer</tt>. - - <p>So a nicer interface might be something like - -<pre> -class Finalizable a where - addFinalizer :: a -> IO () -> IO () - -instance Finalizable (ForeignPtr a) where ... -instance Finalizable (MVar a) where ... -</pre> - - <p>So you might ask why we don't just get rid of <tt>Ptr</tt> and - rename <tt>ForeignPtr</tt> to <tt>Ptr</tt>. The reason for that - is just efficiency, I think. 
- - <p><small> -<!-- hhmts start --> -Last modified: Wed Sep 26 09:49:37 BST 2001 -<!-- hhmts end --> - </small> - </body> -</html> diff --git a/docs/comm/rts-libs/multi-thread.html b/docs/comm/rts-libs/multi-thread.html deleted file mode 100644 index 67a544be85..0000000000 --- a/docs/comm/rts-libs/multi-thread.html +++ /dev/null @@ -1,445 +0,0 @@ -<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN"> -<html> -<head> - <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1"> -<title>The GHC Commentary - Supporting multi-threaded interoperation</title> -</head> -<body> -<h1>The GHC Commentary - Supporting multi-threaded interoperation</h1> -<em> -<p> -Authors: sof@galois.com, simonmar@microsoft.com<br> -Date: April 2002 -</p> -</em> -<p> -This document presents the implementation of an extension to -Concurrent Haskell that provides two enhancements: -</p> -<ul> -<li>A Concurrent Haskell thread may call an external (e.g., C) -function in a manner that's transparent to the execution/evaluation of -other Haskell threads. Section <a href="#callout">"Calling out"</a> covers this. -</li> -<li> -OS threads may safely call Haskell functions concurrently. Section -<a href="#callin">"Calling in"</a> covers this. -</li> -</ul> - -<!---- *************************************** -----> -<h2 id="callout">The problem: foreign calls that block</h2> -<p> -When a Concurrent Haskell (CH) thread calls a 'foreign import'ed -function, the runtime system (RTS) has to handle this in a manner -transparent to other CH threads. That is, they shouldn't be blocked -from making progress while the CH thread executes the external -call. Presently, all threads will block. -</p> -<p> -Clearly, we have to rely on OS-level threads in order to support this -kind of concurrency. The implementation described here defines the -(abstract) OS threads interface that the RTS assumes.
The implementation -currently provides two instances of this interface, one for POSIX -threads (pthreads) and one for Win32 threads. -</p> - -<!---- *************************************** -----> -<h3>Multi-threading the RTS</h3> - -<p> -A simple and efficient way to implement non-blocking foreign calls is like this: -<ul> -<li> Invariant: only one OS thread is allowed to -execute code inside of the GHC runtime system. [There are alternate -designs, but I won't go into details on their pros and cons here.] -We'll call the OS thread that is currently running Haskell threads -the <em>Current Haskell Worker Thread</em>. -<p> -The Current Haskell Worker Thread repeatedly grabs a Haskell thread, executes it until its -time-slice expires or it blocks on an MVar, then grabs another, and executes -that, and so on. -</p> -<li> -<p> -When the Current Haskell Worker comes to execute a potentially blocking 'foreign -import', it leaves the RTS and ceases being the Current Haskell Worker, but before doing so it makes certain that -another OS worker thread is available to become the Current Haskell Worker. -Consequently, even if the external call blocks, the new Current Haskell Worker -continues execution of the other Concurrent Haskell threads. -When the external call eventually completes, the Concurrent Haskell -thread that made the call is passed the result and made runnable -again. -</p> -<p> -<li> -A pool of OS threads is constantly trying to become the Current Haskell Worker. -Only one succeeds at any moment. If the pool becomes empty, the RTS creates more workers. -<p><li> -The OS worker threads are regarded as interchangeable. A given Haskell thread -may, during its lifetime, be executed entirely by one OS worker thread, or by more than one. -There's just no way to tell. - -<p><li>If a foreign program wants to call a Haskell function, there is always a thread switch involved.
-The foreign program uses thread-safe mechanisms to create a Haskell thread and make it runnable; and -the current Haskell Worker Thread executes it. See Section <a href="#callin">Calling in</a>. -</ul> -<p> -The rest of this section describes the mechanics of implementing all -this. There are two parts to it, one that describes how a native (OS) thread -leaves the RTS to service the external call, the other how the same -thread handles returning the result of the external call back to the -Haskell thread. -</p> - -<!---- *************************************** -----> -<h3>Making the external call</h3> - -<p> -Presently, GHC handles 'safe' C calls by effectively emitting the -following code sequence: -</p> - -<pre> - ...save thread state... - t = suspendThread(); - r = foo(arg1,...,argn); - resumeThread(t); - ...restore thread state... - return r; -</pre> - -<p> -After having squirreled away the state of a Haskell thread, -<tt>Schedule.c:suspendThread()</tt> is called which puts the current -thread on a list [<tt>Schedule.c:suspended_ccalling_threads</tt>] -containing threads that are currently blocked waiting for external calls -to complete (this is done for the purposes of finding roots when -garbage collecting). -</p> - -<p> -In addition to putting the Haskell thread on -<tt>suspended_ccalling_threads</tt>, <tt>suspendThread()</tt> now also -does the following: -</p> -<ul> -<li>Instructs the <em>Task Manager</em> to make sure that there's -another native thread waiting in the wings to take over the execution -of Haskell threads. This might entail creating a new -<em>worker thread</em> or re-using one that's currently waiting for -more work to do. The <a href="#taskman">Task Manager</a> section -presents the functionality provided by this subsystem. -</li> - -<li>Releases its capability to execute within the RTS. By doing -so, another worker thread will become unblocked and start executing -code within the RTS.
See the <a href="#capability">Capability</a> -section for details. -</li> - -<li><tt>suspendThread()</tt> returns a token which is used to -identify the Haskell thread that was added to -<tt>suspended_ccalling_threads</tt>. This is done so that once the -external call has completed, we know what Haskell thread to pull off -the <tt>suspended_ccalling_threads</tt> list. -</li> -</ul> - -<p> -Upon return from <tt>suspendThread()</tt>, the OS thread is free of -its RTS executing responsibility, and can now invoke the external -call. Meanwhile, the other worker thread that has now gained access -to the RTS will continue executing Concurrent Haskell code. Concurrent -'stuff' is happening! -</p> - -<!---- *************************************** -----> -<h3>Returning the external result</h3> - -<p> -When the native thread eventually returns from the external call, -the result needs to be communicated back to the Haskell thread that -issued the external call. The following steps take care of this: -</p> - -<ul> -<li>The returning OS thread calls <tt>Schedule.c:resumeThread()</tt>, -passing along the token referring to the Haskell thread that made the -call we're returning from. -</li> - -<li> -The OS thread then tries to grab hold of a <em>returning worker -capability</em>, via <tt>Capability.c:grabReturnCapability()</tt>. -Until granted, the thread blocks waiting for RTS permissions. Clearly we -don't want the thread to be blocked longer than it has to, so whenever -a thread that is executing within the RTS enters the Scheduler (which -is quite often, e.g., when a Haskell thread context switch is made), -it checks to see whether it can give up its RTS capability to a -returning worker, which is done by calling -<tt>Capability.c:yieldToReturningWorker()</tt>.
-</li> - -<li> -If a returning worker is waiting (the code in <tt>Capability.c</tt> -keeps a counter of the number of returning workers that are currently -blocked waiting), it is woken up and given the RTS execution -privileges/capabilities of the worker thread that gave up its capability. -</li> - -<li> -The thread that gave up its capability then tries to re-acquire -the capability to execute RTS code; this is done by calling -<tt>Capability.c:waitForWorkCapability()</tt>. -</li> - -<li> -The returning worker that was woken up will continue execution in -<tt>resumeThread()</tt>, removing its associated Haskell thread -from the <tt>suspended_ccalling_threads</tt> list and starting to evaluate -that thread, passing it the result of the external call. -</li> -</ul> - -<!---- *************************************** -----> -<h3 id="rts-exec">RTS execution</h3> - -<p> -If a worker thread inside the RTS runs out of runnable Haskell -threads, it goes to sleep waiting for the external calls to complete. -It does this by calling <tt>waitForWorkCapability</tt>. -</p> - -<p> -The availability of new runnable Haskell threads is signalled: -</p> - -<ul> -<li>When an external call is set up in <tt>suspendThread()</tt>.</li> -<li>When a new Haskell thread is created (e.g., whenever -<tt>Concurrent.forkIO</tt> is called from within Haskell); this is -signalled in <tt>Schedule.c:scheduleThread_()</tt>. -</li> -<li>Whenever a Haskell thread is removed from a 'blocking queue' -attached to an MVar (only?). -</li> -</ul> - -<!---- *************************************** -----> -<h2 id="callin">Calling in</h2> - -Providing robust support for having multiple OS threads calling into -Haskell is not as involved as its dual. - -<ul> -<li>The OS thread issues the call to a Haskell function by going via -the <em>Rts API</em> (as specified in <tt>RtsAPI.h</tt>). -<li>Making the function application requires the construction of a -closure on the heap.
This is done in a thread-safe manner by having -the OS thread lock a designated block of memory (the 'Rts API' block, -which is part of the GC's root set) for the short period of time it -takes to construct the application. -<li>The OS thread then creates a new Haskell thread to execute the -function application, which (eventually) boils down to calling -<tt>Schedule.c:createThread()</tt> -<li> -Evaluation is kicked off by calling <tt>Schedule.c:scheduleExtThread()</tt>, -which asks the Task Manager to possibly create a new worker (OS) -thread to execute the Haskell thread. -<li> -After the OS thread has done this, it blocks waiting for the -Haskell thread to complete the evaluation of the Haskell function. -<p> -The reason why a separate worker thread is made to evaluate the Haskell -function and not the OS thread that made the call-in via the -Rts API, is that we want that OS thread to return as soon as possible. -We wouldn't be able to guarantee that if the OS thread entered the -RTS to (initially) just execute its function application, as the -Scheduler may side-track it and also ask it to evaluate other Haskell threads. -</li> -</ul> - -<p> -<strong>Note:</strong> As of 20020413, the implementation of the RTS API -only serializes access to the allocator between multiple OS threads wanting -to call into Haskell (via the RTS API.) It does not coordinate this access -to the allocator with that of the OS worker thread that's currently executing -within the RTS. This weakness/bug is scheduled to be tackled as part of an -overhaul/reworking of the RTS API itself. - - -<!---- *************************************** -----> -<h2>Subsystems introduced/modified</h2> - -<p> -These threads extensions affect the Scheduler portions of the runtime -system. To make it more manageable to work with, the changes -introduced a couple of new RTS 'sub-systems'. This section presents -the functionality and API of these sub-systems. 
- -</p> - -<!---- *************************************** -----> -<h3 id="capability">Capabilities</h3> - -<p> -A Capability represents the token required to execute STG code, -and all the state an OS thread/task needs to run Haskell code: -its STG registers, a pointer to its TSO, a nursery etc. During -STG execution, a pointer to the capability is kept in a -register (BaseReg). -</p> -<p> -Only in an SMP build will there be multiple capabilities; for -the threaded RTS and other non-threaded builds, there is only -one global capability, namely <tt>MainCapability</tt>. - -<p> -The Capability API is as follows: -<pre> -/* Capability.h */ -extern void initCapabilities(void); - -extern void grabReturnCapability(Mutex* pMutex, Capability** pCap); -extern void waitForWorkCapability(Mutex* pMutex, Capability** pCap, rtsBool runnable); -extern void releaseCapability(Capability* cap); - -extern void yieldToReturningWorker(Mutex* pMutex, Capability* cap); - -extern void grabCapability(Capability** cap); -</pre> - -<ul> -<li><tt>initCapabilities()</tt> initialises the subsystem. - -<li><tt>grabReturnCapability()</tt> is called by worker threads -returning from an external call. It blocks them waiting to gain -permissions to do so. - -<li><tt>waitForWorkCapability()</tt> is called by worker threads -already inside the RTS, but without any work to do. It blocks them -waiting for new work to become available. - -<li><tt>releaseCapability()</tt> hands back a capability. If a -'returning worker' is waiting, it is signalled that a capability -has become available. If not, <tt>releaseCapability()</tt> tries -to signal worker threads that are blocked waiting inside -<tt>waitForWorkCapability()</tt> that new work might now be -available. - -<li><tt>yieldToReturningWorker()</tt> is called by the worker thread -that's currently inside the Scheduler. It checks whether there are other -worker threads waiting to return from making an external call.
If so, -they're given preference and a capability is transferred between worker -threads. One of the waiting 'returning worker' threads is signalled and made -runnable, with the other, yielding, worker blocking to re-acquire -a capability. -</ul> - -<p> -The condition variables used to implement the synchronisation between -worker consumers and providers are local to the Capability -implementation. See source for details and comments. -</p> - -<!---- *************************************** -----> -<h3 id="taskman">The Task Manager</h3> - -<p> -The Task Manager API is responsible for managing the creation of -OS worker RTS threads. When a Haskell thread wants to make an -external call, the Task Manager is asked to possibly create a -new worker thread to take over the RTS-executing capability of -the worker thread that's exiting the RTS to execute the external call. - -<p> -The Capability subsystem keeps track of idle worker threads, so -making an informed decision about whether or not to create a new OS -worker thread is easy work for the task manager. The Task manager -provides the following API: -</p> - -<pre> -/* Task.h */ -extern void startTaskManager ( nat maxTasks, void (*taskStart)(void) ); -extern void stopTaskManager ( void ); - -extern void startTask ( void (*taskStart)(void) ); -</pre> - -<ul> -<li><tt>startTaskManager()</tt> and <tt>stopTaskManager()</tt> starts -up and shuts down the subsystem. When starting up, you have the option -to limit the overall number of worker threads that can be -created. An unbounded (modulo OS thread constraints) number of threads -is created if you pass '0'. -<li><tt>startTask()</tt> is called when a worker thread calls -<tt>suspendThread()</tt> to service an external call, asking another -worker thread to take over its RTS-executing capability. It is also -called when an external OS thread invokes a Haskell function via the -<em>Rts API</em>. 
-</ul> - -<!---- *************************************** -----> -<h3>Native threads API</h3> - -To hide OS details, the following API is used by the task manager and -the scheduler to interact with an OS' threads API: - -<pre> -/* OSThreads.h */ -typedef <em>..OS specific..</em> Mutex; -extern void initMutex ( Mutex* pMut ); -extern void grabMutex ( Mutex* pMut ); -extern void releaseMutex ( Mutex* pMut ); - -typedef <em>..OS specific..</em> Condition; -extern void initCondition ( Condition* pCond ); -extern void closeCondition ( Condition* pCond ); -extern rtsBool broadcastCondition ( Condition* pCond ); -extern rtsBool signalCondition ( Condition* pCond ); -extern rtsBool waitCondition ( Condition* pCond, - Mutex* pMut ); - -extern OSThreadId osThreadId ( void ); -extern void shutdownThread ( void ); -extern void yieldThread ( void ); -extern int createOSThread ( OSThreadId* tid, - void (*startProc)(void) ); -</pre> - - - -<!---- *************************************** -----> -<h2>User-level interface</h2> - -To signal that you want an external call to be serviced by a separate -OS thread, you have to add the attribute <tt>threadsafe</tt> to -a foreign import declaration, i.e., - -<pre> -foreign import "bigComp" threadsafe largeComputation :: Int -> IO () -</pre> - -<p> -The distinction between 'safe' and thread-safe C calls is made -so that we may call external functions that aren't re-entrant but may -cause a GC to occur. -<p> -The <tt>threadsafe</tt> attribute subsumes <tt>safe</tt>. -</p> - -<!---- *************************************** -----> -<h2>Building the GHC RTS</h2> - -The multi-threaded extension isn't currently enabled by default. To -have it built, you need to run the <tt>fptools</tt> configure script -with the extra option <tt>--enable-threaded-rts</tt> turned on, and -then proceed to build the compiler as per normal. 
- -<hr> -<small> -<!-- hhmts start --> Last modified: Wed Apr 10 14:21:57 Pacific Daylight Time 2002 <!-- hhmts end --> -</small> -</body> </html> - diff --git a/docs/comm/rts-libs/non-blocking.html b/docs/comm/rts-libs/non-blocking.html deleted file mode 100644 index 627bde8d88..0000000000 --- a/docs/comm/rts-libs/non-blocking.html +++ /dev/null @@ -1,133 +0,0 @@ -<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN"> -<html> - <head> - <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1"> - <title>The GHC Commentary - Non-blocking I/O on Win32</title> - </head> - - <body BGCOLOR="FFFFFF"> - <h1>The GHC Commentary - Non-blocking I/O on Win32</h1> - <p> - -This note discusses the implementation of non-blocking I/O on -Win32 platforms. It is not implemented yet (Apr 2002), but it seems worth -capturing the ideas. Thanks to Sigbjorn for writing them. - -<h2> Background</h2> - -GHC has provided non-blocking I/O support for Concurrent Haskell -threads on platforms that provide 'UNIX-style' non-blocking I/O for -quite a while. That is, platforms that let you alter the property of a -file descriptor to instead of having a thread block performing an I/O -operation that cannot be immediately satisfied, the operation returns -back a special error code (EWOULDBLOCK.) When that happens, the CH -thread that made the blocking I/O request is put into a blocked-on-IO -state (see Foreign.C.Error.throwErrnoIfRetryMayBlock). The RTS will -in a timely fashion check to see whether I/O is again possible -(via a call to select()), and if it is, unblock the thread & have it -re-try the I/O operation. The result is that other Concurrent Haskell -threads won't be affected, but can continue operating while a thread -is blocked on I/O. -<p> -Non-blocking I/O hasn't been supported by GHC on Win32 platforms, for -the simple reason that it doesn't provide the OS facilities described -above. 
- -<h2>Win32 non-blocking I/O, attempt 1</h2> - -Win32 does provide something select()-like, namely the -WaitForMultipleObjects() API. It takes an array of kernel object -handles plus a timeout interval, and waits for either one (or all) of -them to become 'signalled'. A handle representing an open file (for -reading) becomes signalled once there is input available. -<p> -So, it is possible to observe that I/O is possible using this -function, but not whether there's "enough" to satisfy the I/O request. -So, if we were to mimic select() usage with WaitForMultipleObjects(), -we'd correctly avoid blocking initially, but a thread may very well -block waiting for its I/O request to be satisfied once the file -handle has become signalled. [There is a fix for this -- only read -and write one byte at a time -- but I'm not advocating that.] - - -<h2>Win32 non-blocking I/O, attempt 2</h2> - -Asynchronous I/O on Win32 is supported via 'overlapped I/O'; that is, -asynchronous read and write requests can be made via the ReadFile() / -WriteFile() APIs, specifying position and length of the operation. -If the I/O requests cannot be handled right away, the APIs won't -block, but return immediately (and report ERROR_IO_PENDING as their -status code.) -<p> -The completion of the request can be reported in a number of ways: -<ul> - <li> synchronously, by blocking inside Read/WriteFile(). (this is the - non-overlapped case, really.) -<p> - - <li> as part of the overlapped I/O request, pass a HANDLE to an event - object. The I/O system will signal this event once the request - completes, which a waiting thread will then be able to see. -<p> - - <li> by supplying a pointer to a completion routine, which will be - called as an Asynchronous Procedure Call (APC) whenever a thread - calls one of a select group of 'alertable' APIs. -<p> - - <li> by associating the file handle with an I/O completion port.
Once - the request completes, the thread servicing the I/O completion - port will be notified. -</ul> -The use of an I/O completion port looks the most interesting to GHC, -as it provides a central point where all I/O requests are reported. -<p> -Note: asynchronous I/O is only fully supported by OSes based on -the NT codebase, i.e., Win9x doesn't permit async I/O on files and -pipes. However, Win9x does support async socket operations, and, -I'm currently guessing here, console I/O. In my view, it would -be acceptable to provide non-blocking I/O support for NT-based -OSes only. -<p> -Here's the design I currently have in mind: -<ul> -<li> Upon startup, an RTS helper thread whose only purpose is to service - an I/O completion port, is created. -<p> -<li> All files are opened in 'overlapped' mode, and associated - with an I/O completion port. -<p> -<li> Overlapped I/O requests are used to implement read() and write(). -<p> -<li> If the request cannot be satisfied without blocking, the Haskell - thread is put on the blocked-on-I/O thread list & a re-schedule - is made. -<p> -<li> When the completion of a request is signalled via the I/O completion - port, the RTS helper thread will move the associated Haskell thread - from the blocked list onto the runnable list. (Clearly, care - is required here to have another OS thread mutate internal Scheduler - data structures.) - -<p> -<li> In the event all Concurrent Haskell threads are blocked waiting on - I/O, the main RTS thread blocks waiting on an event synchronisation - object, which the helper thread will signal whenever it makes - a Haskell thread runnable.
- -</ul> - -I might do the communication between the RTS helper thread and the -main RTS thread differently though: rather than have the RTS helper -thread manipulate thread queues itself, thus requiring careful -locking, just have it change a bit on the relevant TSO, which the main -RTS thread can check at regular intervals (in some analog of -awaitEvent(), for example). - - <p><small> -<!-- hhmts start --> -Last modified: Wed Aug 8 19:30:18 EST 2001 -<!-- hhmts end --> - </small> - </body> -</html> diff --git a/docs/comm/rts-libs/prelfound.html b/docs/comm/rts-libs/prelfound.html deleted file mode 100644 index 25407eed43..0000000000 --- a/docs/comm/rts-libs/prelfound.html +++ /dev/null @@ -1,57 +0,0 @@ -<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN"> -<html> - <head> - <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1"> - <title>The GHC Commentary - Prelude Foundations</title> - </head> - - <body BGCOLOR="FFFFFF"> - <h1>The GHC Commentary - Prelude Foundations</h1> - <p> - The standard Haskell Prelude as well as GHC's Prelude extensions are - constructed from GHC's <a href="primitives.html">primitives</a> in a - couple of layers. - - <h4><code>PrelBase.lhs</code></h4> - <p> - The most elementary Prelude definitions are collected in <a - href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/lib/std/PrelBase.lhs"><code>PrelBase.lhs</code></a>. - In particular, it defines the boxed versions of Haskell primitive types - - for example, <code>Int</code> is defined as - <blockquote><pre> -data Int = I# Int#</pre> - </blockquote> - <p> - This says that a boxed integer <code>Int</code> is formed by applying the - data constructor <code>I#</code> to an <em>unboxed</em> integer of type - <code>Int#</code>. Unboxed types are hardcoded in the compiler and - exported together with the <a href="primitives.html">primitive - operations</a> understood by GHC.
- <p> - <code>PrelBase.lhs</code> similarly defines basic types, such as - boolean values - <blockquote><pre> -data Bool = False | True deriving (Eq, Ord)</pre> - </blockquote> - <p> - the unit type - <blockquote><pre> -data () = ()</pre> - </blockquote> - <p> - and lists - <blockquote><pre> -data [] a = [] | a : [a]</pre> - </blockquote> - <p> - It also contains instance declarations for these types. In addition, - <code>PrelBase.lhs</code> contains some <a href="prelude.html">tricky - machinery</a> for efficient list handling. - - <p><small> -<!-- hhmts start --> -Last modified: Wed Aug 8 19:30:18 EST 2001 -<!-- hhmts end --> - </small> - </body> -</html> diff --git a/docs/comm/rts-libs/prelude.html b/docs/comm/rts-libs/prelude.html deleted file mode 100644 index c93e90dddc..0000000000 --- a/docs/comm/rts-libs/prelude.html +++ /dev/null @@ -1,121 +0,0 @@ -<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN"> -<html> - <head> - <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1"> - <title>The GHC Commentary - Cunning Prelude Code</title> - </head> - - <body BGCOLOR="FFFFFF"> - <h1>The GHC Commentary - Cunning Prelude Code</h1> - <p> - GHC uses many optimisations and GHC-specific techniques (unboxed - values, RULES pragmas, and so on) to make the heavily used Prelude code - as fast as possible. - - <hr> - <h4>Par, seq, and lazy</h4> - - In GHC.Conc you will find -<blockquote><pre> - pseq a b = a `seq` lazy b -</pre></blockquote> - What's this "lazy" thing? Well, <tt>pseq</tt> is a <tt>seq</tt> for a parallel setting. - We really mean "evaluate a, then b". But if the strictness analyser sees that pseq is strict - in b, then b might be evaluated <em>before</em> a, which is all wrong. -<p> -Solution: wrap the 'b' in a call to <tt>GHC.Base.lazy</tt>. This function is just the identity function, -except that it's put into the built-in environment in MkId.lhs.
That is, the MkId.lhs defn over-rides the -inlining and strictness information that comes in from GHC.Base.hi. And that makes <tt>lazy</tt> look -lazy, and have no inlining. So the strictness analyser gets no traction. -<p> -In the worker/wrapper phase, after strictness analysis, <tt>lazy</tt> is "manually" inlined (see WorkWrap.lhs), -so we get all the efficiency back. -<p> -This supersedes an earlier scheme involving an even grosser hack in which par# and seq# returned an -Int#. Now there is no seq# operator at all. - - - <hr> - <h4>fold/build</h4> - <p> - There is a lot of magic in <a - href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/lib/std/PrelBase.lhs"><code>PrelBase.lhs</code></a> - - among other things, the <a - href="http://haskell.cs.yale.edu/ghc/docs/latest/set/rewrite-rules.html">RULES - pragmas</a> implementing the <a - href="http://research.microsoft.com/Users/simonpj/Papers/deforestation-short-cut.ps.Z">fold/build</a> - optimisation. The code for <code>map</code> is - a good example for how it all works. In the prelude code for version - 5.03 it reads as follows: - <blockquote><pre> -map :: (a -> b) -> [a] -> [b] -map _ [] = [] -map f (x:xs) = f x : map f xs - --- Note eta expanded -mapFB :: (elt -> lst -> lst) -> (a -> elt) -> a -> lst -> lst -{-# INLINE [0] mapFB #-} -mapFB c f x ys = c (f x) ys - -{-# RULES -"map" [~1] forall f xs. map f xs = build (\c n -> foldr (mapFB c f) n xs) -"mapList" [1] forall f. foldr (mapFB (:) f) [] = map f -"mapFB" forall c f g. mapFB (mapFB c f) g = mapFB c (f.g) - #-}</pre> - </blockquote> - <p> - Up to (but not including) phase 1, we use the <code>"map"</code> rule to - rewrite all saturated applications of <code>map</code> with its - build/fold form, hoping for fusion to happen. In phase 1 and 0, we - switch off that rule, inline build, and switch on the - <code>"mapList"</code> rule, which rewrites the foldr/mapFB thing back - into plain map. 
- <p> - It's important that these two rules aren't both active at once - (along with build's unfolding) else we'd get an infinite loop - in the rules. Hence the activation control using explicit phase numbers. - <p> - The "mapFB" rule optimises compositions of map. - <p> - The mechanism as described above is new in 5.03 since January 2002, - where the <code>[~</code><i>N</i><code>]</code> syntax for phase number - annotations at rules was introduced. Before that the whole arrangement - was more complicated, as the corresponding prelude code for version - 4.08.1 shows: - <blockquote><pre> -map :: (a -> b) -> [a] -> [b] -map = mapList - --- Note eta expanded -mapFB :: (elt -> lst -> lst) -> (a -> elt) -> a -> lst -> lst -mapFB c f x ys = c (f x) ys - -mapList :: (a -> b) -> [a] -> [b] -mapList _ [] = [] -mapList f (x:xs) = f x : mapList f xs - -{-# RULES -"map" forall f xs. map f xs = build (\c n -> foldr (mapFB c f) n xs) -"mapFB" forall c f g. mapFB (mapFB c f) g = mapFB c (f.g) -"mapList" forall f. foldr (mapFB (:) f) [] = mapList f - #-}</pre> - </blockquote> - <p> - This code is structured as it is, because the "map" rule first - <em>breaks</em> the map <em>open,</em> which exposes it to the various - foldr/build rules, and if no foldr/build rule matches, the "mapList" - rule <em>closes</em> it again in a later phase of optimisation - after - build was inlined. As a consequence, the whole thing depends a bit on - the timing of the various optimisations (the map might be closed again - before any of the foldr/build rules fires). To make the timing - deterministic, <code>build</code> gets a <code>{-# INLINE 2 build - #-}</code> pragma, which delays <code>build</code>'s inlining, and thus, - the closing of the map. [NB: Phase numbering was forward at that time.] 
-
-    <p><small>
-<!-- hhmts start -->
-Last modified: Mon Feb 11 20:00:49 EST 2002
-<!-- hhmts end -->
-    </small>
-  </body>
-</html>
diff --git a/docs/comm/rts-libs/primitives.html b/docs/comm/rts-libs/primitives.html
deleted file mode 100644
index 28abc79426..0000000000
--- a/docs/comm/rts-libs/primitives.html
+++ /dev/null
@@ -1,70 +0,0 @@
-<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
-<html>
-  <head>
-    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
-    <title>The GHC Commentary - Primitives</title>
-  </head>
-
-  <body BGCOLOR="FFFFFF">
-    <h1>The GHC Commentary - Primitives</h1>
-    <p>
-      Most user-level Haskell types and functions provided by GHC (in
-      particular those from the Prelude and GHC's Prelude extensions) are
-      internally constructed from even more elementary types and functions.
-      Most notably, GHC understands a notion of <em>unboxed types,</em> which
-      are the Haskell representation of primitive bit-level integer, float,
-      etc. types (as opposed to their boxed, heap allocated counterparts) -
-      cf. <a
-      href="http://research.microsoft.com/Users/simonpj/Papers/unboxed-values.ps.Z">"Unboxed
-      Values as First Class Citizens."</a>
-
-    <h4>The Ultimate Source of Primitives</h4>
-    <p>
-      The hardwired types of GHC are brought into scope by the module
-      <code>PrelGHC</code>.  This module only exists in the form of a
-      handwritten interface file <a
-      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/lib/std/PrelGHC.hi-boot"><code>PrelGHC.hi-boot</code>,</a>
-      which lists the type and function names, as well as instance
-      declarations.  The actual types of these names, as well as their
-      implementations, are hardwired into GHC.  Note that the names in this
-      file are z-encoded, and in particular, identifiers ending in <code>zh</code>
-      denote user-level identifiers ending in a hash mark (<code>#</code>),
-      which is used to flag unboxed values or functions operating on unboxed
-      values.
For example, we have <code>Char#</code>, <code>ord#</code>, and
-      so on.
-
-    <h4>The New Primitive Definition Scheme</h4>
-    <p>
-      As of (about) the development version 4.11, the types and various
-      properties of primitive operations are defined in the file <a
-      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/prelude/primops.txt.pp"><code>primops.txt.pp</code></a>.
-      (Personally, I don't think that the <code>.txt</code> suffix is really
-      appropriate, as the file is used for automatic code generation; the
-      recent addition of <code>.pp</code> means that the file is now mangled
-      by cpp.)
-    <p>
-      The utility <a
-      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/utils/genprimopcode/"><code>genprimopcode</code></a>
-      generates a series of Haskell files from <code>primops.txt</code>, which
-      encode the types and various properties of the primitive operations as
-      compiler internal data structures.  These Haskell files are not complete
-      modules, but program fragments, which are included into compiler modules
-      during the GHC build process.  The generated include files can be found
-      in the directory <code>fptools/ghc/compiler/</code> and carry names
-      matching the pattern <code>primop-*.hs-incl</code>.  They are generated
-      during the execution of the <code>boot</code> target in the
-      <code>fptools/ghc/</code> directory.  This scheme significantly
-      simplifies the maintenance of primitive operations.
-    <p>
-      As of development version 5.02, the <code>primops.txt</code> file also allows the
-      recording of documentation about intended semantics of the primitives.  This can
-      be extracted into a LaTeX document (or rather, into LaTeX document fragments)
-      via an appropriate switch to <code>genprimopcode</code>.  In particular, see <code>primops.txt</code>
-      for full details of how GHC is configured to cope with different machine word sizes.
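For illustration, an entry in <code>primops.txt.pp</code> looks roughly like the following. This is a reconstructed sketch, so the exact attribute syntax and category names may differ between GHC versions; consult the file itself for the authoritative format:

```
primop   IntAddOp   "+#"   Dyadic
   Int# -> Int# -> Int#
   with commutable = True
```

Each entry gives the compiler-internal name of the operation, its user-level (hash-marked) name, a rough category, its type, and optional attributes that the code generator and simplifier consult.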
-
-    <p><small>
-<!-- hhmts start -->
-Last modified: Mon Nov 26 18:03:16 EST 2001
-<!-- hhmts end -->
-    </small>
-  </body>
-</html>
diff --git a/docs/comm/rts-libs/stgc.html b/docs/comm/rts-libs/stgc.html
deleted file mode 100644
index 196ec9150d..0000000000
--- a/docs/comm/rts-libs/stgc.html
+++ /dev/null
@@ -1,45 +0,0 @@
-<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
-<html>
-  <head>
-    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
-    <title>The GHC Commentary - Spineless Tagless C</title>
-  </head>
-
-  <body BGCOLOR="FFFFFF">
-    <h1>The GHC Commentary - Spineless Tagless C</h1>
-    <p>
-      The C code generated by GHC deliberately avoids higher-level features of
-      C, so as to control as precisely as possible what code is generated.
-      Moreover, it uses special features of gcc (such as first-class labels)
-      to produce more efficient code.
-    <p>
-      STG C makes ample use of C's macro language to define idioms, which also
-      reduces the size of the generated C code (thus, reducing I/O times).
-      These macros are defined in the C headers located in GHC's <a
-      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/includes/"><code>includes</code></a>
-      directory.
-
-    <h4><code>TailCalls.h</code></h4>
-    <p>
-      <a
-      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/includes/TailCalls.h"><code>TailCalls.h</code></a>
-      defines how tail calls are implemented - and in particular - optimised
-      in GHC generated code.  The default case, for an architecture for which
-      GHC is not optimised, is to use the mini interpreter described in the <a
-      href="http://research.microsoft.com/copyright/accept.asp?path=/users/simonpj/papers/spineless-tagless-gmachine.ps.gz&pub=34">STG paper.</a>
-    <p>
-      For supported architectures, various tricks are used to generate
-      assembler implementing proper tail calls.  On i386, gcc's first class
-      labels are used to directly jump to a function pointer.
Furthermore, - markers of the form <code>--- BEGIN ---</code> and <code>--- END - ---</code> are added to the assembly right after the function prologue - and before the epilogue. These markers are used by <a - href="../the-beast/mangler.html">the Evil Mangler.</a> - - <p><small> -<!-- hhmts start --> -Last modified: Wed Aug 8 19:28:29 EST 2001 -<!-- hhmts end --> - </small> - </body> -</html> diff --git a/docs/comm/rts-libs/threaded-rts.html b/docs/comm/rts-libs/threaded-rts.html deleted file mode 100644 index 739dc8d58a..0000000000 --- a/docs/comm/rts-libs/threaded-rts.html +++ /dev/null @@ -1,126 +0,0 @@ -<html> - <head> - <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1"> - <title>The GHC Commentary - The Multi-threaded runtime, and multiprocessor execution</title> - </head> - - <body> - <h1>The GHC Commentary - The Multi-threaded runtime, and multiprocessor execution</h1> - - <p>This section of the commentary explains the structure of the runtime system - when used in threaded or SMP mode.</p> - - <p>The <em>threaded</em> version of the runtime supports - bound threads and non-blocking foreign calls, and an overview of its - design can be found in the paper <a - href="http://www.haskell.org/~simonmar/papers/conc-ffi.pdf">Extending - the Haskell Foreign Function Interface with Concurrency</a>. To - compile the runtime with threaded support, add the line - -<pre>GhcRTSWays += thr</pre> - - to <tt>mk/build.mk</tt>. When building C code in the runtime for the threaded way, - the symbol <tt>THREADED_RTS</tt> is defined (this is arranged by the - build system when building for way <tt>thr</tt>, see - <tt>mk/config.mk</tt>). 
To build a Haskell program
-    with the threaded runtime, pass the flag <tt>-threaded</tt> to GHC (this
-    can be used in conjunction with <tt>-prof</tt>, and possibly
-    <tt>-debug</tt> and others depending on which versions of the RTS have
-    been built).</p>
-
-    <p>The <em>SMP</em> version of the runtime supports the same facilities as the
-    threaded version, and in addition supports execution of Haskell code by
-    multiple simultaneous OS threads.  For SMP support, both the runtime and
-    the libraries must be built a special way: add the lines
-
-    <pre>
-GhcRTSWays += thr
-GhcLibWays += s</pre>
-
-    to <tt>mk/build.mk</tt>.  To build Haskell code for
-    SMP execution, pass the flag <tt>-smp</tt> to GHC (this can be used in
-    conjunction with <tt>-debug</tt>, but no other way-flags at this time).
-    When building C code in the runtime for SMP
-    support, the symbol <tt>SMP</tt> is defined (this is arranged by the
-    compiler when the <tt>-smp</tt> flag is given, see
-    <tt>ghc/compiler/main/StaticFlags.hs</tt>).</p>
-
-    <p>When building the runtime in either the threaded or SMP ways, the symbol
-    <tt>RTS_SUPPORTS_THREADS</tt> will be defined (see <tt>Rts.h</tt>).</p>
-
-    <h2>Overall design</h2>
-
-    <p>The system is based around the notion of a <tt>Capability</tt>.  A
-    <tt>Capability</tt> is an object that represents both the permission to
-    execute some Haskell code, and the state required to do so.  In order
-    to execute some Haskell code, a thread must therefore hold a
-    <tt>Capability</tt>.  The available pool of capabilities is managed by
-    the <tt>Capability</tt> API, described below.</p>
-
-    <p>In the threaded runtime, there is only a single <tt>Capability</tt> in the
-    system, indicating that only a single thread can be executing Haskell
-    code at any one time.
In the SMP runtime, there can be an arbitrary - number of capabilities selectable at runtime with the <tt>+RTS -N<em>n</em></tt> - flag; in practice the number is best chosen to be the same as the number of - processors on the host machine.</p> - - <p>There are a number of OS threads running code in the runtime. We call - these <em>tasks</em> to avoid confusion with Haskell <em>threads</em>. - Tasks are managed by the <tt>Task</tt> subsystem, which is mainly - concerned with keeping track of statistics such as how much time each - task spends executing Haskell code, and also keeping track of how many - tasks are around when we want to shut down the runtime.</p> - - <p>Some tasks are created by the runtime itself, and some may be here - as a result of a call to Haskell from foreign code (we - call this an in-call). The - runtime can support any number of concurrent foreign in-calls, but the - number of these calls that will actually run Haskell code in parallel is - determined by the number of available capabilities. Each in-call creates - a <em>bound thread</em>, as described in the FFI/Concurrency paper (cited - above).</p> - - <p>In the future we may want to bind a <tt>Capability</tt> to a particular - processor, so that we can support a notion of affinity - avoiding - accidental migration of work from one CPU to another, so that we can make - best use of a CPU's local cache. For now, the design ignores this - issue.</p> - - <h2>The <tt>OSThreads</tt> interface</h2> - - <p>This interface is merely an abstraction layer over the OS-specific APIs - for managing threads. 
It has two main implementations: Win32 and - POSIX.</p> - - <p>This is the entirety of the interface:</p> - -<pre> -/* Various abstract types */ -typedef Mutex; -typedef Condition; -typedef OSThreadId; - -extern OSThreadId osThreadId ( void ); -extern void shutdownThread ( void ); -extern void yieldThread ( void ); -extern int createOSThread ( OSThreadId* tid, - void (*startProc)(void) ); - -extern void initCondition ( Condition* pCond ); -extern void closeCondition ( Condition* pCond ); -extern rtsBool broadcastCondition ( Condition* pCond ); -extern rtsBool signalCondition ( Condition* pCond ); -extern rtsBool waitCondition ( Condition* pCond, - Mutex* pMut ); - -extern void initMutex ( Mutex* pMut ); - </pre> - - <h2>The Task interface</h2> - - <h2>The Capability interface</h2> - - <h2>Multiprocessor Haskell Execution</h2> - - </body> -</html> diff --git a/docs/comm/the-beast/alien.html b/docs/comm/the-beast/alien.html deleted file mode 100644 index 3d4776ebc9..0000000000 --- a/docs/comm/the-beast/alien.html +++ /dev/null @@ -1,56 +0,0 @@ -<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN"> -<html> - <head> - <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1"> - <title>The GHC Commentary - Alien Functions</title> - </head> - - <body BGCOLOR="FFFFFF"> - <h1>The GHC Commentary - Alien Functions</h1> - <p> - GHC implements experimental (by now it is actually quite well tested) - support for access to foreign functions and generally the interaction - between Haskell code and code written in other languages. Code - generation in this context can get quite tricky. This section attempts - to cast some light on this aspect of the compiler. - - <h4>FFI Stub Files</h4> - <p> - For each Haskell module that contains a <code>foreign export - dynamic</code> declaration, GHC generates a <code>_stub.c</code> file - that needs to be linked with any program that imports the Haskell - module. 
When asked about it <a
-      href="mailto:simonmar@microsoft.com">Simon Marlow</a> justified the
-      existence of these files as follows:
-    <blockquote>
-      The stub files contain the helper function which invokes the Haskell
-      code when called from C.
-      <p>
-        Each time the foreign export dynamic is invoked to create a new
-        callback function, a small piece of code has to be dynamically
-        generated (by code in <a
-        href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/rts/Adjustor.c"><code>Adjustor.c</code></a>).  It is the address of this dynamically generated bit of
-        code that is returned as the <code>Addr</code> (or <a
-        href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/hslibs/lang/Ptr.lhs"><code>Ptr</code></a>).
-        When called from C, the dynamically generated code must somehow invoke
-        the Haskell function which was originally passed to the
-        f.e.d. function -- it does this by invoking the helper function,
-        passing it a <a
-        href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/hslibs/lang/StablePtr.lhs"><code>StablePtr</code></a>
-        to the Haskell function.  It's split this way for two reasons: the
-        same helper function can be used each time the f.e.d. function is
-        called, and to keep the amount of dynamically generated code to a
-        minimum.
-    </blockquote>
-    <p>
-      The stub code is generated by <a
-      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/deSugar/DsForeign.lhs"><code>DsForeign</code></a><code>.fexportEntry</code>.
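A minimal sketch of a module that triggers all of the machinery above. The module and function names are illustrative, and modern hierarchical module names are used for concreteness; note also that in the standardized FFI this declaration form was later superseded by <tt>foreign import ccall "wrapper"</tt>:

```haskell
module Callback where

import Foreign.Ptr (FunPtr)

-- Compiling this module produces a Callback_stub.c containing the helper
-- function described above.  Applying mkHandler to a Haskell function
-- allocates a fresh piece of adjustor code (via Adjustor.c) and returns
-- its address as a C function pointer; calling that pointer from C
-- re-enters Haskell through the helper and a StablePtr to the function.
foreign export dynamic
  mkHandler :: (Int -> Int) -> IO (FunPtr (Int -> Int))
```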
-
-
-    <p><small>
-<!-- hhmts start -->
-Last modified: Fri Aug 10 11:47:41 EST 2001
-<!-- hhmts end -->
-    </small>
-  </body>
-</html>
diff --git a/docs/comm/the-beast/basicTypes.html b/docs/comm/the-beast/basicTypes.html
deleted file mode 100644
index b411e4c5a9..0000000000
--- a/docs/comm/the-beast/basicTypes.html
+++ /dev/null
@@ -1,132 +0,0 @@
-<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
-<html>
-  <head>
-    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
-    <title>The GHC Commentary - The Basics</title>
-  </head>
-
-  <body BGCOLOR="FFFFFF">
-    <h1>The GHC Commentary - The Basics</h1>
-    <p>
-      The directory <a
-      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/basicTypes/"><code>fptools/ghc/compiler/basicTypes/</code></a>
-      contains modules that define some of the essential type definitions for
-      the compiler - such as identifiers, variables, modules, and unique
-      names.  Some of those are discussed in the following.  See elsewhere for more
-      detailed information on:
-    <ul>
-      <li> <a href="vars.html"><code>Var</code>s, <code>Id</code>s, and <code>TyVar</code>s</a>
-      <li> <a href="renamer.html"><code>OccName</code>s, <code>RdrName</code>s, and <code>Name</code>s</a>
-    </ul>
-
-    <h2>Elementary Types</h2>
-
-    <h4><code>Id</code>s</h4>
-    <p>
-      An <code>Id</code> (defined in <a
-      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/basicTypes/Id.lhs"><code>Id.lhs</code></a>)
-      essentially records information about value and data constructor
-      identifiers -- to be precise, in the case of data constructors, two
-      <code>Id</code>s are used to represent the worker and wrapper functions
-      for the data constructor, respectively.  The information maintained in
-      the <code>Id</code> abstraction includes among other items strictness,
-      occurrence, specialisation, and unfolding information.
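Concretely, one can picture the arrangement roughly as follows. This is a simplified sketch with illustrative field and constructor names; GHC's real definitions are richer:

```haskell
-- A simplified picture of an Id-as-variable (illustrative names only):
data Var = Var
  { varName :: Name     -- identity: occurrence name plus unique
  , varType :: Type     -- the identifier's type
  , varInfo :: IdInfo   -- strictness, occurrence, specialisation,
                        -- and unfolding information
  }
```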
-    <p>
-      Due to the way <code>Id</code>s are used for data constructors,
-      all <code>Id</code>s are represented as variables, which contain a
-      <code>varInfo</code> field of abstract type <code><a
-      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/basicTypes/IdInfo.lhs">IdInfo</a>.IdInfo</code>.
-      This is where the information about <code>Id</code>s is really stored.
-      The following is a (currently, partial) list of the various items in an
-      <code>IdInfo</code>:
-    <p>
-    <dl>
-      <dt><a name="occInfo">Occurrence information</a>
-      <dd>The <code>OccInfo</code> data type is defined in the module <a
-        href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/basicTypes/BasicTypes.lhs"><code>BasicTypes.lhs</code></a>.
-        Apart from the trivial <code>NoOccInfo</code>, it distinguishes
-        between variables that do not occur at all (<code>IAmDead</code>),
-        occur just once (<code>OneOcc</code>), or are <a
-        href="simplifier.html#loopBreaker">loop breakers</a>
-        (<code>IAmALoopBreaker</code>).
-    </dl>
-
-    <h2>Sets, Finite Maps, and Environments</h2>
-    <p>
-      Sets of variables, or more generally names, which are needed throughout
-      the compiler, are provided by the modules <a
-      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/basicTypes/VarSet.lhs"><code>VarSet.lhs</code></a>
-      and <a
-      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/basicTypes/NameSet.lhs"><code>NameSet.lhs</code></a>,
-      respectively.  Moreover, maps from variables (or names) to
-      other data are frequently needed.  For example, a substitution is represented by a
-      finite map from variable names to expressions.
Jobs like this are
-      solved by means of variable and name environments implemented by the
-      modules <a
-      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/basicTypes/VarEnv.lhs"><code>VarEnv.lhs</code></a>
-      and <a
-      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/basicTypes/NameEnv.lhs"><code>NameEnv.lhs</code></a>.
-
-    <h4>The Module <code>VarSet</code></h4>
-    <p>
-      The Module <code>VarSet</code> provides the types <code>VarSet</code>,
-      <code>IdSet</code>, and <code>TyVarSet</code>, which are synonyms in the
-      current implementation, as <code>Var</code>, <code>Id</code>, and
-      <code>TyVar</code> are synonyms.  The module provides all the operations
-      that one would expect including the creation of sets from individual
-      variables and lists of variables, union and intersection operations,
-      element checks, deletion, filter, fold, and map functions.
-    <p>
-      The implementation is based on <a
-      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/utils/UniqSet.lhs"><code>UniqSet</code></a>s,
-      which in turn are simply <a href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/utils/UniqFM.lhs"><code>UniqFM</code></a>s
-      (i.e., finite maps with uniques as keys) that map each unique to the
-      variable that it represents.
-
-    <h4>The Module <code>NameSet</code></h4>
-    <p>
-      The Module <code>NameSet</code> provides the same functionality as
-      <code>VarSet</code> only for <a
-      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/basicTypes/Name.lhs"><code>Name</code></a>s.
-      As for the difference between <code>Name</code>s and <code>Var</code>s,
-      a <code>Var</code> is built from a <code>Name</code> plus additional
-      information (most importantly, type information).
-
-    <h4>The Module <code>VarEnv</code></h4>
-    <p>
-      The module <code>VarEnv</code> provides the types <code>VarEnv</code>,
-      <code>IdEnv</code>, and <code>TyVarEnv</code>, which are again
-      synonyms.
The provided base functionality is similar to - <code>VarSet</code> with the main difference that a type <code>VarEnv - T</code> associates a value of type <code>T</code> with each variable in - the environment, thus effectively implementing a finite map from - variables to values of type <code>T</code>. - <p> - The implementation of <code>VarEnv</code> is also by <a - href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/utils/UniqFM.lhs"><code>UniqFM</code></a>, - which entails the slightly surprising implication that it is - <em>not</em> possible to retrieve the domain of a variable environment. - In other words, there is no function corresponding to - <code>VarSet.varSetElems :: VarSet -> [Var]</code> in - <code>VarEnv</code>. This is because the <code>UniqFM</code> used to - implement <code>VarEnv</code> stores only the unique corresponding to a - variable in the environment, but not the entire variable (and there is - no mapping from uniques to variables). - <p> - In addition to plain variable environments, the module also contains - special substitution environments - the type <code>SubstEnv</code> - - that associates variables with a special purpose type - <code>SubstResult</code>. - - <h4>The Module <code>NameEnv</code></h4> - <p> - The type <code>NameEnv.NameEnv</code> is like <code>VarEnv</code> only - for <code>Name</code>s. 
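The asymmetry between <code>VarSet</code> and <code>VarEnv</code> can be pictured with a simplified model. The definitions below are illustrative only (GHC's <code>UniqFM</code> is not literally a <code>Data.Map</code>), but they show why the set has a retrievable domain while the environment does not:

```haskell
import qualified Data.Map as M

type Unique   = Int
data Var      = Var Unique String        -- stand-in for the real Var

type UniqFM a = M.Map Unique a           -- keyed on uniques only

type VarSet   = UniqFM Var   -- the Var itself is the payload, so
                             -- varSetElems = M.elems recovers the domain
type VarEnv a = UniqFM a     -- only the payload 'a' is stored; a bare
                             -- Unique key cannot be turned back into a Var
```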
-
-    <p><hr><small>
-<!-- hhmts start -->
-Last modified: Tue Jan  8 18:29:52 EST 2002
-<!-- hhmts end -->
-    </small>
-  </body>
-</html>
diff --git a/docs/comm/the-beast/coding-style.html b/docs/comm/the-beast/coding-style.html
deleted file mode 100644
index 41347c6902..0000000000
--- a/docs/comm/the-beast/coding-style.html
+++ /dev/null
@@ -1,230 +0,0 @@
-<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
-<html>
-  <head>
-    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
-    <title>The GHC Commentary - Coding Style Guidelines</title>
-  </head>
-
-  <body BGCOLOR="FFFFFF">
-    <h1>The GHC Commentary - Coding Style Guidelines</h1>
-
-    <p>This is a rough description of some of the coding practices and
-    style that we use for Haskell code inside <tt>ghc/compiler</tt>.
-
-    <p>The general rule is to stick to the same coding style as is
-    already used in the file you're editing.  If you must make
-    stylistic changes, commit them separately from functional changes,
-    so that someone looking back through the change logs can easily
-    distinguish them.
-
-    <h2>To literate or not to literate?</h2>
-
-    <p>In GHC we use a mixture of literate (<tt>.lhs</tt>) and
-    non-literate (<tt>.hs</tt>) source.  I (Simon M.) prefer to use
-    non-literate style, because I think the
-    <tt>\begin{code}..\end{code}</tt> brackets clutter up the source too much,
-    and I like to use Haddock-style comments (we haven't tried
-    processing the whole of GHC with Haddock yet, though).
-
-    <h2>To CPP or not to CPP?</h2>
-
-    <p>We pass all the compiler sources through CPP.  The
-    <tt>-cpp</tt> flag is always added by the build system.
-
-    <p>The following CPP symbols are used throughout the compiler:
-
-    <dl>
-      <dt><tt>DEBUG</tt></dt>
-
-      <dd>Used to enable extra checks and debugging output in the
-      compiler.  The <tt>ASSERT</tt> macro (see <tt>HsVersions.h</tt>)
-      provides assertions which disappear when <tt>DEBUG</tt> is not
-      defined.
-
-      <p>All debugging output should be placed inside <tt>#ifdef
-      DEBUG</tt>; we generally use this to provide warnings about
-      strange cases and things that might warrant investigation.  When
-      <tt>DEBUG</tt> is off, the compiler should normally be silent
-      unless something goes wrong (except when the verbosity level
-      is greater than zero).
-
-      <p>A good rule of thumb is that <tt>DEBUG</tt> shouldn't add
-      more than about 10-20% to the compilation time.  This is the case
-      at the moment.  If it gets too expensive, we won't use it.  For
-      more expensive runtime checks, consider adding a flag - see for
-      example <tt>-dcore-lint</tt>.
-      </dd>
-
-      <dt><tt>GHCI</tt></dt>
-
-      <dd>Enables GHCi support, including the byte code generator and
-      interactive user interface.  This isn't the default, because the
-      compiler needs to be bootstrapped with itself in order for GHCi
-      to work properly.  The reason is that the byte-code compiler and
-      linker are quite closely tied to the runtime system, so it is
-      essential that GHCi is linked with the most up-to-date RTS.
-      Another reason is that the representation of certain datatypes
-      must be consistent between GHCi and its libraries, and if these
-      were inconsistent then disaster could follow.
-      </dd>
-
-    </dl>
-
-    <h2>Platform tests</h2>
-
-    <p>There are three platforms of interest to GHC:
-
-    <ul>
-      <li>The <b>Build</b> platform.  This is the platform on which we
-      are building GHC.</li>
-      <li>The <b>Host</b> platform.  This is the platform on which we
-      are going to run this GHC binary, and associated tools.</li>
-      <li>The <b>Target</b> platform.  This is the platform for which
-      this GHC binary will generate code.</li>
-    </ul>
-
-    <p>At the moment, there is very limited support for having
-    different values for build, host, and target.  In particular:</p>
-
-    <ul>
-      <li>The build platform is currently always the same as the host
-      platform.
The build process needs to use some of the tools in - the source tree, for example <tt>ghc-pkg</tt> and - <tt>hsc2hs</tt>.</li> - - <li>If the target platform differs from the host platform, then - this is generally for the purpose of building <tt>.hc</tt> files - from Haskell source for porting GHC to the target platform. - Full cross-compilation isn't supported (yet).</li> - </ul> - - <p>In the compiler's source code, you may make use of the - following CPP symbols:</p> - - <ul> - <li><em>xxx</em><tt>_TARGET_ARCH</tt></li> - <li><em>xxx</em><tt>_TARGET_VENDOR</tt></li> - <li><em>xxx</em><tt>_TARGET_OS</tt></li> - <li><em>xxx</em><tt>_HOST_ARCH</tt></li> - <li><em>xxx</em><tt>_HOST_VENDOR</tt></li> - <li><em>xxx</em><tt>_HOST_OS</tt></li> - </ul> - - <p>where <em>xxx</em> is the appropriate value: - eg. <tt>i386_TARGET_ARCH</tt>. - - <h2>Compiler versions</h2> - - <p>GHC must be compilable by every major version of GHC from 5.02 - onwards, and itself. It isn't necessary for it to be compilable - by every intermediate development version (that includes last - week's CVS sources). - - <p>To maintain compatibility, use <tt>HsVersions.h</tt> (see - below) where possible, and try to avoid using <tt>#ifdef</tt> in - the source itself. - - <h2>The source file</h2> - - <p>We now describe a typical source file, annotating stylistic - choices as we go. - -<pre> -{-# OPTIONS ... #-} -</pre> - - <p>An <tt>OPTIONS</tt> pragma is optional, but if present it - should go right at the top of the file. Things you might want to - put in <tt>OPTIONS</tt> include: - - <ul> - <li><tt>-#include</tt> options to bring into scope prototypes - for FFI declarations</li> - <li><tt>-fvia-C</tt> if you know that - this module won't compile with the native code generator. - </ul> - - <p>Don't bother putting <tt>-cpp</tt> or <tt>-fglasgow-exts</tt> - in the <tt>OPTIONS</tt> pragma; these are already added to the - command line by the build system. 
-
-
-<pre>
-module Foo (
-        T(..),
-        foo,    -- :: T -> T
-  ) where
-</pre>
-
-    <p>We usually (99% of the time) include an export list.  The only
-    exceptions are perhaps where the export list would list absolutely
-    everything in the module, and even then sometimes we do it anyway.
-
-    <p>It's helpful to give type signatures inside comments in the
-    export list, but hard to keep them consistent, so we don't always
-    do that.
-
-<pre>
-#include "HsVersions.h"
-</pre>
-
-    <p><tt>HsVersions.h</tt> is a CPP header file containing a number
-    of macros that help smooth out the differences between compiler
-    versions.  It defines, for example, macros for library module
-    names which have moved between versions.  Take a look.
-
-<pre>
--- friends
-import SimplMonad
-
--- GHC
-import CoreSyn
-import Id               ( idName, idType )
-import BasicTypes
-
--- libraries
-import DATA_IOREF       ( newIORef, readIORef )
-
--- std
-import List             ( partition )
-import Maybe            ( fromJust )
-</pre>
-
-    <p>List imports in the following order:
-
-    <ul>
-      <li>Local to this subsystem (or directory) first</li>
-
-      <li>Compiler imports, generally ordered from specific to generic
      (ie. modules from <tt>utils/</tt> and <tt>basicTypes/</tt>
-      usually come last)</li>
-
-      <li>Library imports</li>
-
-      <li>Standard Haskell 98 imports last</li>
-    </ul>
-
-    <p>Import library modules from the <tt>base</tt> and
-    <tt>haskell98</tt> packages only.  Use <tt>#defines</tt> in
-    <tt>HsVersions.h</tt> when the module names differ between
-    versions of GHC (eg. <tt>DATA_IOREF</tt> in the example above).
-    For code inside <tt>#ifdef GHCI</tt>, you don't need to worry about GHC
-    versioning (because we are bootstrapped).
-
-    <p>We usually use import specs to give an explicit list of the
-    entities imported from a module.  The main reason for doing this is
-    so that you can search the file for an entity and find which module
-    it comes from.
However, huge import lists can be a pain to - maintain, so we often omit the import specs when they start to get - long (actually I start omitting them when they don't fit on one - line --Simon M.). Tip: use GHC's <tt>-fwarn-unused-imports</tt> - flag so that you get notified when an import isn't being used any - more. - - <p>If the module can be compiled multiple ways (eg. GHCI - vs. non-GHCI), make sure the imports are properly <tt>#ifdefed</tt> - too, so as to avoid spurious unused import warnings. - - <p><em>ToDo: finish this</em> - </body> -</html> diff --git a/docs/comm/the-beast/data-types.html b/docs/comm/the-beast/data-types.html deleted file mode 100644 index 4ec220c937..0000000000 --- a/docs/comm/the-beast/data-types.html +++ /dev/null @@ -1,242 +0,0 @@ -<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN"> -<html> - <head> - <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1"> - <title>The GHC Commentary - Data types and data constructors</title> - </head> - - <body BGCOLOR="FFFFFF"> - <h1>The GHC Commentary - Data types and data constructors</h1> - <p> - -This chapter was thoroughly changed Feb 2003. - -<h2>Data types</h2> - -Consider the following data type declaration: - -<pre> - data T a = MkT !(a,a) !(T a) | Nil - - f x = case x of - MkT p q -> MkT p (q+1) - Nil -> Nil -</pre> -The user's source program mentions only the constructors <tt>MkT</tt> -and <tt>Nil</tt>. However, these constructors actually <em>do</em> something -in addition to building a data value. For a start, <tt>MkT</tt> evaluates -its arguments. Secondly, with the flag <tt>-funbox-strict-fields</tt> GHC -will flatten (or unbox) the strict fields. So we may imagine that there's the -<em>source</em> constructor <tt>MkT</tt> and the <em>representation</em> constructor -<tt>MkT</tt>, and things start to get pretty confusing. 
-
-<p>
-GHC now generates three unique <tt>Name</tt>s for each data constructor:
-<pre>
-                             ---- OccName ------
-                     String    Name space    Used for
-  ---------------------------------------------------------------------------
-  The "source data con"       MkT     DataName    The DataCon itself
-  The "worker data con"       MkT     VarName     Its worker Id
-                                                  aka "representation data con"
-  The "wrapper data con"      $WMkT   VarName     Its wrapper Id (optional)
-</pre>
-Recall that each occurrence name (OccName) is a pair of a string and a
-name space (see <a href="names.html">The truth about names</a>), and
-two OccNames are considered the same only if both components match.
-That is what distinguishes the name of the DataCon from
-the name of its worker Id.  To keep things unambiguous, in what
-follows we'll write "MkT{d}" for the source data con, and "MkT{v}" for
-the worker Id.  (Indeed, when you dump stuff with "-ddumpXXX", if you
-also add "-dppr-debug" you'll get stuff like "Foo {- d rMv -}".  The
-"d" part is the name space; the "rMv" is the unique key.)
-<p>
-Each of these three names gets a distinct unique key in GHC's name cache.
-
-<h2>The life cycle of a data type</h2>
-
-Suppose the Haskell source looks like this:
-<pre>
-  data T a = MkT !(a,a) !Int | Nil
-
-  f x = case x of
-          Nil -> Nil
-          MkT p q -> MkT p (q+1)
-</pre>
-When the parser reads it in, it decides which name space each lexeme comes
-from, thus:
-<pre>
-  data T a = MkT{d} !(a,a) !Int | Nil{d}
-
-  f x = case x of
-          Nil{d} -> Nil{d}
-          MkT{d} p q -> MkT{d} p (q+1)
-</pre>
-Notice that in the Haskell source <em>all data constructors are named via the "source data con" MkT{d}</em>,
-whether in pattern matching or in expressions.
-<p> -In the translated source produced by the type checker (-ddump-tc), the program looks like this: -<pre> - f x = case x of - Nil{d} -> Nil{v} - MkT{d} p q -> $WMkT p (q+1) - -</pre> -Notice that the type checker replaces the occurrence of MkT by the <em>wrapper</em>, but -the occurrence of Nil by the <em>worker</em>. Reason: Nil doesn't have a wrapper because there is -nothing to do in the wrapper (this is the vastly common case). -<p> -Though they are not printed out by "-ddump-tc", behind the scenes, there are -also the following: the data type declaration and the wrapper function for MkT. -<pre> - data T a = MkT{d} a a Int# | Nil{d} - - $WMkT :: (a,a) -> Int -> T a - $WMkT p q = case p of - (a,b) -> case q of - I# q' -> MkT{v} a b q' -</pre> -Here, the <em>wrapper</em> <tt>$WMkT</tt> evaluates and takes apart the argument <tt>p</tt>, -evaluates and unboxes the argument <tt>q</tt>, and builds a three-field data value -with the <em>worker</em> constructor <tt>MkT{v}</tt>. (There are more notes below -about the unboxing of strict fields.) The wrapper $WMkT is called an <em>implicit binding</em>, -because it's introduced implicitly by the data type declaration (record selectors -are also implicit bindings, for example). Implicit bindings are injected into the code -just before emitting code or External Core. -<p> -After desugaring into Core (-ddump-ds), the definition of <tt>f</tt> looks like this: -<pre> - f x = case x of - Nil{d} -> Nil{v} - MkT{d} a b r -> let { p = (a,b); q = I# r } in - $WMkT p (q+1) -</pre> -Notice the way that pattern matching has been desugared to take account of the fact -that the "real" data constructor MkT has three fields. -<p> -By the time the simplifier has had a go at it, <tt>f</tt> will be transformed to: -<pre> - f x = case x of - Nil{d} -> Nil{v} - MkT{d} a b r -> MkT{v} a b (r +# 1#) -</pre> -Which is highly cool.
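The division of labour between wrapper and worker can be approximated in plain Haskell (names here are invented; the real wrapper additionally unboxes the fields into the flat representation):

```haskell
-- RepMkT stands in for the representation constructor MkT{v}.
data T a = RepMkT (a, a) Int deriving (Eq, Show)

-- wMkT stands in for the wrapper $WMkT: it forces both strict
-- arguments before handing them to the representation constructor.
wMkT :: (a, a) -> Int -> T a
wMkT p q = p `seq` q `seq` RepMkT p q
```
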
- - -<h2> The constructor wrapper functions </h2> - -The wrapper functions are automatically generated by GHC, and are -really emitted into the result code (albeit only after CorePrep; see -<tt>CorePrep.mkImplicitBinds</tt>). -The wrapper functions are inlined very -vigorously, so you will not see many occurrences of the wrapper -functions in an optimised program, but you may see some. For example, -if your Haskell source has -<pre> - map MkT xs -</pre> -then <tt>$WMkT</tt> will not be inlined (because it is not applied to anything). -That is why we generate real top-level bindings for the wrapper functions, -and generate code for them. - - -<h2> The constructor worker functions </h2> - -Saturated applications of the constructor worker function MkT{v} are -treated specially by the code generator; they really do allocation. -However, we do want a single, shared, top-level definition for -top-level nullary constructors (like True and False). Furthermore, -what if the code generator encounters a non-saturated application of a -worker? E.g. <tt>(map Just xs)</tt>. We could declare that to be an -error (CorePrep should saturate them). But instead we currently -generate a top-level definition for each constructor worker, whether -nullary or not. It takes the form: -<pre> - MkT{v} = \ p q r -> MkT{v} p q r -</pre> -This is a real hack. The occurrence on the RHS is saturated, so the code generator (both the -one that generates abstract C and the byte-code generator) treats it as a special case and -allocates a MkT; it does not make a recursive call! So now there's a top-level curried -version of the worker which is available to anyone who wants it. -<p> -This strange definition is not emitted into External Core. Indeed, you might argue that -we should instead pass the list of <tt>TyCon</tt>s to the code generator and have it -generate magic bindings directly. As it stands, it's a real hack: see the code in -CorePrep.mkImplicitBinds.
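The point of the eta-expanded top-level binding can be illustrated at the source level (helper names invented): a saturated application such as <tt>Just 3</tt> can compile to direct allocation, but an unsaturated use such as <tt>map Just xs</tt> needs the constructor as a genuine function value, which is exactly what a curried top-level binding provides.

```haskell
-- Plays the role of the curried worker binding  MkT{v} = \p q r -> MkT{v} p q r:
-- a first-class function value that merely allocates when finally saturated.
justWorker :: a -> Maybe a
justWorker = \x -> Just x

-- An unsaturated use site: the constructor is passed around, not applied.
wrapped :: [Maybe Int]
wrapped = map justWorker [1, 2, 3]
```
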
- - -<h2> External Core </h2> - -When emitting External Core, we should see this for our running example: - -<pre> - data T a = MkT a a Int# | Nil - - $WMkT :: (a,a) -> Int -> T a - $WMkT p q = case p of - (a,b) -> case q of - I# q' -> MkT a b q' - - f x = case x of - Nil -> Nil - MkT a b r -> MkT a b (r +# 1#) -</pre> -Notice that it makes perfect sense as a program all by itself. Constructors -look like constructors (albeit not identical to the original Haskell ones). -<p> -When reading in External Core, the parser is careful to read it back in just -as it was before it was spat out, namely: -<pre> - data T a = MkT{d} a a Int# | Nil{d} - - $WMkT :: (a,a) -> Int -> T a - $WMkT p q = case p of - (a,b) -> case q of - I# q' -> MkT{v} a b q' - - f x = case x of - Nil{d} -> Nil{v} - MkT{d} a b r -> MkT{v} a b (r +# 1#) -</pre> - - -<h2> Unboxing strict fields </h2> - -If GHC unboxes strict fields (as in the first argument of <tt>MkT</tt> above), -it also transforms -source-language case expressions. Suppose you write this in your Haskell source: -<pre> - case e of - MkT p t -> ..p..t.. -</pre> -GHC will desugar this to the following Core code: -<pre> - case e of - MkT a b t -> let p = (a,b) in ..p..t.. -</pre> -The local let-binding reboxes the pair because it may be mentioned in -the case alternative. This may well be a bad idea, which is why -<tt>-funbox-strict-fields</tt> is an experimental feature. -<p> -It's essential that when importing a type <tt>T</tt> defined in some -external module <tt>M</tt>, GHC knows what representation was used for -that type, and that in turn depends on whether module <tt>M</tt> was -compiled with <tt>-funbox-strict-fields</tt>. So when writing an -interface file, GHC therefore records with each data type whether its -strict fields (if any) should be unboxed. - -<h2> Labels and info tables </h2> - -<em>Quick rough notes: SLPJ March 2003</em>.
-<p> -Every data constructor <tt>C</tt> has two info tables: -<ul> -<li> The static info table (label <tt>C_static_info</tt>), used for statically-allocated constructors. - -<li> The dynamic info table (label <tt>C_con_info</tt>), used for dynamically-allocated constructors. -</ul> -Statically-allocated constructors are not moved by the garbage collector, and therefore have a different closure -type from dynamically-allocated constructors; hence they need -a distinct info table. -Both info tables share the same entry code, but since the entry code is physically juxtaposed with the -info table, it must be duplicated (<tt>C_static_entry</tt> and <tt>C_con_entry</tt> respectively). - - </body> -</html> - diff --git a/docs/comm/the-beast/desugar.html b/docs/comm/the-beast/desugar.html deleted file mode 100644 index a66740259b..0000000000 --- a/docs/comm/the-beast/desugar.html +++ /dev/null @@ -1,156 +0,0 @@ -<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN"> -<html> - <head> - <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1"> - <title>The GHC Commentary - Sugar Free: From Haskell To Core</title> - </head> - - <body BGCOLOR="FFFFFF"> - <h1>The GHC Commentary - Sugar Free: From Haskell To Core</h1> - <p> - Up until after type checking, GHC keeps the source program in an - abstract representation of Haskell source without removing any of the - syntactic sugar (such as list comprehensions) that could easily be - represented by more primitive Haskell. This complicates part of the - front-end considerably as the abstract syntax of Haskell (as exported by - the module <a - href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/hsSyn/HsSyn.lhs"><code>HsSyn</code></a>) - is much more complex than a simplified representation close to, say, the - <a href="http://haskell.org/onlinereport/intro.html#sect1.2">Haskell - Kernel</a> would be.
However, having a representation that is as close - as possible to the surface syntax simplifies the generation of clear - error messages. As GHC (quite in contrast to "conventional" compilers) - prints code fragments as part of error messages, the choice of - representation is especially important. - <p> - Nonetheless, as soon as the input has passed all static checks, it is - transformed into GHC's principal intermediate language that goes by the - name of <em>Core</em> and whose representation is exported by the - module <a - href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/coreSyn/CoreSyn.lhs"><code>CoreSyn</code></a>. - All following compiler phases, except code generation, operate on Core. - Due to Andrew Tolmach's effort, there is also an <a - href="http://www.haskell.org/ghc/docs/papers/core.ps.gz">external - representation for Core.</a> - <p> - The conversion of the compiled module from <code>HsSyn</code> into that - of <code>CoreSyn</code> is performed by a phase called the - <em>desugarer</em>, which is located in - <a - href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/deSugar/"><code>fptools/ghc/compiler/deSugar/</code></a>. - Its operation is detailed in the following. - </p> - - <h2>Auxiliary Functions</h2> - <p> - The module <a - href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/deSugar/DsMonad.lhs"><code>DsMonad</code></a> - defines the desugarer monad (of type <code>DsM</code>) which maintains - the environment needed for desugaring. In particular, it encapsulates a - unique supply for generating new variables, a map to look up standard - names (such as functions from the prelude), a source location for error - messages, and a pool to collect warning messages generated during - desugaring.
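A monad of the kind just described, threading a unique supply and collecting warnings, can be sketched in a few lines (all names invented, and much simpler than GHC's real <code>DsM</code>):

```haskell
-- State threaded through desugaring: fresh uniques and collected warnings.
data DsState = DsState
  { dsUniques  :: [Int]     -- infinite supply of fresh uniques
  , dsWarnings :: [String]  -- warnings collected so far, newest first
  }

newtype DsM a = DsM { runDsM :: DsState -> (a, DsState) }

instance Functor DsM where
  fmap f (DsM g) = DsM $ \s -> let (a, s') = g s in (f a, s')

instance Applicative DsM where
  pure a = DsM $ \s -> (a, s)
  DsM mf <*> DsM ma = DsM $ \s ->
    let (f, s1) = mf s
        (a, s2) = ma s1
    in (f a, s2)

instance Monad DsM where
  DsM ma >>= k = DsM $ \s -> let (a, s1) = ma s in runDsM (k a) s1

-- Generate a fresh unique, e.g. to name a new temporary variable.
newUnique :: DsM Int
newUnique = DsM $ \s -> case dsUniques s of
  (u : us) -> (u, s { dsUniques = us })
  []       -> error "unique supply exhausted"

-- Record a warning for later reporting.
warnDs :: String -> DsM ()
warnDs w = DsM $ \s -> ((), s { dsWarnings = w : dsWarnings s })
```
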
Initialisation of the environment happens in the function <a - href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/deSugar/Desugar.lhs"><code>Desugar</code></a><code>.desugar</code>, - which is also the main entry point into the desugarer. - <p> - The generation of Core code often involves the use of standard functions - for which proper identifiers (i.e., values of type <code>Id</code> that - actually refer to the definition in the right Prelude) need to be - obtained. This is supported by the function - <code>DsMonad.dsLookupGlobalValue :: Name -> DsM Id</code>. - - <h2><a name="patmat">Pattern Matching</a></h2> - <p> - Nested pattern matching with guards and everything is translated into - the simple, flat case expressions of Core by the following modules: - <dl> - <dt><a - href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/deSugar/Match.lhs"><code>Match</code></a>: - <dd>This module contains the main pattern-matching compiler in the form - of a function called <code>match</code>. There is some documentation - as to how <code>match</code> works contained in the module itself. - Generally, the implemented algorithm is similar to the one described - in Phil Wadler's Chapter ? of Simon Peyton Jones' <em>The - Implementation of Functional Programming Languages</em>. - <code>Match</code> exports a couple of functions whose names are not - entirely intuitive. In particular, it exports <code>match</code>, - <code>matchWrapper</code>, <code>matchExport</code>, and - <code>matchSimply</code>. The function <code>match</code>, which is - the main work horse, is only used by the other matching modules. The - function <code>matchExport</code> - despite its name - is merely used - internally in <code>Match</code> and handles warning messages (see - below for more details).
The actual interface to the outside is - <code>matchWrapper</code>, which converts the output of the type - checker into the form needed by the pattern matching compiler (i.e., a - list of <code>EquationInfo</code>). Similar in function to - <code>matchWrapper</code> is <code>matchSimply</code>, which provides - an interface for the case where a single expression is to be matched - against a single pattern (as, for example, is the case in bindings in - a <code>do</code> expression). - <dt><a - href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/deSugar/MatchCon.lhs"><code>MatchCon</code></a>: - <dd>This module generates code for a set of alternative constructor - patterns that belong to a single type by means of the routine - <code>matchConFamily</code>. More precisely, the routine gets a set - of equations where the left-most pattern of each equation is a - constructor pattern with a head symbol from the same type as that of - all the other equations. A Core case expression is generated that - distinguishes between all these constructors. The routine is clever - enough to generate a sparse case expression and to add a catch-all - default case only when needed (i.e., if the case expression isn't - exhaustive already). There is also an explanation at the start of the - module. - <dt><a - href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/deSugar/MatchLit.lhs"><code>MatchLit</code></a>: - <dd>Generates code for a set of alternative literal patterns by means of - the routine <code>matchLiterals</code>. The principle is similar to - that of <code>matchConFamily</code>, but all left-most patterns are - literals of the same type.
- <dt><a - href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/deSugar/DsUtils.lhs"><code>DsUtils</code></a>: - <dd>This module provides a set of auxiliary definitions as well as the - data types <code>EquationInfo</code> and <code>MatchResult</code> that - form the input and output, respectively, of the pattern matching - compiler. - <dt><a - href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/deSugar/Check.lhs"><code>Check</code></a>: - <dd>This module does not really contribute to compiling pattern - matching, but it inspects sets of equations to find whether there are - any overlapping patterns or non-exhaustive pattern sets. This task is - implemented by the function <code>check</code>, which returns a list of - patterns that are part of a non-exhaustive case distinction as well as a - set of equation labels that can be reached during execution of the code; - thus, the remaining equations are shadowed due to overlapping patterns. - The function <code>check</code> is invoked and its result converted into - suitable warning messages by the function <code>Match.matchExport</code> - (which is a wrapper for <code>Match.match</code>). - </dl> - <p> - The central function <code>match</code>, given a set of equations, - proceeds in a number of steps: - <ol> - <li>It starts by desugaring the left-most pattern of each equation using - the function <code>tidy1</code> (indirectly via - <code>tidyEqnInfo</code>). During this process, non-elementary - patterns (e.g., those using explicit list syntax <code>[x, y, ..., - z]</code>) are converted to standard constructor patterns, and - irrefutable patterns are removed. - <li>Then, a process called <em>unmixing</em> clusters the equations into - blocks (without re-ordering them), such that the left-most patterns of - all equations in a block are either all variables, all literals, or - all constructors.
- <li>Each block is, then, compiled by <code>matchUnmixedEqns</code>, - which forwards the handling of literal pattern blocks to - <code>MatchLit.matchLiterals</code>, of constructor pattern blocks to - <code>MatchCon.matchConFamily</code>, and hands variable pattern - blocks back to <code>match</code>. - </ol> - - <p><hr><small> -<!-- hhmts start --> -Last modified: Mon Feb 11 22:35:25 EST 2002 -<!-- hhmts end --> - </small> - </body> -</html> diff --git a/docs/comm/the-beast/driver.html b/docs/comm/the-beast/driver.html deleted file mode 100644 index fbf65e33e7..0000000000 --- a/docs/comm/the-beast/driver.html +++ /dev/null @@ -1,179 +0,0 @@ -<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN"> -<html> - <head> - <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1"> - <title>The GHC Commentary - The Glorious Driver</title> - </head> - - <body BGCOLOR="FFFFFF"> - <h1>The GHC Commentary - The Glorious Driver</h1> - <p> - The Glorious Driver (GD) is the part of GHC that orchestrates the - interaction of all the other pieces that make up GHC. It supersedes the - <em>Evil Driver (ED),</em> which was a Perl script that served the same - purpose and was in use until version 4.08.1 of GHC. Simon Marlow - eventually slayed the ED and instated the GD. The GD is usually called - the <em>Compilation Manager</em> these days. - </p> - <p> - The GD has been substantially extended for GHCi, i.e., the interactive - variant of GHC that integrates the compiler with a (meta-circular) - interpreter since version 5.00. Most of the driver is located in the - directory - <a - href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/main/"><code>fptools/ghc/compiler/main/</code></a>. - </p> - - <h2>Command Line Options</h2> - <p> - GHC's many flavours of command line options make the code interpreting - them rather involved. The following provides a brief overview of the - processing of these options. 
Since the addition of the interactive - front-end to GHC, there are two kinds of options: <em>static - options</em> and <em>dynamic options.</em> The former can only be set - when the system is invoked, whereas the latter can be altered in the - course of an interactive session. A brief explanation of the difference - between these options and related matters is at the start of the module - <a - href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/main/CmdLineOpts.lhs"><code>CmdLineOpts</code></a>. - The same module defines the enumeration <code>DynFlag</code>, which - contains all dynamic flags. Moreover, there is the labelled record - <code>DynFlags</code> that collects all the flag-related information - that is passed by the compilation manager to the compiler proper, - <code>hsc</code>, whenever a compilation is triggered. If you would like to - find out whether an option is static, use the predicate - <code>isStaticHscFlag</code> in the same module. - <p> - The second module that contains a lot of code related to the management - of flags is <a - href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/main/DriverFlags.hs"><code>DriverFlags.hs</code></a>. - In particular, the module contains two association lists that map the - textual representation of the various flags to a data structure that - tells the driver how to parse the flag (e.g., whether it has any - arguments) and provides its internal representation. All static flags - are contained in <code>static_flags</code>. A whole range of - <code>-f</code> flags can be negated by adding a <code>-f-no-</code> - prefix. These flags are contained in the association list - <code>fFlags</code>.
- <p> - The driver uses a nasty hack based on <code>IORef</code>s that permits - the rest of the compiler to access static flags as CAFs; i.e., there is - a family of toplevel variable definitions in - <a - href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/main/CmdLineOpts.lhs"><code>CmdLineOpts</code></a>, - below the literate section heading <i>Static options</i>, each of which - contains the value of one static option. This is essentially realised - via global variables (in the sense of C-style, updatable, global - variables) defined via an evil pre-processor macro named - <code>GLOBAL_VAR</code>, which is defined in a particularly ugly corner - of GHC, namely the C header file - <a - href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/HsVersions.h"><code>HsVersions.h</code></a>. - - <h2>What Happens When</h2> - <p> - Inside the Haskell compiler proper (<code>hsc</code>), a whole series of - stages (``passes'') are executed in order to transform your Haskell program - into C or native code. This process is orchestrated by - <code>main/HscMain.hscMain</code> and its relative - <code>hscReComp</code>. The latter directly invokes, in order, - the parser, the renamer, the typechecker, the desugarer, the - simplifier (Core2Core), the CoreTidy pass, the CorePrep pass, - conversion to STG (CoreToStg), the interface generator - (MkFinalIface), the code generator, and code output. The - simplifier is the most complex of these, and is made up of many - sub-passes. These are controlled by <code>buildCoreToDo</code>, - as described below. - - <h2>Scheduling Optimisations Phases</h2> - <p> - GHC has a large variety of optimisations at its disposal, many of which - have subtle interdependencies. The overall plan for program - optimisation is fixed in <a - href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/main/DriverState.hs"><code>DriverState.hs</code></a>. 
- First of all, there is the variable <code>hsc_minusNoO_flags</code> that - determines the <code>-f</code> options that you get without - <code>-O</code> (aka optimisation level 0) as well as - <code>hsc_minusO_flags</code> and <code>hsc_minusO2_flags</code> for - <code>-O</code> and <code>-O2</code>. - <p> - However, most of the strategic decisions about optimisations on the - intermediate language Core are encoded in the value produced by - <code>buildCoreToDo</code>, which is a list with elements of type - <code>CoreToDo</code>. Each element of this list specifies one step in - the sequence of core optimisations executed by the <a - href="simplifier.html">Mighty Simplifier</a>. The type - <code>CoreToDo</code> is defined in <a - href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/main/CmdLineOpts.lhs"><code>CmdLineOpts.lhs</code></a>. - The actual execution of the optimisation plan produced by - <code>buildCoreToDo</code> is performed by <a - href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/simplCore/SimplCore.lhs"><code>SimplCore</code></a><code>.doCorePasses</code>. - Core optimisation plans consist of a number of simplification phases - (currently, three for optimisation levels of 1 or higher) with - decreasing phase numbers (the last phase being phase 0). - Before and after these phases, optimisations such as - specialisation, let floating, worker/wrapper, and so on are executed. - The sequence of phases is such that the synergistic effect of the phases - is maximised -- however, this is a fairly fragile arrangement. - <p> - There is a similar construction for optimisations on STG level stored in - the variable <code>buildStgToDo :: [StgToDo]</code>. However, this is a - lot less complex than the arrangement for Core optimisations.
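The shape of such an optimisation plan can be rendered as a toy model (the constructors and pass list below are invented for illustration, not GHC's real <code>CoreToDo</code>): a plan is just a list of passes, with simplifier phases numbered downwards so that phase 0 runs last.

```haskell
data CoreToDo
  = CoreDoSimplify Int   -- a simplifier run at the given phase number
  | CoreDoSpecialise
  | CoreDoFloatOutwards
  | CoreDoWorkerWrapper
  deriving (Eq, Show)

-- Build a plan from an optimisation level (illustrative only).
buildCoreToDo :: Int -> [CoreToDo]
buildCoreToDo 0 = [CoreDoSimplify 0]
buildCoreToDo _ =
  [ CoreDoSpecialise
  , CoreDoSimplify 2
  , CoreDoSimplify 1
  , CoreDoFloatOutwards
  , CoreDoWorkerWrapper
  , CoreDoSimplify 0
  ]

-- The phase numbers of the simplifier runs, in execution order.
simplPhases :: Int -> [Int]
simplPhases lvl = [n | CoreDoSimplify n <- buildCoreToDo lvl]
```
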
- - <h2>Linking the <code>RTS</code> and <code>libHSstd</code></h2> - <p> - Since the RTS and HSstd refer to each other, there is a Cunning - Hack to avoid putting them each on the command-line twice or - thrice (aside: try asking for `plaice and chips thrice' in a - fish and chip shop; bet you only get two lots). The hack involves - adding - the symbols that the RTS needs from libHSstd, such as - <code>PrelWeak_runFinalizzerBatch_closure</code> and - <code>__stginit_Prelude</code>, to the link line with the - <code>-u</code> flag. The standard library appears before the - RTS on the link line, and these options cause the corresponding - symbols to be picked up even though the linker might not yet have seen them - being used, as the RTS appears later on the link line. As a result, - when the RTS is also scanned, these symbols are already resolved. This - avoids the linker having to read the standard library and RTS - multiple times. - </p> - <p> - This does, however, lead to a complication. Normal Haskell - programs do not have a <code>main()</code> function, so this is - supplied by the RTS (in the file - <a href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/rts/Main.c"><code>Main.c</code></a>). - It calls <code>startupHaskell</code>, which - itself calls <code>__stginit_PrelMain</code>, which is therefore, - since it occurs in the standard library, one of the symbols - passed to the linker using the <code>-u</code> option. This is fine - for standalone Haskell programs, but as soon as the Haskell code is only - used as part of a program implemented in a foreign language, the - <code>main()</code> function of that foreign language should be used - instead of that of the Haskell runtime. In this case, the previously - described arrangement unfortunately fails as - <code>__stginit_PrelMain</code> had better not be linked in, - because it tries to call <code>__stginit_Main</code>, which won't - exist.
In other words, the RTS's <code>main()</code> refers to - <code>__stginit_PrelMain</code> which in turn refers to - <code>__stginit_Main</code>. Although the RTS's <code>main()</code> - might not be linked in if the program provides its own, the driver - will normally force <code>__stginit_PrelMain</code> to be linked in anyway, - using <code>-u</code>, because it's a back-reference from the - RTS to HSstd. This case is coped with by the <code>-no-hs-main</code> - flag, which suppresses passing the corresponding <code>-u</code> option - to the linker -- although in some versions of the compiler (e.g., 5.00.2) - it didn't work. In addition, the driver generally places the C program - providing the <code>main()</code> that we want to use before the RTS - on the link line. Therefore, the RTS's main is never used and - without the <code>-u</code> the label <code>__stginit_PrelMain</code> - will not be linked. - </p> - - <p><small> -<!-- hhmts start --> -Last modified: Tue Feb 19 11:09:00 UTC 2002 -<!-- hhmts end --> - </small> - </body> -</html> diff --git a/docs/comm/the-beast/fexport.html b/docs/comm/the-beast/fexport.html deleted file mode 100644 index 956043bafb..0000000000 --- a/docs/comm/the-beast/fexport.html +++ /dev/null @@ -1,231 +0,0 @@ -<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN"> -<html> - <head> - <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1"> - <title>The GHC Commentary - foreign export</title> - </head> - - <body BGCOLOR="FFFFFF"> - <h1>The GHC Commentary - foreign export</h1> - - The implementation scheme for foreign export, as of 27 Feb 02, is - as follows. There are four cases, of which the first two are easy. - <p> - <b>(1) static export of an IO-typed function from some module <code>MMM</code></b> - <p> - <code>foreign export foo :: Int -> Int -> IO Int</code> - <p> - For this we generate no Haskell code.
However, a C stub is - generated, and it looks like this: - <p> - <pre> -extern StgClosure* MMM_foo_closure; - -HsInt foo (HsInt a1, HsInt a2) -{ - SchedulerStatus rc; - HaskellObj ret; - rc = rts_evalIO( - rts_apply(rts_apply(MMM_foo_closure,rts_mkInt(a1)), - rts_mkInt(a2) - ), - &ret - ); - rts_checkSchedStatus("foo",rc); - return(rts_getInt(ret)); -} -</pre> - <p> - This does the obvious thing: builds in the heap the expression - <code>(foo a1 a2)</code>, calls <code>rts_evalIO</code> to run it, - and uses <code>rts_getInt</code> to fish out the result. - - <p> - <b>(2) static export of a non-IO-typed function from some module <code>MMM</code></b> - <p> - <code>foreign export foo :: Int -> Int -> Int</code> - <p> - This is identical to case (1), with the sole difference that the - stub calls <code>rts_eval</code> rather than - <code>rts_evalIO</code>. - <p> - - <b>(3) dynamic export of an IO-typed function from some module <code>MMM</code></b> - <p> - <code>foreign export mkCallback :: (Int -> Int -> IO Int) -> IO (FunPtr a)</code> - <p> - Dynamic exports are a whole lot more complicated than their static - counterparts. - <p> - First of all, we get some Haskell code, which, when given a - function <code>callMe :: (Int -> Int -> IO Int)</code> to be made - C-callable, IO-returns a <code>FunPtr a</code>, which is the - address of the resulting C-callable code. This address can now be - handed out to the C-world, and callers to it will get routed - through to <code>callMe</code>. - <p> - The generated Haskell function looks like this: - <p> -<pre> -mkCallback f - = do sp <- mkStablePtr f - r <- ccall "createAdjustorThunk" sp (&"run_mkCallback") - return r -</pre> - <p> - <code>createAdjustorThunk</code> is a gruesome, - architecture-specific function in the RTS. 
It takes a stable - pointer to the Haskell function to be run, and the address of the - associated C wrapper, and returns a piece of machine code, - which, when called from the outside (C) world, eventually calls - through to <code>f</code>. - <p> - This machine code fragment is called the "Adjustor Thunk" (don't - ask me why). What it does is simply to call onwards to the C - helper - function <code>run_mkCallback</code>, passing all the args given - to it but also conveying <code>sp</code>, which is a stable - pointer - to the Haskell function to run. So: - <p> -<pre> -createAdjustorThunk ( StablePtr sp, CCodeAddress addr_of_helper_C_fn ) -{ - create malloc'd piece of machine code "mc", behaving thusly: - - mc ( args_to_mc ) - { - jump to addr_of_helper_C_fn, passing sp as an additional - argument - } -} -</pre> - <p> - This is a horrible hack, because there is no portable way, even at - the machine code level, to write a function which adds one argument and - then transfers onwards to another C function. On x86s args are - pushed R to L onto the stack, so we can just push <code>sp</code>, - fiddle around with return addresses, and jump onwards to the - helper C function. However, on architectures which use register - windows and/or pass args extensively in registers (Sparc, Alpha, - MIPS, IA64), this scheme borders on the unviable. GHC has a - limited <code>createAdjustorThunk</code> implementation for Sparc - and Alpha, which handles only the cases where all args, including - the extra one, fit in registers. - <p> - Anyway: the other lump of code generated as a result of a - f-x-dynamic declaration is the C helper stub. This is basically - the same as in the static case, except that it only ever gets - called from the adjustor thunk, and therefore must accept - as an extra argument, a stable pointer to the Haskell function - to run, naturally enough, as this is not known until run-time.
- It then dereferences the stable pointer and does the call in - the same way as the f-x-static case: -<pre> -HsInt Main_d1kv ( StgStablePtr the_stableptr, - void* original_return_addr, - HsInt a1, HsInt a2 ) -{ - SchedulerStatus rc; - HaskellObj ret; - rc = rts_evalIO( - rts_apply(rts_apply((StgClosure*)deRefStablePtr(the_stableptr), - rts_mkInt(a1) - ), - rts_mkInt(a2) - ), - &ret - ); - rts_checkSchedStatus("Main_d1kv",rc); - return(rts_getInt(ret)); -} -</pre> - <p> - Note how this function has a purely made-up name - <code>Main_d1kv</code>, since unlike the f-x-static case, this - function is never called from user code, only from the adjustor - thunk. - <p> - Note also how the function takes a bogus parameter - <code>original_return_addr</code>, which is part of this extra-arg - hack. The usual scheme is to leave the original caller's return - address in place and merely push the stable pointer above that, - hence the spare parameter. - <p> - Finally, there is some extra trickery, detailed in - <code>ghc/rts/Adjustor.c</code>, to get round the following - problem: the adjustor thunk lives in mallocville. It is - quite possible that the Haskell code will actually - call <code>free()</code> on the adjustor thunk used to get to it - -- because otherwise there is no way to reclaim the space used - by the adjustor thunk. That's all very well, but it means that - the C helper cannot return to the adjustor thunk in the obvious - way, since we've already given it back using <code>free()</code>. - So we leave, on the C stack, the address of whoever called the - adjustor thunk, and before calling the helper, mess with the stack - such that when the helper returns, it returns directly to the - adjustor thunk's caller. - <p> - That's how the <code>stdcall</code> convention works. 
If the - adjustor thunk has been called using the <code>ccall</code> - convention, we return indirectly, via a statically-allocated - yet-another-magic-piece-of-code, which takes care of removing the - extra argument that the adjustor thunk pushed onto the stack. - This is needed because in <code>ccall</code>-world, it is the - caller who removes args after the call, and the original caller of - the adjustor thunk has no way to know about the extra arg pushed - by the adjustor thunk. - <p> - You didn't really want to know all this stuff, did you? - <p> - - - - <b>(4) dynamic export of an non-IO-typed function from some module <code>MMM</code></b> - <p> - <code>foreign export mkCallback :: (Int -> Int -> Int) -> IO (FunPtr a)</code> - <p> - (4) relates to (3) as (2) relates to (1), that is, it's identical, - except the C stub uses <code>rts_eval</code> instead of - <code>rts_evalIO</code>. - <p> - - - <h2>Some perspective on f-x-dynamic</h2> - - The only really horrible problem with f-x-dynamic is how the - adjustor thunk should pass to the C helper the stable pointer to - use. Ideally we would like this to be conveyed via some invisible - side channel, since then the adjustor thunk could simply jump - directly to the C helper, with no non-portable stack fiddling. - <p> - Unfortunately there is no obvious candidate for the invisible - side-channel. We've chosen to pass it on the stack, with the - bad consequences detailed above. Another possibility would be to - park it in a global variable, but this is non-reentrant and - non-(OS-)thread-safe. A third idea is to put it into a callee-saves - register, but that has problems too: the C helper may not use that - register and therefore we will have trashed any value placed there - by the caller; and there is no C-level portable way to read from - the register inside the C helper. - <p> - In short, we can't think of a really satisfactory solution. 
I'd - vote for introducing some kind of OS-thread-local-state and passing - it in there, but that introduces complications of its own. - <p> - <b>OS-thread-safety</b> is of concern in the C stubs, whilst - building up the expressions to run. These need to have exclusive - access to the heap whilst allocating in it. Also, there needs to - be some guarantee that no GC will happen in between the - <code>deRefStablePtr</code> call and when <code>rts_eval[IO]</code> - starts running. At the moment there are no guarantees for - either property. This needs to be sorted out before the - implementation can be regarded as fully safe to use. - -<p><small> - -<!-- hhmts start --> -Last modified: Weds 27 Feb 02 -<!-- hhmts end --> - </small> - </body> -</html> diff --git a/docs/comm/the-beast/ghci.html b/docs/comm/the-beast/ghci.html deleted file mode 100644 index b893acdeb4..0000000000 --- a/docs/comm/the-beast/ghci.html +++ /dev/null @@ -1,407 +0,0 @@ -<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN"> -<html> - <head> - <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1"> - <title>The GHC Commentary - GHCi</title> - </head> - - <body BGCOLOR="FFFFFF"> - <h1>The GHC Commentary - GHCi</h1> - - This isn't a coherent description of how GHCi works, sorry. What - it is (currently) is a dumping ground for various bits of info - pertaining to GHCi, which ought to be recorded somewhere. - - <h2>Debugging the interpreter</h2> - - The usual symptom is that some expression / program crashes when - running on the interpreter (commonly), or gets weird results - (rarely). Unfortunately, finding out what the problem really is - has proven to be extremely difficult. In retrospect it may be - argued a design flaw that GHC's implementation of the STG - execution mechanism provides only the weakest of support for - automated internal consistency checks. This makes it hard to - debug.
- <p> - Execution failures in the interactive system can be due to - problems with the bytecode interpreter, problems with the bytecode - generator, or problems elsewhere. From the bugs seen so far, - the bytecode generator is often the culprit, with the interpreter - usually being correct. - <p> - Here are some tips for tracking down interactive nonsense: - <ul> - <li>Find the smallest source fragment which causes the problem. - <p> - <li>Using an RTS compiled with <code>-DDEBUG</code> (nb, that - means the RTS from the previous stage!), run with <code>+RTS - -D2</code> to get a listing in great detail from the - interpreter. Note that the listing is so voluminous that - this is impractical unless you have been diligent in - the previous step. - <p> - <li>At least in principle, using the trace and a bit of GDB - poking around at the time of death, you can figure out what - the problem is. In practice you quickly get depressed at - the hopelessness of ever making sense of the mass of - details. Well, I do, anyway. - <p> - <li><code>+RTS -D2</code> tries hard to print useful - descriptions of what's on the stack, and often succeeds. - However, it has no way to map addresses to names in - code/data loaded by our runtime linker. So the C function - <code>ghci_enquire</code> is provided. Given an address, it - searches the loaded symbol tables for symbols close to that - address. 
You can run it from inside GDB: - <pre> - (gdb) p ghci_enquire ( 0x50a406f0 ) - 0x50a406f0 + -48 == `PrelBase_Czh_con_info' - 0x50a406f0 + -12 == `PrelBase_Izh_static_info' - 0x50a406f0 + -48 == `PrelBase_Czh_con_entry' - 0x50a406f0 + -24 == `PrelBase_Izh_con_info' - 0x50a406f0 + 16 == `PrelBase_ZC_con_entry' - 0x50a406f0 + 0 == `PrelBase_ZMZN_static_entry' - 0x50a406f0 + -36 == `PrelBase_Czh_static_entry' - 0x50a406f0 + -24 == `PrelBase_Izh_con_entry' - 0x50a406f0 + 64 == `PrelBase_EQ_static_info' - 0x50a406f0 + 0 == `PrelBase_ZMZN_static_info' - 0x50a406f0 + 48 == `PrelBase_LT_static_entry' - $1 = void - </pre> - In this case the enquired-about address is - <code>PrelBase_ZMZN_static_entry</code>. If no symbols are - close to the given addr, nothing is printed. Not a great - mechanism, but better than nothing. - <p> - <li>We have had various problems in the past due to the bytecode - generator (<code>compiler/ghci/ByteCodeGen.lhs</code>) being - confused about the true set of free variables of an - expression. The compilation scheme for <code>let</code>s - applies the BCO for the RHS of the let to its free - variables, so if the free-var annotation is wrong or - misleading, you end up with code which has wrong stack - offsets, which is usually fatal. - <p> - <li>The baseline behaviour of the interpreter is to interpret - BCOs, and hand all other closures back to the scheduler for - evaluation. However, this causes a huge number of expensive - context switches, so the interpreter knows how to enter the - most common non-BCO closure types by itself. - <p> - These optimisations complicate the interpreter. - If you think you have an interpreter problem, re-enable the - define <code>REFERENCE_INTERPRETER</code> in - <code>ghc/rts/Interpreter.c</code>. All optimisations are - thereby disabled, giving the baseline - I-only-know-how-to-enter-BCOs behaviour. 
- <p> - <li>Following the traces is often problematic because execution - hops back and forth between the interpreter, which is - traced, and compiled code, which you can't see. - Particularly annoying is when the stack looks OK in the - interpreter, then compiled code runs for a while, and later - we arrive back in the interpreter, with the stack corrupted, - and usually in a completely different place from where we - left off. - <p> - If this is biting you baaaad, it may be worth copying - sources for the compiled functions causing the problem, into - your interpreted module, in the hope that you stay in the - interpreter more of the time. Of course this doesn't work - very well if you've defined - <code>REFERENCE_INTERPRETER</code> in - <code>ghc/rts/Interpreter.c</code>. - <p> - <li>There are various commented-out pieces of code in - <code>Interpreter.c</code> which can be used to get the - stack sanity-checked after every entry, and even after - every bytecode instruction executed. Note that some - bytecodes (<code>PUSH_UBX</code>) leave the stack in - an unwalkable state, so the <code>do_print_stack</code> - local variable is used to suppress the stack walk after - them. - </ul> - - - <h2>Useful stuff to know about the interpreter</h2> - - The code generation scheme is straightforward (naive, in fact). - <code>-ddump-bcos</code> prints each BCO along with the Core it - was generated from, which is very handy. - <ul> - <li>Simple lets are compiled in-line. For the general case, let - v = E in ..., E is compiled into a new BCO which takes as - args its free variables, and v is bound to AP(the new BCO, - free vars of E). - <p> - <li><code>case</code>s as usual, become: push the return - continuation, enter the scrutinee. There is some magic to - make all combinations of compiled/interpreted calls and - returns work, described below.
In the interpreted case, all - case alts are compiled into a single big return BCO, which - commences with instructions implementing a switch tree. - </ul> - <p> - <b>ARGCHECK magic</b> - <p> - You may find ARGCHECK instructions at the start of BCOs which - don't appear to need them; case continuations in particular. - These play an important role: they force objects which should - be evaluated to BCOs to actually be BCOs. - <p> - Typically, there may be an application node somewhere in the heap. - This is a thunk which when leant on turns into a BCO for a return - continuation. The thunk may get entered with an update frame on - top of the stack. This is legitimate since from one viewpoint - this is an AP which simply reduces to a data object, so does not - have functional type. However, once the AP turns itself into a - BCO (so to speak) we cannot simply enter the BCO, because that - expects to see args on top of the stack, not an update frame. - Therefore any BCO which expects something on the stack above an - update frame, even a non-function BCO, starts with an ARGCHECK. In - this case it fails, the update is done, the update frame is - removed, and the BCO re-entered. Subsequent entries of the BCO of - course go unhindered. - <p> - The optimised interpreter (<code>#undef REFERENCE_INTERPRETER</code>) handles - this case specially, so that a trip through the scheduler is - avoided. When reading traces from <code>+RTS -D2 -RTS</code>, you - may see BCOs which appear to execute their initial ARGCHECK insn - twice. The first time it fails; the interpreter does the update - immediately and re-enters with no further comment. - <p> - This is all a bit ugly, and, as SimonM correctly points out, it - would have been cleaner to make BCOs unpointed (unthunkable) - objects, so that a pointer to something <code>:: BCO#</code> - really points directly at a BCO.
- <p> - <b>Stack management</b> - <p> - There isn't any attempt to stub the stack, minimise its growth, or - generally remove unused pointers ahead of time. This is really - due to laziness on my part, although it does have the minor - advantage that doing something cleverer would almost certainly - increase the number of bytecodes that would have to be executed. - Of course we SLIDE out redundant stuff, to get the stack back to - the sequel depth, before returning an HNF, but that's all. As - usual this is probably a cause of major space leaks. - <p> - <b>Building constructors</b> - <p> - Constructors are built on the stack and then dumped into the heap - with a single PACK instruction, which simply copies the top N - words of the stack verbatim into the heap, adds an info table, and zaps N - words from the stack. The constructor args are pushed onto the - stack one at a time. One upshot of this is that unboxed values - get pushed untaggedly onto the stack (via PUSH_UBX), because that's how they - will be in the heap. That in turn means that the stack is not - always walkable at arbitrary points in BCO execution, although - naturally it is whenever GC might occur. - <p> - Function closures created by the interpreter use the AP-node - (tagged) format, so although their fields are similarly - constructed on the stack, there is never a stack walkability - problem. - <p> - <b>Unpacking constructors</b> - <p> - At the start of a case continuation, the returned constructor is - unpacked onto the stack, which means that unboxed fields have to - be tagged. Rather than burdening all such continuations with a - complex, general mechanism, I split it into two. The - allegedly-common all-pointers case uses a single UNPACK insn - to fish out all fields with no further ado. The slow case uses a - sequence of more complex UPK_TAG insns, one for each field (I - think). This seemed like a good compromise to me.
- <p> - <b>Perspective</b> - <p> - I designed the bytecode mechanism with the experience of both STG - hugs and Classic Hugs in mind. The latter has a small - set of bytecodes, a small interpreter loop, and runs amazingly - fast considering the cruddy code it has to interpret. The former - had a large interpretative loop with many different opcodes, - including multiple minor variants of the same thing, which - made it difficult to optimise and maintain, yet it performed more - or less comparably with Classic Hugs. - <p> - My design aims were therefore to minimise the interpreter's - complexity whilst maximising performance. This means reducing the - number of opcodes implemented, whilst reducing the number of insns - despatched. In particular there are only two opcodes, PUSH_UBX - and UPK_TAG, which deal with tags. STG Hugs had dozens of opcodes - for dealing with tagged data. In cases where the common - all-pointers case is significantly simpler (UNPACK) I deal with it - specially. Finally, the number of insns executed is reduced a - little by merging multiple pushes, giving PUSH_LL and PUSH_LLL. - These opcode pairings were determined by using the opcode-pair - frequency profiling stuff which is ifdef-d out in - <code>Interpreter.c</code>. These significantly improve - performance without having much effect on the ugliness or - complexity of the interpreter. - <p> - Overall, the interpreter design is something which turned out - well, and I was pleased with it. Unfortunately I cannot say the - same of the bytecode generator. - - <h2><code>case</code> returns between interpreted and compiled code</h2> - - Variants of the following scheme have been drifting around in GHC - RTS documentation for several years. Since what follows is - actually what is implemented, I guess it supersedes all other - documentation. Beware; the following may make your brain melt. - In all the pictures below, the stack grows downwards. - <p> - <b>Returning to interpreted code</b>.
- <p> - Interpreted returns employ a set of polymorphic return infotables. - Each element in the set corresponds to one of the possible return - registers (R1, D1, F1) that compiled code will place the returned - value in. In fact this is a bit misleading, since R1 can be used - to return either a pointer or an int, and we need to distinguish - these cases. So, supposing the set of return registers is {R1p, - R1n, D1, F1}, there would be four corresponding infotables, - <code>stg_ctoi_ret_R1p_info</code>, etc. In the pictures below we - call them <code>stg_ctoi_ret_REP_info</code>. - <p> - These return itbls are polymorphic, meaning that all 8 vectored - return codes and the direct return code are identical. - <p> - Before the scrutinee is entered, the stack is arranged like this: - <pre> - | | - +--------+ - | BCO | -------> the return continuation BCO - +--------+ - | itbl * | -------> stg_ctoi_ret_REP_info, with all 9 codes as follows: - +--------+ - BCO* bco = Sp[1]; - push R1/F1/D1 depending on REP - push bco - yield to sched - </pre> - On entry, the interpreted continuation BCO expects the stack to look - like this: - <pre> - | | - +--------+ - | BCO | -------> the return continuation BCO - +--------+ - | itbl * | -------> stg_ctoi_ret_REP_info, with all 9 codes as follows: - +--------+ - : VALUE : (the returned value, shown with : since it may occupy - +--------+ multiple stack words) - </pre> - A machine code return will park the returned value in R1/F1/D1, - and enter the itbl on the top of the stack. Since it's our magic - itbl, this pushes the returned value onto the stack, which is - where the interpreter expects to find it. It then pushes the BCO - (again) and yields. The scheduler removes the BCO from the top, - and enters it, so that the continuation is interpreted with the - stack as shown above. - <p> - An interpreted return will create the value to return at the top - of the stack.
It then examines the return itbl, which must be - immediately underneath the return value, to see if it is one of - the magic <code>stg_ctoi_ret_REP_info</code> set. Since this is so, - it knows it is returning to an interpreted continuation. It - therefore simply enters the BCO which it assumes is immediately - underneath the itbl on the stack. - - <p> - <b>Returning to compiled code</b>. - <p> - Before the scrutinee is entered, the stack is arranged like this: - <pre> - ptr to vec code 8 ------> return vector code 8 - | | .... - +--------+ ptr to vec code 1 ------> return vector code 1 - | itbl * | -- Itbl end - +--------+ \ .... - \ Itbl start - ----> direct return code - </pre> - The scrutinee value is then entered. - The case continuation(s) expect the stack to look the same, with - the returned HNF in a suitable return register, R1, D1, F1 etc. - <p> - A machine code return knows whether it is doing a vectored or - direct return, and, if the former, which vector element it is. - So, for a direct return we jump to <code>Sp[0]</code>, and for a - vectored return, jump to <code>((CodePtr*)(Sp[0]))[ - ITBL_LENGTH - - vector number ]</code>. This is (of course) the scheme that - compiled code has been using all along. - <p> - An interpreted return will, as described just above, have examined - the itbl immediately beneath the return value it has just pushed, - and found it not to be one of the <code>stg_ctoi_ret_REP_info</code> set, - so it knows this must be a return to machine code. It needs to - pop the return value, currently on the stack, into R1/F1/D1, and - jump through the info table. Unfortunately the first part cannot - be accomplished directly since we are not in Haskellised-C world. - <p> - We therefore employ a second family of magic infotables, indexed, - like the first, on the return representation, and therefore with - names of the form <code>stg_itoc_ret_REP_info</code>. (Note: - <code>itoc</code>; the previous bunch were <code>ctoi</code>).
- This is pushed onto the stack (note, tagged values have their tag - zapped), giving: - <pre> - | | - +--------+ - | itbl * | -------> arbitrary machine code return itbl - +--------+ - : VALUE : (the returned value, possibly multiple words) - +--------+ - | itbl * | -------> stg_itoc_ret_REP_info, with code: - +--------+ - pop myself (stg_itoc_ret_REP_info) off the stack - pop return value into R1/D1/F1 - do standard machine code return to itbl at t.o.s. - </pre> - We then return to the scheduler, asking it to enter the itbl at - t.o.s. When entered, <code>stg_itoc_ret_REP_info</code> removes - itself from the stack, pops the return value into the relevant - return register, and returns to the itbl to which we were trying - to return in the first place. - <p> - Amazingly enough, this stuff all actually works! Well, mostly ... - <p> - <b>Unboxed tuples: a Right Royal Spanner In The Works</b> - <p> - The above scheme depends crucially on having magic infotables - <code>stg_{itoc,ctoi}_ret_REP_info</code> for each return - representation <code>REP</code>. It unfortunately fails miserably - in the face of unboxed tuple returns, because the set of required - tables would be infinite; this despite the fact that for any given - unboxed tuple return type, the scheme could be made to work fine. - <p> - This is a serious problem, because it prevents interpreted - code from doing <code>IO</code>-typed returns, since <code>IO - t</code> is implemented as <code>(# t, RealWorld# #)</code> or - thereabouts. This restriction in turn rules out FFI stuff in the - interpreter. Not good. - <p> - Although we have no way to make general unboxed tuples work, we - can at least make <code>IO</code>-types work using the following - ultra-kludgey observation: <code>RealWorld#</code> doesn't really - exist and so has zero size, in compiled code. In turn this means - that a type of the form <code>(# t, RealWorld# #)</code> has the - same representation as plain <code>t</code> does. 
So the bytecode - generator, whilst rejecting code with general unboxed tuple - returns, recognises and accepts this special case. Which means - that <code>IO</code>-typed stuff works in the interpreter. Just. - <p> - If anyone asks, I will claim I was out of radio contact, on a - 6-month walking holiday to the south pole, at the time this was - ... er ... dreamt up. - - -<p><small> - -<!-- hhmts start --> -Last modified: Thursday February 7 15:33:49 GMT 2002 -<!-- hhmts end --> - </small> - </body> -</html> diff --git a/docs/comm/the-beast/main.html b/docs/comm/the-beast/main.html deleted file mode 100644 index 332ffaa501..0000000000 --- a/docs/comm/the-beast/main.html +++ /dev/null @@ -1,35 +0,0 @@ -<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN"> -<html> - <head> - <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1"> - <title>The GHC Commentary - Compiling and running the Main module</title> - </head> - - <body BGCOLOR="FFFFFF"> - <h1>Compiling and running the Main module</h1> - -GHC allows you to determine which module contains the "main" function, and -what that function is called, via the <code>-fmain-is</code> flag. The trouble is -that the runtime system is fixed, so what symbol should it link to? -<p> -The current solution is this. Suppose the main function is <code>Foo.run</code>. -<ul> -<li> -Then, when compiling module <code>Foo</code>, GHC adds an extra definition: -<pre> - :Main.main = runIO Foo.run -</pre> -Now the RTS can invoke <code>:Main.main</code> to start the program. (This extra -definition is inserted in TcRnDriver.checkMain.) -<p><li> -Before starting the program, though, the RTS also initialises the module tree -by calling <code>init_:Main</code>, so when compiling the main module (Foo in this case), -as well as generating <code>init_Foo</code> as usual, GHC also generates -<pre> - init_zcMain() { init_Foo; } -</pre> -This extra initialisation code is generated in CodeGen.mkModuleInit. 
-</ul> - - </body> -</html> diff --git a/docs/comm/the-beast/mangler.html b/docs/comm/the-beast/mangler.html deleted file mode 100644 index 1ad80f0d5c..0000000000 --- a/docs/comm/the-beast/mangler.html +++ /dev/null @@ -1,79 +0,0 @@ -<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN"> -<html> - <head> - <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1"> - <title>The GHC Commentary - The Evil Mangler</title> - </head> - - <body BGCOLOR="FFFFFF"> - <h1>The GHC Commentary - The Evil Mangler</h1> - <p> - The Evil Mangler (EM) is a Perl script invoked by the <a - href="driver.html">Glorious Driver</a> after the C compiler (gcc) has - translated the GHC-produced C code into assembly. Consequently, it is - only of interest if <code>-fvia-C</code> is in effect (either explicitly - or implicitly). - - <h4>Its purpose</h4> - <p> - The EM reads the assembly produced by gcc, re-arranges code blocks, and - nukes instructions that it considers <em>non-essential.</em> It - derives its evilness from its utterly ad hoc, machine, compiler, and - whatnot dependent design and implementation. More precisely, the EM - performs the following tasks: - <ul> - <li>The code executed when a closure is entered is moved adjacent to - that closure's infotable. Moreover, the order of the info table - entries is reversed. Also, SRT pointers are removed from closures that - don't need them (non-FUN, RET and THUNK ones). - <li>Function prologue and epilogue code is removed. (GHC generated code - manages its own stack and uses the system stack only for return - addresses and during calls to C code.) - <li>Certain code patterns are replaced by simpler code (eg, loads of - fast entry points followed by indirect jumps are replaced by direct - jumps to the fast entry point). - </ul> - - <h4>Implementation</h4> - <p> - The EM is located in the Perl script <a - href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/driver/mangler/ghc-asm.lprl"><code>ghc-asm.lprl</code></a>.
- The script reads the <code>.s</code> file and chops it up into - <em>chunks</em> (that is what they are called in the script) that - roughly correspond to basic blocks. Each chunk is annotated with an - educated guess about what kind of code it contains (e.g., infotable, - fast entry point, slow entry point, etc.). The annotations also contain - the symbol introducing the chunk of assembly and whether that chunk has - already been processed or not. - <p> - The parsing of the input into chunks as well as recognising assembly - instructions that are to be removed or altered is based on a large - number of Perl regular expressions sprinkled over the whole code. These - expressions are rather fragile as they heavily rely on the structure of - the generated code - in fact, they even rely on the right amount of - white space and thus on the formatting of the assembly. - <p> - Afterwards, the chunks are reordered, some of them purged, and some - stripped of some useless instructions. Moreover, some instructions are - manipulated (eg, loads of fast entry points followed by indirect jumps - are replaced by direct jumps to the fast entry point). - <p> - The EM knows which part of the code belongs to function prologues and - epilogues as <a href="../rts-libs/stgc.html">STG C</a> adds tags of the - form <code>--- BEGIN ---</code> and <code>--- END ---</code> to the - assembler just before and after the code proper of a function starts. - It adds these tags using gcc's <code>__asm__</code> feature. - <p> - <strong>Update:</strong> Gcc 2.96 upwards performs more aggressive basic - block re-ordering and dead code elimination. This seems to make the - whole <code>--- END ---</code> tag business redundant -- in fact, if - proper code is generated, no <code>--- END ---</code> tags survive the gcc - optimiser.
- - <p><small> -<!-- hhmts start --> -Last modified: Sun Feb 17 17:55:47 EST 2002 -<!-- hhmts end --> - </small> - </body> -</html> diff --git a/docs/comm/the-beast/modules.html b/docs/comm/the-beast/modules.html deleted file mode 100644 index a6655a68a7..0000000000 --- a/docs/comm/the-beast/modules.html +++ /dev/null @@ -1,80 +0,0 @@ -<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN"> -<html> - <head> - <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1"> - <title>The GHC Commentary - Modules, ModuleNames and Packages</title> - </head> - - <body BGCOLOR="FFFFFF"> - <h1>Modules, ModuleNames and Packages</h1> - - <p>This section describes the datatypes <code>ModuleName</code> - <code>Module</code> and <code>PackageName</code> all available - from the module <a - href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/basicTypes/Module.lhs"><code>Module</code></a>.<p> - - <h2>Packages</h2> - - <p>A package is a collection of (zero or more) Haskell modules, - together with some information about external libraries, extra C - compiler options, and other things that this collection of modules - requires. When using DLLs on windows (or shared libraries on a - Unix system; currently unsupported), a package can consist of only - a single shared library of Haskell code; the reason for this is - described below. - - <p>Packages are further described in the User's Guide <a - href="http://www.haskell.org/ghc/docs/latest/packages.html">here</a>. - - <h2>The ModuleName type</h2> - - <p>At the bottom of the hierarchy is a <code>ModuleName</code>, - which, as its name suggests, is simply the name of a module. It - is represented as a Z-encoded FastString, and is an instance of - <code>Uniquable</code> so we can build <code>FiniteMap</code>s - with <code>ModuleName</code>s as the keys. - - <p>A <code>ModuleName</code> can be built from a - <code>String</code>, using the <code>mkModuleName</code> function. 
- - <h2>The Module type</h2> - - <p>For a given module, the compiler also needs to know whether the - module is in the <em>home package</em>, or in another package. - This distinction is important for two reasons: - - <ul> - <li><p>When generating code to call a function in another package, - the compiler might have to generate a cross-DLL call, which is - different from an intra-DLL call (hence the restriction that the - code in a package can only reside in a single DLL). - - <li><p>We avoid putting version information in an interface file - for entities defined in another package, on the grounds that other - packages are generally "stable". This also helps keep the size of - interface files down. - </ul> - - <p>The <code>Module</code> type contains a <code>ModuleName</code> - and a <code>PackageInfo</code> field. The - <code>PackageInfo</code> indicates whether the given - <code>Module</code> comes from the current package or from another - package. - - <p>To get the actual package in which a given module resides, you - have to read the interface file for that module, which contains - the package name (actually the value of the - <code>-package-name</code> flag when that module was built). This - information is currently unused inside the compiler, but we might - make use of it in the future, especially with the advent of - hierarchical modules, to allow the compiler to automatically - figure out which packages a program should be linked with, and - thus avoid the need to specify <code>-package</code> options on - the command line. - - <p><code>Module</code>s are also instances of - <code>Uniquable</code>, and indeed the unique of a - <code>Module</code> is the same as the unique of the underlying - <code>ModuleName</code>. 
- </body> -</html> diff --git a/docs/comm/the-beast/names.html b/docs/comm/the-beast/names.html deleted file mode 100644 index 061fae3ebf..0000000000 --- a/docs/comm/the-beast/names.html +++ /dev/null @@ -1,169 +0,0 @@ -<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN"> -<html> - <head> - <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1"> - <title>The GHC Commentary - The truth about names: OccNames, and Names</title> - </head> - - <body BGCOLOR="FFFFFF"> - <h1>The GHC Commentary - The truth about names: OccNames, and Names</h1> - <p> - Every entity (type constructor, class, identifier, type variable) has a - <code>Name</code>. The <code>Name</code> type is pervasive in GHC, and - is defined in <code>basicTypes/Name.lhs</code>. Here is what a Name - looks like, though it is private to the Name module. - </p> - <blockquote> - <pre> -data Name = Name { - n_sort :: NameSort, -- What sort of name it is - n_occ :: !OccName, -- Its occurrence name - n_uniq :: Unique, -- Its identity - n_loc :: !SrcLoc -- Definition site - }</pre> - </blockquote> - <ul> - <li> The <code>n_sort</code> field says what sort of name this is: see - <a href="#sort">NameSort below</a>. - <li> The <code>n_occ</code> field gives the "occurrence name" of the - Name; see - <a href="#occname">OccName below</a>. - <li> The <code>n_uniq</code> field allows fast tests for equality of - Names. - <li> The <code>n_loc</code> field gives some indication of where the - name was bound. - </ul> - - <h2><a name="sort">The <code>NameSort</code> of a <code>Name</code></a></h2> - <p> - There are four flavours of <code>Name</code>: - </p> - <blockquote> - <pre> -data NameSort - = External Module (Maybe Name) - -- (Just parent) => this Name is a subordinate name of 'parent' - -- e.g. 
data constructor of a data type, method of a class - -- Nothing => not a subordinate - - | WiredIn Module (Maybe Name) TyThing BuiltInSyntax - -- A variant of External, for wired-in things - - | Internal -- A user-defined Id or TyVar - -- defined in the module being compiled - - | System -- A system-defined Id or TyVar. Typically the - -- OccName is very uninformative (like 's')</pre> - </blockquote> - <ul> - <li>Here are the sorts of Name an entity can have: - <ul> - <li> Class, TyCon: External. - <li> Id: External, Internal, or System. - <li> TyVar: Internal, or System. - </ul> - </li> - <li>An <code>External</code> name has a globally-unique - (module name, occurrence name) pair, namely the - <em>original name</em> of the entity, - describing where the thing was originally defined. So for example, - if we have - <blockquote> - <pre> -module M where - f = e1 - g = e2 - -module A where - import qualified M as Q - import M - a = Q.f + g</pre> - </blockquote> - <p> - then the RdrNames for "a", "Q.f" and "g" get replaced (by the - Renamer) by the Names "A.a", "M.f", and "M.g" respectively. - </p> - </li> - <li>An <code>InternalName</code> - has only an occurrence name. Distinct InternalNames may have the same - occurrence name; use the Unique to distinguish them. - </li> - <li>An <code>ExternalName</code> has a unique that never changes. It - is never cloned. This is important, because the simplifier invents - new names pretty freely, but we don't want to lose the connection - with the type environment (constructed earlier). An - <code>InternalName</code> can be cloned freely. - </li> - <li><strong>Before CoreTidy</strong>: the Ids that were defined at top - level in the original source program get <code>ExternalNames</code>, - whereas extra top-level bindings generated (say) by the type checker - get <code>InternalNames</code>. This distinction is occasionally - useful for filtering diagnostic output; e.g. for -ddump-types.
- </li> - <li><strong>After CoreTidy</strong>: An Id with an - <code>ExternalName</code> will generate symbols that - appear as external symbols in the object file. An Id with an - <code>InternalName</code> cannot be referenced from outside the - module, and so generates a local symbol in the object file. The - CoreTidy pass makes the decision about which names should be External - and which Internal. - </li> - <li>A <code>System</code> name is for the most part the same as an - <code>Internal</code>. Indeed, the differences are purely cosmetic: - <ul> - <li>Internal names usually come from some name the - user wrote, whereas a System name has an OccName like "a", or "t". - Usually there are masses of System names with the same OccName but - different uniques, whereas typically there are only a handful of - distinct Internal names with the same OccName. - </li> - <li>Another difference is that, when unifying, the type checker tries - to unify away type variables with System names, leaving ones with - Internal names (to improve error messages). - </li> - </ul> - </li> - </ul> - - <h2><a name="occname">Occurrence names: <code>OccName</code></a></h2> - <p> - An <code>OccName</code> is more-or-less just a string, like "foo" or - "Tree", giving the (unqualified) name of an entity. - </p> - <p> - Well, not quite just a string, because in Haskell a name like "C" could - mean a type constructor or data constructor, depending on context. So - GHC defines a type <tt>OccName</tt> (defined in - <tt>basicTypes/OccName.lhs</tt>) that is a pair of a <tt>FastString</tt> - and a <tt>NameSpace</tt> indicating which name space the name is drawn - from: - <blockquote> - <pre> -data OccName = OccName NameSpace EncodedFS</pre> - </blockquote> - <p> - The <tt>EncodedFS</tt> is a synonym for <tt>FastString</tt> indicating - that the string is Z-encoded. (Details in <tt>OccName.lhs</tt>.)
- Z-encoding encodes funny characters like '%' and '$' into alphabetic - characters, like "zp" and "zd", so that they can be used in object-file - symbol tables without confusing linkers and suchlike. - </p> - <p> - The name spaces are: - </p> - <ul> - <li> <tt>VarName</tt>: ordinary variables</li> - <li> <tt>TvName</tt>: type variables</li> - <li> <tt>DataName</tt>: data constructors</li> - <li> <tt>TcClsName</tt>: type constructors and classes (in Haskell they - share a name space) </li> - </ul> - - <small> -<!-- hhmts start --> -Last modified: Wed May 4 14:57:55 EST 2005 -<!-- hhmts end --> - </small> - </body> -</html> - diff --git a/docs/comm/the-beast/ncg.html b/docs/comm/the-beast/ncg.html deleted file mode 100644 index 84beac2d51..0000000000 --- a/docs/comm/the-beast/ncg.html +++ /dev/null @@ -1,749 +0,0 @@ -<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN"> -<html> - <head> - <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1"> - <title>The GHC Commentary - The Native Code Generator</title> - </head> - - <body BGCOLOR="FFFFFF"> - <h1>The GHC Commentary - The Native Code Generator</h1> - <p> - On some platforms (currently x86 and PowerPC, with bitrotted - support for Sparc and Alpha), GHC can generate assembly code - directly, without having to go via C. This can sometimes almost - halve compilation time, and avoids the fragility and - horribleness of the <a href="mangler.html">mangler</a>. The NCG - is enabled by default for - non-optimising compilation on supported platforms. For most programs - it generates code which runs only 1-3% slower - (depending on platform and type of code) than that - created by gcc on x86s, so it is well worth using even with - optimised compilation. FP-intensive x86 programs see a bigger - slowdown, and all Sparc code runs about 5% slower due to - us not filling branch delay slots. - <p> - The NCG has always been something of a second-class citizen - inside GHC, an unloved child, rather. 
This means that its - integration into the compiler as a whole is rather clumsy, which - brings some problems described below. That apart, the NCG - proper is fairly cleanly designed, as target-independent as it - reasonably can be, and so should not be difficult to retarget. - <p> - <b>NOTE!</b> The native code generator was largely rewritten as part - of the C-- backend changes, around May 2004. Unfortunately the - rest of this document still refers to the old version, and was written - with relation to the CVS head as of end-Jan 2002. Some of it is relevant, - some of it isn't. - - <h2>Overview</h2> - The top-level code generator fn is - <p> - <code>absCtoNat :: AbstractC -> UniqSM (SDoc, Pretty.Doc)</code> - <p> - The returned <code>SDoc</code> is for debugging, so is empty unless - you specify <code>-ddump-stix</code>. The <code>Pretty.Doc</code> - bit is the final assembly code. Translation involves three main - phases, the first and third of which are target-independent. - <ul> - <li><b>Translation into the <code>Stix</code> representation.</b> Stix - is a simple tree-like RTL-style language, in which you can - mention: - <p> - <ul> - <li>An infinite number of temporary, virtual registers. - <li>The STG "magic" registers (<code>MagicId</code>), such as - the heap and stack pointers. - <li>Literals and low-level machine ops (<code>MachOp</code>). - <li>Simple address computations. - <li>Reads and writes of: memory, virtual regs, and various STG - regs. - <li>Labels and <code>if ... goto ...</code> style control-flow. - </ul> - <p> - Stix has two main associated types: - <p> - <ul> - <li><code>StixStmt</code> -- trees executed for their side - effects: assignments, control transfers, and auxiliary junk - such as segment changes and literal data. - <li><code>StixExpr</code> -- trees which denote a value. - </ul> - <p> - Translation into Stix is almost completely - target-independent. 
Needed dependencies are knowledge of - word size and endianness, used when generating code to - deal with half-word fields in info tables. This could be - abstracted out easily enough. Also, the Stix translation - needs to know which <code>MagicId</code>s map to registers - on the given target, and which are stored in offsets from - <code>BaseReg</code>. - <p> - After initial Stix generation, the trees are cleaned up with - constant-folding and a little copy-propagation ("Stix - inlining", as the code misleadingly calls it). We take - the opportunity to translate <code>MagicId</code>s which are - stored in memory on the given target, into suitable memory - references. Those which are stored in registers are left - alone. There is also a half-hearted attempt to lift literal - strings to the top level in cases where nested strings have - been observed to give incorrect code in the past. - <p> - Primitive machine-level operations will already be phrased in - terms of <code>MachOp</code>s in the presented Abstract C, and - these are passed through unchanged. We comment only that the - <code>MachOp</code>s have been chosen so as to be easy to - implement on all targets, and their meaning is intended to be - unambiguous, and the same on all targets, regardless of word - size or endianness. - <p> - <b>A note on <code>MagicId</code>s.</b> - Those which are assigned to - registers on the current target are left unmodified. Those - which are not are stored in memory as offsets from - <code>BaseReg</code> (which is assumed to permanently have the - value <code>(&MainCapability.r)</code>), so the constant folder - calculates the offsets and inserts suitable loads/stores. One - complication is that not all archs have <code>BaseReg</code> - itself in a register, so for those (sparc), we instead - generate the address as an offset from the static symbol - <code>MainCapability</code>, since the register table lives in - there.
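The MagicId treatment above can be sketched in code. This is an illustrative reconstruction, not GHC's actual source: the types are pared down, and <code>regAssignment</code>, <code>baseRegOffset</code> and <code>foldMagicId</code> (along with the particular register and offset choices) are names invented for the example.

```haskell
-- Illustrative sketch only: simplified stand-ins for GHC's real
-- MagicId / Stix types.
data MagicId = BaseReg | Sp | Hp | R1
  deriving (Eq, Show)

data StixExpr
  = StReg MagicId   -- a MagicId that lives in a real machine register
  | StInd Int       -- memory at (BaseReg + byte offset)
  deriving (Eq, Show)

-- Hypothetical per-target assignment: Just r if the MagicId is pinned
-- to a machine register on this target, Nothing if it lives in the
-- register table in memory.
regAssignment :: MagicId -> Maybe String
regAssignment Sp = Just "%ebp"   -- made-up mapping, for illustration
regAssignment Hp = Just "%edi"
regAssignment _  = Nothing

-- Byte offset of each MagicId's slot in the register table
-- (offsets are invented for the example).
baseRegOffset :: MagicId -> Int
baseRegOffset R1      = 0
baseRegOffset Sp      = 4
baseRegOffset Hp      = 8
baseRegOffset BaseReg = error "BaseReg has no slot: it is the table base"

-- The constant-folding step described above: MagicIds assigned to a
-- register stay put; the rest become memory references at an offset
-- from BaseReg.
foldMagicId :: MagicId -> StixExpr
foldMagicId m = case regAssignment m of
  Just _  -> StReg m
  Nothing -> StInd (baseRegOffset m)
```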
- <p> - Finally, <code>BaseReg</code> does occasionally itself get - mentioned in Stix expression trees, and in this case what is - denoted is precisely <code>(&MainCapability.r)</code>, not, as - in all other cases, the value of memory at some offset from - the start of the register table. Since what it denotes is an - r-value and not an l-value, assigning <code>BaseReg</code> is - meaningless, so the machinery checks to ensure this never - happens. All these details are taken into account by the - constant folder. - <p> - <li><b>Instruction selection.</b> This is the only majorly - target-specific phase. It turns Stix statements and - expressions into sequences of <code>Instr</code>, a data - type which is different for each architecture. - <code>Instr</code>, unsurprisingly, has various supporting - types, such as <code>Reg</code>, <code>Operand</code>, - <code>Imm</code>, etc. The generated instructions may refer - to specific machine registers, or to arbitrary virtual - registers, either those created within the instruction - selector, or those mentioned in the Stix passed to it. - <p> - The instruction selectors live in <code>MachCode.lhs</code>. - The core functions, for each target, are: - <p> - <code> - getAmode :: StixExpr -> NatM Amode - <br>getRegister :: StixExpr -> NatM Register - <br>assignMem_IntCode :: PrimRep -> StixExpr -> StixExpr -> NatM InstrBlock - <br>assignReg_IntCode :: PrimRep -> StixReg -> StixExpr -> NatM InstrBlock - </code> - <p> - The insn selectors use the "maximal munch" algorithm. The - bizarrely-misnamed <code>getRegister</code> translates - expressions. A simplified version of its type is: - <p> - <code>getRegister :: StixExpr -> NatM (OrdList Instr, Reg)</code> - <p> - That is: it (monadically) turns a <code>StixExpr</code> into a - sequence of instructions, and a register, with the meaning - that after executing the (possibly empty) sequence of - instructions, the (possibly virtual) register will - hold the resulting value. 
The real situation is complicated - by the presence of fixed registers, and is detailed below. - <p> - Maximal munch is a greedy algorithm and is known not to give - globally optimal code sequences, but it is good enough, and - fast and simple. Early incarnations of the NCG used something - more sophisticated, but that is long gone now. - <p> - Similarly, <code>getAmode</code> translates a value, intended - to denote an address, into a sequence of insns leading up to - a (processor-specific) addressing mode. This stuff could be - done using the general <code>getRegister</code> selector, but - would necessarily generate poorer code, because the calculated - address would be forced into a register, which might be - unnecessary if it could partially or wholly be calculated - using an addressing mode. - <p> - Finally, <code>assignMem_IntCode</code> and - <code>assignReg_IntCode</code> create instruction sequences to - calculate a value and store it in the given register, or at - the given address. Because these guys translate a statement, - not a value, they just return a sequence of insns and no - associated register. Floating-point and 64-bit integer - assignments have analogous selectors. - <p> - Apart from the complexities of fixed vs floating registers, - discussed below, the instruction selector is as simple - as it can be. It looks long and scary but detailed - examination reveals it to be fairly straightforward. - <p> - <li><b>Register allocation.</b> The register allocator, - <code>AsmRegAlloc.lhs</code> takes sequences of - <code>Instr</code>s which mention a mixture of real and - virtual registers, and returns a modified sequence referring - only to real ones. It is gloriously and entirely - target-independent. Well, not exactly true. 
Instead it - regards <code>Instr</code> (instructions) and <code>Reg</code> - (virtual and real registers) as abstract types, to which it has - the following interface: - <p> - <code> - insnFuture :: Instr -> InsnFuture - <br>regUsage :: Instr -> RegUsage - <br>patchRegs :: Instr -> (Reg -> Reg) -> Instr - </code> - <p> - <code>insnFuture</code> is used to (re)construct the graph of - all possible control transfers between the insns to be - allocated. <code>regUsage</code> returns the sets of registers - read and written by an instruction. And - <code>patchRegs</code> is used to apply the allocator's final - decision on virtual-to-real reg mapping to an instruction. - <p> - Clearly these 3 fns have to be written anew for each - architecture. They are defined in - <code>RegAllocInfo.lhs</code>. Think twice, no, thrice, - before modifying them: making false claims about insn - behaviour will lead to hard-to-find register allocation - errors. - <p> - <code>AsmRegAlloc.lhs</code> contains detailed comments about - how the allocator works. Here is a summary. The head honcho - <p> - <code>allocUsingTheseRegs :: [Instr] -> [Reg] -> (Bool, [Instr])</code> - <p> - takes a list of instructions and a list of real registers - available for allocation, and maps as many of the virtual regs - in the input into real ones as it can. The returned - <code>Bool</code> indicates whether or not it was - successful. If so, that's the end of it. If not, the caller - of <code>allocUsingTheseRegs</code> will attempt spilling. - More of that later. What <code>allocUsingTheseRegs</code> - does is: - <p> - <ul> - <li>Implicitly number each instruction by its position in the - input list. - <p> - <li>Using <code>insnFuture</code>, create the set of all flow - edges -- possible control transfers -- within this set of - insns. 
- <p> - <li>Using <code>regUsage</code> and iterating around the flow - graph from the previous step, calculate, for each virtual - register, the set of flow edges on which it is live. - <p> - <li>Make a real-register commitment map, which gives the set - of edges for which each real register is committed (in - use). These sets are initially empty. For each virtual - register, attempt to find a real register whose current - commitment does not intersect that of the virtual - register -- ie, is uncommitted on all edges on which the - virtual reg is live. If successful, this means the vreg - can be assigned to the realreg, so add the vreg's set to - the realreg's commitment. - <p> - <li>If all the vregs were assigned to a realreg, use - <code>patchInstr</code> to apply the mapping to the insns themselves. - </ul> - <p> - <b>Spilling</b> - <p> - If <code>allocUsingTheseRegs</code> fails, a baroque - mechanism comes into play. We now know that much simpler - schemes are available to do the same thing and give better - results. - Anyways: - <p> - The logic above <code>allocUsingTheseRegs</code>, in - <code>doGeneralAlloc</code> and <code>runRegAllocate</code>, - observes that allocation has failed with some set R of real - registers. So they apply <code>runRegAllocate</code> a second - time to the code, but remove (typically) two registers from R - before doing so. This naturally fails too, but returns a - partially-allocated sequence. <code>doGeneralAlloc</code> - then inserts spill code into the sequence, and finally re-runs - <code>allocUsingTheseRegs</code>, but supplying the original, - unadulterated R. This is guaranteed to succeed since the two - registers previously removed from R are sufficient to allocate - all the spill/restore instructions added. - <p> - Because x86 is very short of registers, and in the worst case - needs three removed from R, a softly-softly approach is used.
- <code>doGeneralAlloc</code> first tries with zero regs removed - from R, then if that fails one, then two, etc. This means - <code>allocUsingTheseRegs</code> may get run several times - before a successful arrangement is arrived at. - <code>findReservedRegs</code> cooks up the sets of spill - registers to try with. - <p> - The resulting machinery is complicated and the generated spill - code is appalling. The saving grace is that spills are very - rare so it doesn't matter much. I did not invent this -- I inherited it. - <p> - <b>Dealing with common cases fast</b> - <p> - The entire reg-alloc mechanism described so far is general and - correct, but expensive overkill for many simple code blocks. - So to begin with we use - <code>doSimpleAlloc</code>, which attempts to do something - simple. It exploits the observation that if the total number - of virtual registers does not exceed the number of real ones - available, we can simply dole out a new realreg each time we - see mention of a new vreg, with no regard for control flow. - <code>doSimpleAlloc</code> therefore attempts this in a - single pass over the code. It gives up if it runs out of real - regs or sees any condition which renders the above observation - invalid (fixed reg uses, for example). - <p> - This clever hack handles the majority of code blocks quickly. - It was copied from the previous reg-allocator (the - Mattson/Partain/Marlow/Gill one). - </ul> - -<p> -<h2>Complications, observations, and possible improvements</h2> - -<h3>Real vs virtual registers in the instruction selectors</h3> - -The instruction selectors for expression trees, namely -<code>getRegister</code>, are complicated by the fact that some -expressions can only be computed into a specific register, whereas -the majority can be computed into any register. We take x86 as an -example, but the problem applies to all archs. -<p> -Terminology: <em>rreg</em> means real register, a real machine -register. 
<em>vreg</em> means one of an infinite set of virtual -registers. The type <code>Reg</code> is the sum of <em>rreg</em> and -<em>vreg</em>. The instruction selector generates sequences with -unconstrained use of vregs, leaving the register allocator to map them -all into rregs. -<p> -Now, where was I ? Oh yes. We return to the type of -<code>getRegister</code>, which, despite its name, selects instructions -to compute the value of an expression tree. -<pre> - getRegister :: StixExpr -> NatM Register - - data Register - = Fixed PrimRep Reg InstrBlock - | Any PrimRep (Reg -> InstrBlock) - - type InstrBlock -- sequence of instructions -</pre> -At first this looks eminently reasonable (apart from the stupid -name). <code>getRegister</code>, and nobody else, knows whether or -not a given expression has to be computed into a fixed rreg or can be -computed into any rreg or vreg. In the first case, it returns -<code>Fixed</code> and indicates which rreg the result is in. In the -second case it defers committing to any specific target register by -returning a function from <code>Reg</code> to <code>InstrBlock</code>, -and the caller can specify the target reg as it sees fit. -<p> -Unfortunately, that forces <code>getRegister</code>'s callers (usually -itself) to use a clumsy and confusing idiom in the common case where -they do not care what register the result winds up in. The reason is -that although a value might be computed into a fixed rreg, we are -forbidden (on pain of segmentation fault :) from subsequently -modifying the fixed reg. These and other rules are recorded in "Rules of -the game" inside <code>MachCode.lhs</code>. -<p> -Why can't fixed registers be modified post-hoc? Consider a simple -expression like <code>Hp+1</code>. Since the heap pointer -<code>Hp</code> is definitely in a fixed register, call it R, -<code>getRegister</code> on subterm <code>Hp</code> will simply return -<code>Fixed</code> with an empty sequence and R.
But we can't just -emit an increment instruction for R, because that trashes -<code>Hp</code>; instead we first have to copy it into a fresh vreg -and increment that. -<p> -With all that in mind, consider now writing a <code>getRegister</code> -clause for terms of the form <code>(1 + E)</code>. Contrived, yes, -but it illustrates the matter. First we do -<code>getRegister</code> on E. Now we are forced to examine what -comes back. -<pre> - getRegister (OnePlus e) - = getRegister e `thenNat` \ e_result -> - case e_result of - Fixed _ e_fixed e_code - -> returnNat (Any IntRep (\dst -> e_code ++ [MOV e_fixed dst, INC dst])) - Any _ e_any - -> returnNat (Any IntRep (\dst -> e_any dst ++ [INC dst])) -</pre> -This seems unreasonably cumbersome, yet the instruction selector is -full of such idioms. A good example of the complexities induced by -this scheme is shown by <code>trivialCode</code> for x86 in -<code>MachCode.lhs</code>. This deals with general integer dyadic -operations on x86 and has numerous cases. It was difficult to get -right. -<p> -An alternative suggestion is to simplify the type of -<code>getRegister</code> to this: -<pre> - getRegister :: StixExpr -> NatM (InstrBlock, VReg) - type VReg = .... a vreg ... -</pre> -and then we could safely write -<pre> - getRegister (OnePlus e) - = getRegister e `thenNat` \ (e_code, e_vreg) -> - returnNat (e_code ++ [INC e_vreg], e_vreg) -</pre> -which is about as straightforward as you could hope for. -Unfortunately, it requires <code>getRegister</code> to insert moves of -values which naturally compute into an rreg, into a vreg. Consider: -<pre> - 1 + ccall some-C-fn -</pre> -On x86 the ccall result is returned in rreg <code>%eax</code>.
The -resulting sequence, prior to register allocation, would be: -<pre> - # push args - call some-C-fn - # move %esp to nuke args - movl %eax, %vreg - incl %vreg -</pre> -If, as is likely, <code>%eax</code> is not held live beyond this point -for any other purpose, the move into a fresh register is pointless; -we'd have been better off leaving the value in <code>%eax</code> as -long as possible. -<p> -The simplified <code>getRegister</code> story is attractive. It would -clean up the instruction selectors significantly and make it simpler -to write new ones. The only drawback is that it generates redundant -register moves. I suggest that eliminating these should be the job -of the register allocator. Indeed: -<ul> -<li>There has been some work on this already ("Iterated register - coalescing" ?), so this isn't a new idea. -<p> -<li>You could argue that the existing scheme inappropriately blurs the - boundary between the instruction selector and the register - allocator. The instruction selector should .. well .. just - select instructions, without having to futz around worrying about - what kind of registers subtrees get generated into. Register - allocation should be <em>entirely</em> the domain of the register - allocator, with the proviso that it should endeavour to allocate - registers so as to minimise the number of non-redundant reg-reg - moves in the final output. -</ul> - - -<h3>Selecting insns for 64-bit values/loads/stores on 32-bit platforms</h3> - -Note that this stuff doesn't apply on 64-bit archs, since the -<code>getRegister</code> mechanism applies there. 
- -The relevant functions are: -<pre> - assignMem_I64Code :: StixExpr -> StixExpr -> NatM InstrBlock - assignReg_I64Code :: StixReg -> StixExpr -> NatM InstrBlock - iselExpr64 :: StixExpr -> NatM ChildCode64 - - data ChildCode64 -- a.k.a "Register64" - = ChildCode64 - InstrBlock -- code - VRegUnique -- unique for the lower 32-bit temporary -</pre> -<code>iselExpr64</code> is the 64-bit, plausibly-named analogue of -<code>getRegister</code>, and <code>ChildCode64</code> is the analogue -of <code>Register</code>. The aim here was to generate working -64-bit code as simply as possible. To this end, I used the -simplified <code>getRegister</code> scheme described above, in which -<code>iselExpr64</code> generates its results into two vregs which -can always safely be modified afterwards. -<p> -Virtual registers are, unsurprisingly, distinguished by their -<code>Unique</code>s. There is a small difficulty in how to -know what the vreg for the upper 32 bits of a value is, given the vreg -for the lower 32 bits. The simple solution adopted is to say that -any low-32 vreg may also have a hi-32 counterpart which shares the -same unique, but is otherwise regarded as a separate entity. -<code>getHiVRegFromLo</code> gets one from the other. -<pre> - data VRegUnique - = VRegUniqueLo Unique -- lower part of a split quantity - | VRegUniqueHi Unique -- upper part thereof -</pre> -Apart from that, 64-bit code generation is really simple. The sparc -and x86 versions are almost copy-n-pastes of each other, with minor -adjustments for endianness. The generated code isn't wonderful but -is certainly acceptable, and it works.
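The lo/hi pairing can be sketched as follows. This is illustrative only: <code>Unique</code> is simplified to <code>Int</code>, and the error case is a guess at the intended invariant; the real <code>getHiVRegFromLo</code> in GHC differs in detail.

```haskell
-- Sketch of the lo/hi vreg pairing described above; Unique is
-- simplified to Int for the example.
type Unique = Int

data VRegUnique
  = VRegUniqueLo Unique   -- lower 32 bits of a split quantity
  | VRegUniqueHi Unique   -- upper 32 bits thereof
  deriving (Eq, Show)

-- The hi-half vreg shares its unique with the lo half, but is
-- otherwise a distinct entity.
getHiVRegFromLo :: VRegUnique -> VRegUnique
getHiVRegFromLo (VRegUniqueLo u) = VRegUniqueHi u
getHiVRegFromLo (VRegUniqueHi _) = error "getHiVRegFromLo: already a hi vreg"
```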
- - - -<h3>Shortcomings and inefficiencies in the register allocator</h3> - -<h4>Redundant reconstruction of the control flow graph</h4> - -The allocator goes to considerable computational expense to construct -all the flow edges in the group of instructions it's allocating for, -by using the <code>insnFuture</code> function in the -<code>Instr</code> pseudo-abstract type. -<p> -This is really silly, because all that information is present at the -abstract C stage, but is thrown away in the translation to Stix. -So a good thing to do is to modify that translation to -produce a directed graph of Stix straight-line code blocks, -and to preserve that structure through the insn selector, so the -allocator can see it. -<p> -This would eliminate the fragile, hacky, arch-specific -<code>insnFuture</code> mechanism, and probably make the whole -compiler run measurably faster. Register allocation is a fair chunk -of the time of non-optimising compilation (10% or more), and -reconstructing the flow graph is an expensive part of reg-alloc. -It would probably accelerate the vreg liveness computation too. - -<h4>Really ridiculous method for doing spilling</h4> - -This is a more ambitious suggestion, but ... reg-alloc should be -reimplemented, using the scheme described in "Quality and speed in -linear-scan register allocation." (Traub?) For straight-line code -blocks, this gives an elegant one-pass algorithm for assigning -registers and creating the minimal necessary spill code, without the -need for reserving spill registers ahead of time. -<p> -I tried it in Rigr, replacing the previous spiller which used the -current GHC scheme described above, and it cut the number of spill -loads and stores by a factor of eight. Not to mention being simpler, -easier to understand and very fast. -<p> -The Traub paper also describes how to extend their method to multiple -basic blocks, which will be needed for GHC. 
It comes down to -reconciling multiple vreg-to-rreg mappings at points where control -flow merges. - -<h4>Redundant-move support for revised instruction selector suggestion</h4> - -As mentioned above, simplifying the instruction selector will require -the register allocator to try and allocate source and destination -vregs to the same rreg in reg-reg moves, so as to make as many as -possible go away. Without that, the revised insn selector would -generate worse code than at present. I know this stuff has been done -but know nothing about it. The Linear-scan reg-alloc paper mentioned -above does indeed mention a bit about it in the context of single -basic blocks, but I don't know if that's sufficient. - - - -<h3>x86 arcana that you should know about</h3> - -The main difficulty with x86 is that many instructions have fixed -register constraints, which can occasionally make reg-alloc fail -completely. And the FPU doesn't have the flat register model which -the reg-alloc abstraction (implicitly) assumes. -<p> -Our strategy is: do a good job for the common small subset, that is -integer loads, stores, address calculations, basic ALU ops (+, -, -and, or, xor), and jumps. That covers the vast majority of -executed insns. And indeed we do a good job, with a loss of -less than 2% compared with gcc. -<p> -Initially we tried to handle integer instructions with awkward -register constraints (mul, div, shifts by non-constant amounts) via -various jigglings of the spiller et al. This never worked robustly, -and putting platform-specific tweaks in the generic infrastructure is -a big No-No. (Not quite true; shifts by a non-constant amount are -still done by a giant kludge, and should be moved into this new -framework.) -<p> -Fortunately, all such insns are rare. So the current scheme is to -pretend that they don't have any such constraints. This fiction is -carried all the way through the register allocator. 
When the insn -finally comes to be printed, we emit a sequence which copies the -operands through memory (<code>%esp</code>-relative), satisfying the -constraints of the real instruction. This localises the gruesomeness -to just one place. Here, for example, is the code generated for -integer division of <code>%esi</code> by <code>%ecx</code>: -<pre> - # BEGIN IQUOT %ecx, %esi - pushl $0 - pushl %eax - pushl %edx - pushl %ecx - movl %esi, %eax - cltd - idivl 0(%esp) - movl %eax, 12(%esp) - popl %edx - popl %edx - popl %eax - popl %esi - # END IQUOT %ecx, %esi -</pre> -This is not quite as appalling as it seems, if you consider that the -division itself typically takes 16+ cycles, whereas the rest of the -insns probably go through in about 1 cycle each. -<p> -This trick is taken to extremes for FP operations. -<p> -All notions of the x86 FP stack and its insns have been removed. -Instead, we pretend, to the instruction selector and register -allocator, that x86 has six floating point registers, -<code>%fake0</code> .. <code>%fake5</code>, which can be used in the -usual flat manner. We further claim that x86 has floating point -instructions very similar to SPARC and Alpha, that is, a simple -3-operand register-register arrangement. Code generation and register -allocation proceed on this basis. -<p> -When we come to print out the final assembly, our convenient fiction -is converted to dismal reality. Each fake instruction is -independently converted to a series of real x86 instructions. -<code>%fake0</code> .. <code>%fake5</code> are mapped to -<code>%st(0)</code> .. <code>%st(5)</code>. To do reg-reg arithmetic -operations, the two operands are pushed onto the top of the FP stack, -the operation done, and the result copied back into the relevant -register. When one of the operands is also the destination, we emit a -slightly less scummy translation. There are only six -<code>%fake</code> registers because 2 are needed for the translation, -and x86 has 8 in total.
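As a rough illustration of how a fake 3-operand FP instruction might be printed as real x87 stack code, consider a fake add, dst := src1 + src2. The function name <code>ppFakeFAdd</code> and the exact instruction sequence are a plausible reconstruction under the %fakeN-to-%st(N) mapping described above, not what GHC literally emits:

```haskell
-- Illustrative sketch, not GHC's actual printer: render the fake
-- instruction "fadd dst, src1, src2" (dst := src1 + src2) as x87
-- stack code.  %fakeN lives in %st(N); the initial push shifts every
-- existing slot down by one, so the other operands' indices are
-- adjusted by 1.
type FakeReg = Int  -- 0..5, i.e. %fake0 .. %fake5

ppFakeFAdd :: FakeReg -> FakeReg -> FakeReg -> [String]
ppFakeFAdd dst src1 src2 =
  [ "fld %st(" ++ show src1 ++ ")"         -- push a copy of src1
  , "fadd %st(" ++ show (src2 + 1) ++ ")"  -- st(0) += src2 (shifted by the push)
  , "fstp %st(" ++ show (dst + 1) ++ ")"   -- pop the result into dst's slot
  ]
```

The adjusted indices are the point of the example: after the <code>fld</code>, the stack is one deeper, so what was %st(src2) is now %st(src2+1), and likewise for the destination when the <code>fstp</code> pops the result back.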
-<p> -The translation is inefficient but is simple and it works. A cleverer -translation would handle a sequence of insns, simulating the FP stack -contents, would not impose a fixed mapping from <code>%fake</code> to -<code>%st</code> regs, and hopefully could avoid most of the redundant -reg-reg moves of the current translation. -<p> -There are, however, two unforeseen bad side effects: -<ul> -<li>This doesn't work properly, because it doesn't observe the normal - conventions for x86 FP code generation. It turns out that each of - the 8 elements in the x86 FP register stack has a tag bit which - indicates whether or not that register is notionally in use or - not. If you do a FPU operation which happens to read a - tagged-as-empty register, you get an x87 FPU (stack invalid) - exception, which is normally handled by the FPU without passing it - to the OS: the program keeps going, but the resulting FP values - are garbage. The OS can ask for the FPU to pass it FP - stack-invalid exceptions, but it usually doesn't. - <p> - Anyways: inside NCG created x86 FP code this all works fine. - However, the NCG's fiction of a flat register set does not operate - the x87 register stack in the required stack-like way. When - control returns to a gcc-generated world, the stack tag bits soon - cause stack exceptions, and thus garbage results. - <p> - The only fix I could think of -- and it is horrible -- is to clear - all the tag bits just before the next STG-level entry, in chunks - of code which use FP insns. <code>i386_insert_ffrees</code> - inserts the relevant <code>ffree</code> insns into such code - blocks. It depends critically on <code>is_G_instr</code> to - detect such blocks. -<p> -<li>It's very difficult to read the generated assembly and - reason about it when debugging, because there's so much clutter. - We print the fake insns as comments in the output, and that helps - a bit. 
-</ul> - - - -<h3>Generating code for ccalls</h3> - -For reasons I don't really understand, the instruction selectors for -generating calls to C (<code>genCCall</code>) have proven surprisingly -difficult to get right, and soaked up a lot of debugging time. As a -result, I have once again opted for schemes which are simple and not -too difficult to argue as correct, even if they don't generate -excellent code. -<p> -The sparc ccall generator in particular forces all arguments into -temporary virtual registers before moving them to the final -out-registers (<code>%o0</code> .. <code>%o5</code>). This creates -some unnecessary reg-reg moves. The reason is explained in a -comment in the code. - - -<h3>Duplicate implementation for many STG macros</h3> - -This has been discussed at length already. It has caused a couple of -nasty bugs due to subtle untracked divergence in the macro -translations. The macro-expander really should be pushed up into the -Abstract C phase, so the problem can't happen. -<p> -Doing so would have the added benefit that the NCG could be used to -compile more "ways" -- well, at least the 'p' profiling way. - - -<h3>How to debug the NCG without losing your sanity/hair/cool</h3> - -Last, but definitely not least ... -<p> -The usual syndrome is that some program, when compiled via C, works, -but not when compiled via the NCG. Usually the problem is fairly -simple to fix, once you find the specific code block which has been -mistranslated. But the latter can be nearly impossible, since most -modules generate at least hundreds and often thousands of them. -<p> -My solution: cheat. -<p> -Because the via-C and native routes diverge only late in the day, -it is not difficult to construct a 1-1 correspondence between basic -blocks on the two routes. 
So, if the program works via C but not on
-the NCG, do the following:
-<ul>
-<li>Recompile <code>AsmCodeGen.lhs</code> in the afflicted compiler
-    with <code>-DDEBUG_NCG</code>, so that it inserts
-    <code>___ncg_debug_marker</code>s
-    into the assembly it emits.
-<p>
-<li>Using a binary search on modules, find the module which is causing
-    the problem.
-<p>
-<li>Compile that module to assembly code, with identical flags, twice,
-    once via C and once via NCG.
-    Call the outputs <code>ModuleName.s-gcc</code> and
-    <code>ModuleName.s-nat</code>.  Check that the latter does indeed have
-    <code>___ncg_debug_marker</code>s in it; otherwise the next steps fail.
-<p>
-<li>Build (with a working compiler) the program
-    <code>fptools/ghc/utils/debugNCG/diff_gcc_nat</code>.
-<p>
-<li>Run: <code>diff_gcc_nat ModuleName.s</code>.  This will
-    construct the 1-1 correspondence, and emit on stdout
-    a cppable assembly output.  Place this in a file -- I always
-    call it <code>synth.S</code>.  Note, the capital S is important;
-    otherwise it won't get cpp'd.  You can feed this file directly to
-    ghc and it will automatically get cpp'd; you don't have to do so
-    yourself.
-<p>
-<li>By messing with the <code>#define</code>s at the top of
-    <code>synth.S</code>, do a binary search to find the incorrect
-    block.  Keep a careful record of where you are in the search; it
-    is easy to get confused.  Remember also that multiple blocks may
-    be wrong, which also confuses matters.  Finally, I usually start
-    off by re-checking that I can build the executable with all the
-    <code>#define</code>s set to 0 and then all to 1.  This ensures
-    you won't get halfway through the search and then get stuck due to
-    some snafu with gcc-specific literals.  Usually I set
-    <code>UNMATCHED_GCC</code> to 1 all the time, and this bit should
-    contain only literal data.
-    <code>UNMATCHED_NAT</code> should be empty.
-</ul>
-<p>
-<code>diff_gcc_nat</code> was known to work correctly last time I used
-it, in December 01, for both x86 and sparc.  If it doesn't work, due
-to changes in assembly syntax, or whatever, make it work.  The
-investment is well worth it.  Searching for the incorrect block(s) any
-other way is a total time waster.
-
-
-
-</ul>
-
-
-
-
-    <p><small>
-<!-- hhmts start -->
-Last modified: Mon Aug 19 11:41:43 CEST 2013
-<!-- hhmts end -->
-    </small>
-  </body>
-</html>
diff --git a/docs/comm/the-beast/optimistic.html b/docs/comm/the-beast/optimistic.html
deleted file mode 100644
index 4d158022e8..0000000000
--- a/docs/comm/the-beast/optimistic.html
+++ /dev/null
@@ -1,65 +0,0 @@
-<h2> Architectural stuff </h2>
-
-New fields in the TSO:
-<ul>
-<li> New global speculation-depth register; always counts the number of speculation frames
-on the stack; incremented when
-starting speculation, decremented when finishing.
-<li> Profiling stuff
-</ul>
-
-
-<h2> Speculation frames </h2>
-
-The info table for a speculation frame points to the static spec-depth configuration
-for that speculation point.  (Points to, because the config is mutable, and the info
-table has to be adjacent to the (immutable) code.)
-
-
-
-<h2> Abortion</h2>
-
-Abortion is modelled by a special asynchronous exception ThreadAbort.
-
-<ul>
-<li> In the scheduler, if a thread returns with ThreadBlocked, and non-zero SpecDepth, send it
-an asynchronous exception.
-
-<li> In the implementation of the <tt>catch#</tt> primop, raise an asynchronous exception if
-SpecDepth is nonzero.
-
-<li> Timeout, administered by scheduler.  Current story: abort if a speculation frame lasts from
-one minor GC to the next.  We detect this by seeing if there's a profiling frame on the stack --- a
-profiling frame is added at a minor GC in place of a speculation frame (see Online Profiling).
-</ul>
-
-
-When tearing frames off the stack, we start a new chunk at every speculation frame, as well as every
-update frame.
-We proceed down to the deepest speculation frame.
-<p>
-The <tt>AP_STACK</tt> closure built for a speculation frame must be careful <em>not</em> to enter the
-next <tt>AP_STACK</tt> closure up, because that would re-enter a possible loop.
-<p>
-Delivering an asynch exception to a thread that is speculating works as follows.  Invariant: there can be no catch frames
-inside speculation (we abort in <tt>catch#</tt> when speculating).  So the asynch exception just
-tears off frames in the standard way until it gets to a catch frame, just as it would usually do.
-<p>
-Abortion can punish one or more of the speculation frames by decrementing their static config variables.
-
-<h3>Synchronous exceptions</h3>
-
-Synchronous exceptions are treated much as before.  The stack is discarded up to an update frame; the
-thunk to be updated is overwritten with "raise x", and the process continues, until a catch frame is reached.
-<p>
-When we find a spec frame, we allocate a "raise x" object, and resume execution with the return address
-in the spec frame.  In that way the spec frame is like a catch frame; it stops the unwinding process.
-<p>
-It's essential that every hard failure is caught, else speculation is unsafe.  In particular, divide by zero
-is hard to catch using OS support, so we test explicitly in library code.  You can shoot yourself in the foot
-by writing <tt>x `div#` 0</tt>, side-stepping the test.
-
-
-<h3> Online profiling </h3>
-
-Sampling can be more frequent than minor GC (by jiggling the end-of-block code) but cannot
-be less frequent, because GC doesn't expect to see profiling frames.
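The depth-allowance bookkeeping sketched in these notes can be modelled in a few lines of Haskell.  This is purely illustrative: all names here are hypothetical, and the real RTS keeps this state in the mutable spec-depth configuration pointed to by the speculation frame's info table.

```haskell
-- Hypothetical model of a speculation site's static configuration and
-- the "punishment" applied on abortion: every abort shrinks the site's
-- allowance, so a site that repeatedly wastes work stops being speculated.
data SpecConfig = SpecConfig
  { allowedDepth :: Int  -- how deep this site may still speculate
  } deriving (Eq, Show)

-- Decide whether to start speculating at this site, given the thread's
-- current speculation depth (the global speculation-depth register).
maySpeculate :: SpecConfig -> Int -> Bool
maySpeculate cfg curDepth = curDepth < allowedDepth cfg

-- Abortion decrements the site's allowance, but never below zero.
punish :: SpecConfig -> SpecConfig
punish cfg = cfg { allowedDepth = max 0 (allowedDepth cfg - 1) }
```

The key design point this captures is that punishment is per speculation point, not per thread: the allowance lives with the site, while the depth counter lives with the thread.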
\ No newline at end of file
diff --git a/docs/comm/the-beast/prelude.html b/docs/comm/the-beast/prelude.html
deleted file mode 100644
index 64b607def5..0000000000
--- a/docs/comm/the-beast/prelude.html
+++ /dev/null
@@ -1,207 +0,0 @@
-<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
-<html>
-  <head>
-    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
-    <title>The GHC Commentary - Primitives and the Prelude</title>
-  </head>
-
-  <body BGCOLOR="FFFFFF">
-    <h1>The GHC Commentary - Primitives and the Prelude</h1>
-    <p>
-      One of the trickiest aspects of GHC is the delicate interplay
-      between what knowledge is baked into the compiler, and what
-      knowledge it gets by reading the interface files of library
-      modules.  In general, the less that is baked in, the better.
-<p>
-      Most of what the compiler has to have wired in about primitives and
-      prelude definitions is in
-      <a
-      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/prelude/"><code>fptools/ghc/compiler/prelude/</code></a>.
-    </p>
-
-GHC recognises these main classes of baked-in-ness:
-<dl>
-<dt><strong>Primitive types.</strong>
-<dd>Primitive types cannot be defined in Haskell, and are utterly baked into the compiler.
-They are notionally defined in the fictional module <tt>GHC.Prim</tt>.  The <tt>TyCon</tt>s for these types are all defined
-in module <tt>TysPrim</tt>; for example,
-<pre>
-  intPrimTyCon :: TyCon
-  intPrimTyCon = ....
-</pre>
-Examples:
-<tt>Int#, Float#, Addr#, State#</tt>.
-<p>
-<dt><strong>Wired-in types.</strong>
-<dd>Wired-in types can be defined in Haskell, and indeed are (many are defined in <tt>GHC.Base</tt>).
-However, it's very convenient for GHC to be able to use the type constructor for (say) <tt>Int</tt>
-without looking it up in any environment.  So module <tt>TysWiredIn</tt> contains many definitions
-like this one:
-<pre>
-  intTyCon :: TyCon
-  intTyCon = ....
-
-  intDataCon :: DataCon
-  intDataCon = ....
-</pre>
-However, since a <tt>TyCon</tt> value contains the entire type definition inside it, it follows
-that the complete definition of <tt>Int</tt> is thereby baked into the compiler.
-<p>
-Nevertheless, the library module <tt>GHC.Base</tt> still contains a definition for <tt>Int</tt>
-just so that its info table etc. get generated somewhere.  Chaos will result if the wired-in definition
-in <tt>TysWiredIn</tt> differs from that in <tt>GHC.Base</tt>.
-<p>
-The rule is that only very simple types should be wired in (for example, <tt>Ratio</tt> is not,
-and <tt>IO</tt> is certainly not).  No class is wired in: classes are just too complicated.
-<p>
-Examples: <tt>Int</tt>, <tt>Float</tt>, <tt>List</tt>, tuples.
-
-<p>
-<dt><strong>Known-key things.</strong>
-<dd>GHC knows of the existence of many, many other types, classes and values.  <em>But all it knows is
-their <tt>Name</tt>.</em>  Remember, a <tt>Name</tt> includes a unique key that identifies the
-thing, plus its defining module and occurrence name
-(see <a href="names.html">The truth about Names</a>).  Knowing a <tt>Name</tt>, therefore, GHC can
-run off to the interface file for the module and find out everything else it might need.
-<p>
-Most of these known-key names are defined in module <tt>PrelNames</tt>; a further swathe concerning
-Template Haskell is defined in <tt>DsMeta</tt>.  The allocation of unique keys is done manually;
-chaotic things happen if you make a mistake here, which is why they are all together.
-</dl>
-
-All the <tt>Name</tt>s from all the above categories are used to initialise the global name cache,
-which maps (module,occurrence-name) pairs to the globally-unique <tt>Name</tt> for that
-thing.  (See <tt>HscMain.initOrigNames</tt>.)
-
-<p>
-The next sections elaborate these three classes a bit.
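The three classes above can be summarised in a toy datatype.  This is purely illustrative (the constructor and field names are invented, and GHC's real <tt>TyCon</tt> and <tt>Name</tt> types are of course far richer); the point it captures is that only known-key things send the compiler off to an interface file.

```haskell
-- A toy model of the three degrees of baked-in-ness (invented names).
data BakedIn
  = Primitive String            -- e.g. "Int#": defined only inside the compiler
  | WiredIn   String String     -- e.g. "Int" from "GHC.Base": full definition
                                --   baked in, with a matching library definition
  | KnownKey  String String Int -- e.g. "Float" from a library module, with its
                                --   unique key: only the Name is baked in
  deriving Show

-- Only for known-key things must GHC consult an interface file to
-- learn anything beyond the Name itself.
needsInterfaceFile :: BakedIn -> Bool
needsInterfaceFile (KnownKey _ _ _) = True
needsInterfaceFile _                = False
```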
-
-
-
-    <h2>Primitives (module <tt>TysPrim</tt>)</h2>
-    <p>
-      Some types and functions have to be hardwired into the compiler as they
-      are atomic; all other code is essentially built around this primitive
-      functionality.  This includes basic arithmetic types, such as integers,
-      and their elementary operations as well as pointer types.  Primitive
-      types and functions often receive special treatment in the code
-      generator, which means that these entities have to be explicitly
-      represented in the compiler.  Moreover, many of these types receive some
-      explicit treatment in the runtime system, and so, there is some further
-      information about <a href="../rts-libs/primitives.html">primitives in
-      the RTS section</a> of this document.
-    <p>
-      The module <a
-      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/prelude/TysPrim.lhs"><code>TysPrim</code></a>
-      exports a list of all primitive type constructors as <code>primTyCons ::
-      [TyCon]</code>.  All of these type constructors (of type
-      <code>TyCon</code>) are also exported as <code>intPrimTyCon</code>,
-      <code>stablePtrPrimTyCon</code>, and so on.  In addition, for each
-      nullary type constructor the corresponding type (of type
-      <code>Type</code>) is also exported; for example, we have
-      <code>intPrimTy :: Type</code>.  For all other type constructors, a
-      function is exported that constructs the type obtained by applying the
-      type constructors to an argument type (of type <code>Type</code>); for
-      example, we have <code>mkStablePtrPrimTy :: Type -> Type</code>.
-    <p>
-      As it is inconvenient to identify types that receive special treatment
-      by the code generator by looking at their names, the module <a
-      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/prelude/PrimRep.lhs"><code>PrimRep</code></a>
-      exports a data type <code>PrimRep</code>, which lists all
-      machine-manipulable implementation types.
The module also exports a set
-      of query functions on <code>PrimRep</code> that define properties, such
-      as a type's byte size or whether a primitive type is a pointer type.
-      Moreover, the function <code>TysPrim.primRepTyCon :: PrimRep ->
-      TyCon</code> converts <code>PrimRep</code> values into the corresponding
-      type constructor.
-
-    <h2>Wired in types (module <tt>TysWiredIn</tt>)</h2>
-    <p>
-      In addition to entities that are primitive, as the compiler has to treat
-      them specially in the backend, there is a set of types, functions,
-      etc. that the Haskell language definition flags as essential to the
-      language by placing them into the special module <code>Prelude</code>
-      that is implicitly imported into each Haskell module.  For some of these
-      entities it suffices to define them (by standard Haskell definitions) in
-      a <code>Prelude</code> module and to ensure that this module is treated
-      specially by always being imported.
-    <p>
-      However, there is a set of entities (such as, for example, the list type
-      and the corresponding data constructors) that have an in-between status:
-      They are not truly primitive (lists, for example, can easily be defined
-      by a <code>data</code> declaration), but the compiler has to have extra
-      knowledge about them, as they are associated with some particular
-      features of the language (in the case of lists, there is special syntax,
-      such as list comprehensions, associated with the type).  Another
-      example of a special kind of entity is the type classes that can be used
-      in a <code>deriving</code> clause.  All types that are not primitive,
-      but about which the compiler nonetheless has to have some extra
-      knowledge, are defined in the module <a
-      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/prelude/TysWiredIn.lhs"><code>TysWiredIn</code></a>.
-    <p>
-      All wired in type constructors are contained in <code>wiredInTyCons ::
-      [TyCon]</code>.
In addition to that list, <code>TysWiredIn</code>
-      exports variables bound to representations of all listed type
-      constructors and their data constructors.  So, for example, we have
-      <code>listTyCon</code> together with <code>nilDataCon</code> and
-      <code>consDataCon</code>.  There are also convenience functions, such
-      as <code>mkListTy</code> and <code>mkTupleTy</code>, which construct
-      compound types.
-    <p>
-
-    <h2>Known-key names (module <tt>PrelNames</tt>)</h2>
-
-    All names of types, functions, etc. known to the compiler are defined in
-    <a
-    href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/prelude/PrelNames.lhs"><code>PrelNames</code></a>.
-    This includes the names of types and functions exported from
-    <code>TysWiredIn</code>, but also others.  In particular, this module
-    also fixes the names of all prelude modules; i.e., of the modules whose
-    name starts with <code>Prel</code>, which GHC's library uses to bring
-    some structure into the quite large number of <code>Prelude</code>
-    definitions.
-    <p>
-    <code>PrelNames.knownKeyNames :: [Name]</code> contains all names known
-    to the compiler, but the elements of the list are also exported
-    individually as variables, such as <code>floatTyConName</code> (having
-    the lexeme <code>Float</code>) and <code>floatDataConName</code> (having
-    the lexeme <code>F#</code>).  For each of these names,
-    <code>PrelNames</code> defines a unique key with a definition, such as
-    <p>
-<blockquote><pre>
-floatPrimTyConKey = mkPreludeTyConUnique 11</pre>
-</blockquote>
-    <p>
-    that is, all unique keys for known prelude names are hardcoded into
-    <code>PrelNames</code> (and uniqueness has to be manually ensured in
-    that module).  To simplify matching the types of important groups of
-    type constructors, <code>PrelNames</code> also exports lists, such as
-    <code>numericTyKeys</code> (keys of all numeric types), that contain the
-    unique keys of all names in that group.
In addition, derivable type
-    classes and their structure are defined by
-    <code>derivableClassKeys</code> and related definitions.
-    <p>
-    In addition to names that have unique keys, <code>PrelNames</code> also
-    defines a set of names without uniqueness information.  These names end
-    in the suffix <code>_RDR</code> and are of type <code>RdrName</code> (an
-    example is <code>times_RDR</code>, which represents the lexeme
-    <code>*</code>).  The names are used in locations where they pass
-    through the renamer anyway (e.g., special constructors encountered by
-    the parser, such as [], and code generated from deriving clauses), which
-    will take care of adding uniqueness information.
-    <p>
-
-<h2>Gathering it all together (module <tt>PrelInfo</tt>)</h2>
-    The module
-    <a href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/prelude/PrelInfo.lhs"><code>PrelInfo</code></a>
-    in some sense ties all the above together and provides a reasonably
-    restricted interface to these definitions to the rest of the compiler.
-    However, from what I have seen, this doesn't quite work out and the
-    earlier-mentioned modules are directly imported in many places.
-
-    <p><small>
-<!-- hhmts start -->
-Last modified: Tue Dec 11 17:54:07 EST 2001
-<!-- hhmts end -->
-    </small>
-  </body>
-</html>
diff --git a/docs/comm/the-beast/renamer.html b/docs/comm/the-beast/renamer.html
deleted file mode 100644
index 878e82b370..0000000000
--- a/docs/comm/the-beast/renamer.html
+++ /dev/null
@@ -1,249 +0,0 @@
-<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
-<html>
-  <head>
-    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
-    <title>The GHC Commentary - The Glorious Renamer</title>
-  </head>
-
-  <body BGCOLOR="FFFFFF">
-    <h1>The GHC Commentary - The Glorious Renamer</h1>
-    <p>
-      The <em>renamer</em> sits between the parser and the typechecker.
-      However, its operation is quite tightly interwoven with the
-      typechecker.
This is partially due to support for Template Haskell,
-      where spliced code has to be renamed and type checked.  In particular,
-      top-level splices lead to multiple rounds of renaming and type
-      checking.
-    </p>
-    <p>
-      The main externally used functions of the renamer are provided by the
-      module <code>rename/RnSource.lhs</code>.  In particular, we have
-    </p>
-    <blockquote>
-      <pre>
-rnSrcDecls  :: HsGroup RdrName -> RnM (TcGblEnv, HsGroup Name)
-rnTyClDecls :: [LTyClDecl RdrName] -> RnM [LTyClDecl Name]
-rnSplice    :: HsSplice RdrName -> RnM (HsSplice Name, FreeVars)</pre>
-    </blockquote>
-    <p>
-      All of these execute in the renamer monad <code>RnM</code>.  The first
-      function, <code>rnSrcDecls</code>, renames a binding group; the second,
-      <code>rnTyClDecls</code>, renames a list of (toplevel) type and class
-      declarations; and the third, <code>rnSplice</code>, renames a Template
-      Haskell splice.  As the types indicate, the main task of the renamer is
-      to convert all the <tt>RdrNames</tt> to <a
-      href="names.html"><tt>Names</tt></a>, which includes a number of
-      well-formedness checks (no duplicate declarations, all names are in
-      scope, and so on).  In addition, the renamer performs other, not
-      strictly name-related, well-formedness checks, which include checking
-      that the appropriate flags have been supplied whenever language
-      extensions are used in the source.
-    </p>
-
-    <h2>RdrNames</h2>
-    <p>
-      A <tt>RdrName.RdrName</tt> is pretty much just a string (for an
-      unqualified name like "<tt>f</tt>") or a pair of strings (for a
-      qualified name like "<tt>M.f</tt>"):
-    </p>
-    <blockquote>
-      <pre>
-data RdrName
-  = Unqual OccName
-	-- Used for ordinary, unqualified occurrences
-
-  | Qual Module OccName
-	-- A qualified name written by the user in
-	--  *source* code.
The module isn't necessarily
-	-- the module where the thing is defined;
-	-- just the one from which it is imported
-
-  | Orig Module OccName
-	-- An original name; the module is the *defining* module.
-	-- This is used when GHC generates code that will be fed
-	-- into the renamer (e.g. from deriving clauses), but where
-	-- we want to say "Use Prelude.map dammit".
-
-  | Exact Name
-	-- We know exactly the Name. This is used
-	--  (a) when the parser parses built-in syntax like "[]"
-	--	and "(,)", but wants a RdrName from it
-	--  (b) when converting names to the RdrNames in IfaceTypes
-	--	Here an Exact RdrName always contains an External Name
-	--	(Internal Names are converted to simple Unquals)
-	--  (c) by Template Haskell, when TH has generated a unique name</pre>
-    </blockquote>
-    <p>
-      The OccName type is described in <a href="names.html#occname">The
-      truth about names</a>.
-    </p>
-
-    <h2>The Renamer Monad</h2>
-    <p>
-      Due to the tight integration of the renamer with the typechecker, both
-      use the same monad in recent versions of GHC.  So, we have
-    </p>
-    <blockquote>
-      <pre>
-type RnM  a = TcRn a		-- Historical
-type TcM  a = TcRn a		-- Historical</pre>
-    </blockquote>
-    <p>
-      with the combined monad defined as
-    </p>
-    <blockquote>
-      <pre>
-type TcRn a       = TcRnIf TcGblEnv TcLclEnv a
-type TcRnIf a b c = IOEnv (Env a b) c
-
-data Env gbl lcl	-- Changes as we move into an expression
-  = Env {
-	env_top	 :: HscEnv,	-- Top-level stuff that never changes
-				--   Includes all info about imported things
-
-	env_us   :: TcRef UniqSupply,	-- Unique supply for local variables
-
-	env_gbl  :: gbl,	-- Info about things defined at the top level
-				--   of the module being compiled
-
-	env_lcl  :: lcl 	-- Nested stuff; changes as we go into
-				-- an expression
-    }</pre>
-    </blockquote>
-    <p>
-      The details of the global environment type <code>TcGblEnv</code> and
-      local environment type <code>TcLclEnv</code> are also defined in the
-      module <code>typecheck/TcRnTypes.lhs</code>.
The monad
-      <code>IOEnv</code> is defined in <code>utils/IOEnv.hs</code> and extends
-      the vanilla <code>IO</code> monad with an additional state parameter
-      <code>env</code> that is treated as in a reader monad.  (Side-effecting
-      operations, such as updating the unique supply, are done with
-      <code>TcRef</code>s, which are simply a synonym for <code>IORef</code>s.)
-    </p>
-
-    <h2>Name Space Management</h2>
-    <p>
-      As anticipated by the variants <code>Orig</code> and <code>Exact</code>
-      of <code>RdrName</code>, some names should not change during renaming,
-      whereas others need to be turned into unique names.  In this context,
-      the two functions <code>RnEnv.newTopSrcBinder</code> and
-      <code>RnEnv.newLocals</code> are important:
-    </p>
-    <blockquote>
-      <pre>
-newTopSrcBinder :: Module -> Maybe Name -> Located RdrName -> RnM Name
-newLocalsRn     :: [Located RdrName] -> RnM [Name]</pre>
-    </blockquote>
-    <p>
-      The two functions introduce new toplevel and new local names,
-      respectively, where the first two arguments to
-      <code>newTopSrcBinder</code> determine the currently compiled module and
-      the parent construct of the newly defined name.  Both functions create
-      new names only for <code>RdrName</code>s that are neither exact nor
-      original.
-    </p>
-
-    <h3>Introduction of Toplevel Names: Global RdrName Environment</h3>
-    <p>
-      A global <code>RdrName</code> environment
-      <code>RdrName.GlobalRdrEnv</code> is a map from <code>OccName</code>s to
-      lists of qualified names.  More precisely, the latter are
-      <code>Name</code>s with an associated <code>Provenance</code>:
-    </p>
-    <blockquote>
-      <pre>
-data Provenance
-  = LocalDef		-- Defined locally
-	Module
-
-  | Imported 		-- Imported
-	[ImportSpec]	-- INVARIANT: non-empty
-	Bool		-- True iff the thing was named *explicitly*
-			-- in *any* of the import specs rather than being
-			-- imported as part of a group;
-	-- e.g.
-	--	import B
-	--	import C( T(..
) - -- Here, everything imported by B, and the constructors of T - -- are not named explicitly; only T is named explicitly. - -- This info is used when warning of unused names.</pre> - </blockquote> - <p> - The part of the global <code>RdrName</code> environment for a module - that contains the local definitions is created by the function - <code>RnNames.importsFromLocalDecls</code>, which also computes a data - structure recording all imported declarations in the form of a value of - type <code>TcRnTypes.ImportAvails</code>. - </p> - <p> - The function <code>importsFromLocalDecls</code>, in turn, makes use of - <code>RnNames.getLocalDeclBinders :: Module -> HsGroup RdrName -> RnM - [AvailInfo]</code> to extract all declared names from a binding group, - where <code>HscTypes.AvailInfo</code> is essentially a collection of - <code>Name</code>s; i.e., <code>getLocalDeclBinders</code>, on the fly, - generates <code>Name</code>s from the <code>RdrName</code>s of all - top-level binders of the module represented by the <code>HsGroup - RdrName</code> argument. - </p> - <p> - It is important to note that all this happens before the renamer - actually descends into the toplevel bindings of a module. In other - words, before <code>TcRnDriver.rnTopSrcDecls</code> performs the - renaming of a module by way of <code>RnSource.rnSrcDecls</code>, it uses - <code>importsFromLocalDecls</code> to set up the global - <code>RdrName</code> environment, which contains <code>Name</code>s for - all imported <em>and</em> all locally defined toplevel binders. Hence, - when the helpers of <code>rnSrcDecls</code> come across the - <em>defining</em> occurrences of a toplevel <code>RdrName</code>, they - don't rename it by generating a new name, but they simply look up its - name in the global <code>RdrName</code> environment. 
-    </p>
-
-    <h2>Rebindable syntax</h2>
-    <p>
-      In Haskell when one writes "3" one gets "fromInteger 3", where
-      "fromInteger" comes from the Prelude (regardless of whether the
-      Prelude is in scope).  If you want to completely redefine numbers,
-      that becomes inconvenient.  So GHC lets you say
-      "-fno-implicit-prelude"; in that case, the "fromInteger" comes from
-      whatever is in scope.  (This is documented in the User Guide.)
-    </p>
-    <p>
-      This feature is implemented as follows (I always forget).
-      <ul>
-      <li>Names that are implicitly bound by the Prelude are marked by the
-      type <code>HsExpr.SyntaxExpr</code>.  Moreover, the association list
-      <code>HsExpr.SyntaxTable</code> is set up by the renamer to map
-      rebindable names to the value they are bound to.
-      </li>
-      <li>Currently, five constructs related to numerals
-      (<code>HsExpr.NegApp</code>, <code>HsPat.NPat</code>,
-      <code>HsPat.NPlusKPat</code>, <code>HsLit.HsIntegral</code>, and
-      <code>HsLit.HsFractional</code>) and
-      two constructs related to <code>do</code> expressions
-      (<code>HsExpr.BindStmt</code> and
-      <code>HsExpr.ExprStmt</code>) have rebindable syntax.
-      </li>
-      <li> When the parser builds these constructs, it puts in the
-      built-in Prelude Name (e.g. PrelNum.fromInteger).
-      </li>
-      <li> When the renamer encounters these constructs, it calls
-      <tt>RnEnv.lookupSyntaxName</tt>.
-      This checks for <tt>-fno-implicit-prelude</tt>; if the flag is not set,
-      it just returns the same Name; otherwise it takes the occurrence name of
-      the Name, turns it into an unqualified RdrName, and looks it up in the
-      environment.  The returned name is plugged back into the construct.
-      </li>
-      <li> The typechecker uses the Name to generate the appropriate typing
-      constraints.
-      </li>
-      </ul>
-
-    <p><small>
-<!-- hhmts start -->
-Last modified: Wed May  4 17:16:15 EST 2005
-<!-- hhmts end -->
-    </small>
-  </body>
-</html>
-
diff --git a/docs/comm/the-beast/simplifier.html b/docs/comm/the-beast/simplifier.html
deleted file mode 100644
index 4dbce7765b..0000000000
--- a/docs/comm/the-beast/simplifier.html
+++ /dev/null
@@ -1,86 +0,0 @@
-<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
-<html>
-  <head>
-    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
-    <title>The GHC Commentary - The Mighty Simplifier</title>
-  </head>
-
-  <body BGCOLOR="FFFFFF">
-    <h1>The GHC Commentary - The Mighty Simplifier</h1>
-    <p>
-      Most of the optimising program transformations applied by GHC are
-      performed on an intermediate language called <em>Core,</em> which
-      essentially is a compiler-friendly formulation of rank-2 polymorphic
-      lambda terms defined in the module <a
-      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/coreSyn/CoreSyn.lhs/"><code>CoreSyn.lhs</code>.</a>
-      The transformation engine optimising Core programs is called the
-      <em>Simplifier</em> and is composed of a couple of modules located in the
-      directory <a
-      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/simplCore/"><code>fptools/ghc/compiler/simplCore/</code>.</a>
-      The main engine of the simplifier is contained in <a
-      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/simplCore/Simplify.lhs"><code>Simplify.lhs</code></a>
-      and its driver is the routine <code>core2core</code> in <a
-      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/simplCore/SimplCore.lhs"><code>SimplCore.lhs</code>.</a>
-    <p>
-      The program that the simplifier has produced after applying its various
-      optimisations can be obtained by passing the option
-      <code>-ddump-simpl</code> to GHC.  Moreover, the various intermediate
-      stages of the optimisation process are printed when passing
-      <code>-dverbose-core2core</code>.
-
-    <h4><a name="loopBreaker">Recursive Definitions</a></h4>
-    <p>
-      The simplification process has to take special care when handling
-      recursive binding groups; otherwise, the compiler might loop.
-      Therefore, the routine <code>reOrderRec</code> in <a
-      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/simplCore/OccurAnal.lhs"><code>OccurAnal.lhs</code></a>
-      computes a set of <em>loop breakers</em> - a set of definitions that
-      together cut any possible loop in the binding group.  It marks the
-      identifiers bound by these definitions as loop breakers by enriching
-      their <a href="basicTypes.html#occInfo">occurrence information.</a>  Loop
-      breakers will <em>never</em> be inlined by the simplifier, thus
-      guaranteeing termination of the simplification procedure.  (This is not
-      entirely accurate -- see <a href="#rules">rewrite rules</a> below.)
-
-      The process of finding loop breakers works as follows: First, the
-      strongly connected components (SCC) of the graph representing all
-      function dependencies are computed.  Then, each SCC is inspected in turn.
-      If it contains only a single binding (self-recursive function), this is
-      the loop breaker.  In case of multiple recursive bindings, the function
-      attempts to select bindings where the decision not to inline them causes
-      the least harm - in the sense of inhibiting optimisations in the
-      code.  This is achieved by considering each binding in turn and awarding
-      a <em>score</em> between 0 and 4, where a lower score means that the
-      function is less useful for inlining - and thus, a better loop breaker.
-      The evaluation of bindings is performed by the function
-      <code>score</code> locally defined in <code>OccurAnal</code>.
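The SCC-then-score procedure can be sketched with <code>Data.Graph</code> from the containers package.  This is only a sketch: the binding representation and the scoring function are made up, GHC's real <code>score</code> works from occurrence information, and the real algorithm may pick several breakers per group, whereas this version selects just one (the lowest-scoring binding) per cyclic component.

```haskell
import Data.Graph (SCC (..), stronglyConnComp)
import Data.List (minimumBy)
import Data.Ord (comparing)

-- A binding is a name together with the names it calls (hypothetical
-- representation, standing in for a Core binding group).
type Binding = (String, [String])

-- Given a scoring function (lower score = better loop breaker),
-- return one loop breaker per cyclic strongly connected component.
loopBreakers :: (String -> Int) -> [Binding] -> [String]
loopBreakers score binds = concatMap breakersOf sccs
  where
    sccs = stronglyConnComp [ (b, n, calls) | b@(n, calls) <- binds ]
    breakersOf (AcyclicSCC _)       = []    -- no loop, no breaker needed
    breakersOf (CyclicSCC [(n, _)]) = [n]   -- self-recursive binding
    breakersOf (CyclicSCC grp)      =       -- pick the least useful inlinee
      [fst (minimumBy (comparing (score . fst)) grp)]
```

For example, for the mutually recursive pair <code>f</code> and <code>g</code> plus a non-recursive <code>h</code>, only one of <code>f</code>/<code>g</code> is marked, and <code>h</code> is left inlinable.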
- - Note that, because core programs represent function definitions as - <em>one</em> binding choosing between the possibly many equations in the - source program with a <code>case</code> construct, a loop breaker cannot - inline any of its possibly many alternatives (not even the non-recursive - alternatives). - - <h4><a name="rules">Rewrite Rules</a></h4> - <p> - The application of rewrite rules is controlled in the module <a - href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/simplCore/Simplify.lhs"><code>Simplify.lhs</code></a> - by the function <code>completeCall</code>. This function first checks - whether it should inline the function applied at the currently inspected - call site, then simplifies the arguments, and finally, checks whether - any rewrite rule can be applied (and also whether there is a matching - specialised version of the applied function). The actual check for rule - application is performed by the function <code><a - href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/specialise/Rules.lhs">Rules</a>.lookupRule</code>. - <p> - It should be note that the application of rewrite rules is not subject - to the loop breaker check - i.e., rules of loop breakers will be applied - regardless of whether this may cause the simplifier to diverge. 
-
-    <p>
-      It should be noted that the application of rewrite rules is not subject
-      to the loop breaker check - i.e., rules of loop breakers will be applied
-      regardless of whether this may cause the simplifier to diverge.
-
-    <p><small>
-<!-- hhmts start -->
-Last modified: Wed Aug  8 19:25:33 EST 2001
-<!-- hhmts end -->
-    </small>
-  </body>
-</html>
diff --git a/docs/comm/the-beast/stg.html b/docs/comm/the-beast/stg.html
deleted file mode 100644
index 6c9851623a..0000000000
--- a/docs/comm/the-beast/stg.html
+++ /dev/null
@@ -1,164 +0,0 @@
-<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
-<html>
-  <head>
-    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
-    <title>The GHC Commentary - You Got Control: The STG-language</title>
-  </head>
-
-  <body BGCOLOR="FFFFFF">
-    <h1>The GHC Commentary - You Got Control: The STG-language</h1>
-    <p>
-      GHC contains two completely independent backends: the byte code
-      generator and the machine code generator.  The decision over which of
-      the two is invoked is made in <a
-      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/main/HscMain.lhs"><code>HscMain</code></a><code>.hscCodeGen</code>.
-      The machine code generator itself proceeds in a number of phases: First,
-      the <a href="desugar.html">Core</a> intermediate language is translated
-      into <em>STG-language</em>; second, STG-language is transformed into a
-      GHC-internal variant of <a href="http://www.cminusminus.org/">C--</a>;
-      and thirdly, this is either emitted as concrete C--, converted to GNU C,
-      or translated to native code (by the <a href="ncg.html">native code
-      generator</a> which targets IA32, Sparc, and PowerPC [as of March '05]).
-    </p>
-    <p>
-      In the following, we will have a look at the first step of machine code
-      generation, namely the translation steps involving the STG-language.
- Details about the underlying abstract machine, the <em>Spineless Tagless
- G-machine</em>, are in <a
- href="http://research.microsoft.com/copyright/accept.asp?path=/users/simonpj/papers/spineless-tagless-gmachine.ps.gz&pub=34">Implementing
- lazy functional languages on stock hardware: the Spineless Tagless
- G-machine</a>, SL Peyton Jones, Journal of Functional Programming 2(2),
- Apr 1992, pp127-202. (Some details have changed since the publication of
- this article, but it still gives a good introduction to the main
- concepts.)
- </p>
-
- <h2>The STG Language</h2>
- <p>
- The AST of the STG-language and the generation of STG code from Core are
- both located in the <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/stgSyn/"><code>stgSyn/</code></a>
- directory, in the modules <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/stgSyn/StgSyn.lhs"><code>StgSyn</code></a>
- and <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/stgSyn/CoreToStg.lhs"><code>CoreToStg</code></a>,
- respectively.
- </p>
- <p>
- Conceptually, the STG-language is a lambda calculus (including data
- constructors and case expressions) whose syntax is restricted to make
- all control flow explicit. As such, it can be regarded as a variant of
- <em>administrative normal form (ANF).</em> (Cf. <a
- href="http://doi.acm.org/10.1145/173262.155113">The essence of compiling
- with continuations,</a> Cormac Flanagan, Amr Sabry, Bruce F. Duba, and
- Matthias Felleisen. <em>ACM SIGPLAN Conference on Programming Language
- Design and Implementation,</em> ACM Press, 1993.) Each syntactic form
- has a precise operational interpretation, in addition to the
- denotational interpretation inherited from the lambda calculus.
The - concrete representation of the STG language inside GHC also includes - auxiliary attributes, such as <em>static reference tables (SRTs),</em> - which determine the top-level bindings referenced by each let binding - and case expression. - </p> - <p> - As usual in ANF, arguments to functions etc. are restricted to atoms - (i.e., constants or variables), which implies that all sub-expressions - are explicitly named and evaluation order is explicit. Specific to the - STG language is that all let bindings correspond to closure allocation - (thunks, function closures, and data constructors) and that case - expressions encode both computation and case selection. There are two - flavours of case expressions scrutinising boxed and unboxed values, - respectively. The former perform function calls including demanding the - evaluation of thunks, whereas the latter execute primitive operations - (such as arithmetic on fixed size integers and floating-point numbers). - </p> - <p> - The representation of STG language defined in <a - href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/stgSyn/StgSyn.lhs"><code>StgSyn</code></a> - abstracts over both binders and occurrences of variables. The type names - involved in this generic definition all carry the prefix - <code>Gen</code> (such as in <code>GenStgBinding</code>). Instances of - these generic definitions, where both binders and occurrences are of type - <a - href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/basicTypes/Id.lhs"><code>Id</code></a><code>.Id</code> - are defined as type synonyms and use type names that drop the - <code>Gen</code> prefix (i.e., becoming plain <code>StgBinding</code>). - Complete programs in STG form are represented by values of type - <code>[StgBinding]</code>. 
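The syntactic restrictions described above can be summarised in a toy, hypothetical STG-like AST; the names below are invented for illustration and deliberately much simpler than GHC's actual `GenStgBinding` family:

```haskell
import Data.List (nub, (\\))

-- A toy STG-like syntax: arguments are atoms, every let allocates a
-- closure, and case is the only construct that forces evaluation.
data Atom = AVar String | ALit Int
  deriving (Eq, Show)

data Lambda = Lambda [String] Expr   -- closure: argument names and body
  deriving (Eq, Show)

data Expr
  = App String [Atom]                     -- function applied to atoms only
  | Con String [Atom]                     -- saturated constructor application
  | Let String Lambda Expr                -- non-recursive closure allocation
  | Case Expr [(String, [String], Expr)]  -- forces the scrutinee
  | Atom Atom
  deriving (Eq, Show)

-- Complete programs are a list of top-level bindings, mirroring the
-- [StgBinding] mentioned above.
type Program = [(String, Lambda)]

-- Free variables: the kind of information attached to each binding so
-- that closures can be constructed.
fv :: Expr -> [String]
fv (Atom a)                = fvAtom a
fv (App f as)              = nub (f : concatMap fvAtom as)
fv (Con _ as)              = nub (concatMap fvAtom as)
fv (Let x (Lambda ps b) e) = nub ((fv b \\ ps) ++ fv e) \\ [x]
fv (Case e alts)           = nub (fv e ++ concat [fv r \\ bs | (_, bs, r) <- alts])

fvAtom :: Atom -> [String]
fvAtom (AVar v) = [v]
fvAtom (ALit _) = []
```

The point of the sketch is the shape of the grammar: anything that is not an atom must be named by a `let`, so allocation and evaluation order are syntactically apparent.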
- </p>
-
- <h2>From Core to STG</h2>
- <p>
- Although the actual translation from the Core AST into the STG AST is
- performed by the function <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/stgSyn/CoreToStg.lhs"><code>CoreToStg</code></a><code>.coreToStg</code>
- (or <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/stgSyn/CoreToStg.lhs"><code>CoreToStg</code></a><code>.coreExprToStg</code>
- for individual expressions), the translation crucially depends on <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/coreSyn/CorePrep.lhs"><code>CorePrep</code></a><code>.corePrepPgm</code>
- (resp. <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/coreSyn/CorePrep.lhs"><code>CorePrep</code></a><code>.corePrepExpr</code>),
- which prepares Core code for code generation (for both byte code and
- machine code generation). <code>CorePrep</code> saturates primitive and
- constructor applications, turns the code into A-normal form, renames all
- identifiers into globally unique names, generates bindings for
- constructor workers, constructor wrappers, and record selectors, and
- performs some further cleanup.
- </p>
- <p>
- In other words, after Core code is prepared for code generation, it is
- structurally already in the form required by the STG language. The main
- work of the actual transformation from Core to STG, performed by <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/stgSyn/CoreToStg.lhs"><code>CoreToStg</code></a><code>.coreToStg</code>,
- is to compute the live and free variables as well as the live CAFs (constant
- applicative forms) at each let binding and case alternative. In
- subsequent phases, the live CAF information is used to compute SRTs.
- The live variable information is used to determine which stack slots
- need to be zapped (to avoid space leaks), and the free variable
- information is needed to construct closures.
Moreover, hints for
- optimised code generation are computed, such as whether a closure needs
- to be updated after it has been evaluated.
- </p>
-
- <h2>STG Passes</h2>
- <p>
- These days, little actual work is performed on programs in STG form; in
- particular, the code is not further optimised. All serious optimisation
- (except low-level optimisations, which are performed during native code
- generation) has already been done on Core. The main task of <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/stgSyn/CoreToStg.lhs"><code>CoreToStg</code></a><code>.stg2stg</code>
- is to compute SRTs from the live CAF information determined during STG
- generation. Other than that, <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/profiling/SCCfinal.lhs"><code>SCCfinal</code></a><code>.stgMassageForProfiling</code>
- is executed when compiling for profiling, and information may be dumped
- for debugging purposes.
- </p>
-
- <h2>Towards C--</h2>
- <p>
- GHC's internal form of C-- is defined in the module <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/cmm/Cmm.hs"><code>Cmm</code></a>.
- The definition is generic in that it abstracts over the type of static
- data and over the contents of basic blocks (i.e., over the concrete
- representation of constant data and instructions). These generic
- definitions have names carrying the prefix <code>Gen</code> (such as
- <code>GenCmm</code>). The same module also instantiates the generic
- form to a concrete form, where data is represented by
- <code>CmmStatic</code> and instructions are represented by
- <code>CmmStmt</code> (giving us, e.g., <code>Cmm</code> from
- <code>GenCmm</code>). The concrete form more or less follows the
- external <a href="http://www.cminusminus.org/">C--</a> language.
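The `Gen`-prefix convention can be sketched with a hypothetical miniature: the program structure is parameterised over the representation of static data and of instructions, and a concrete instantiation fixes both, just as `Cmm` instantiates `GenCmm` with `CmmStatic` and `CmmStmt`. All names below are invented for the sketch:

```haskell
-- Generic form: 'd' is the static-data representation, 'i' the
-- instruction representation.
data GenDecl d i
  = DataDecl String [d]        -- a static data section
  | ProcDecl String [Block i]  -- a procedure made of basic blocks
  deriving (Eq, Show)

data Block i = Block
  { blockLabel :: String
  , blockStmts :: [i]
  } deriving (Eq, Show)

-- Invented concrete instantiations for the sketch:
data CStatic = CInt Int | CLabelRef String
  deriving (Eq, Show)

data CStmt = Assign String Int | Jump String
  deriving (Eq, Show)

-- The concrete form, analogous to instantiating GenCmm:
type Decl = GenDecl CStatic CStmt

example :: Decl
example = ProcDecl "f_entry" [Block "c1" [Assign "Hp" 8, Jump "c2"]]
```

The benefit of the parameterisation is that passes which only care about program shape (e.g. block layout) can be written once against the generic form.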
- </p>
- <p>
- Programs in STG form are translated to <code>Cmm</code> by <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/codeGen/CodeGen.lhs"><code>CodeGen</code></a><code>.codeGen</code>.
- </p>
-
- <p><hr><small>
-<!-- hhmts start -->
-Last modified: Sat Mar 5 22:55:25 EST 2005
-<!-- hhmts end -->
- </small>
- </body>
-</html>
diff --git a/docs/comm/the-beast/syntax.html b/docs/comm/the-beast/syntax.html
deleted file mode 100644
index be5bbefa17..0000000000
--- a/docs/comm/the-beast/syntax.html
+++ /dev/null
@@ -1,99 +0,0 @@
-<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
-<html>
- <head>
- <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
- <title>The GHC Commentary - Just Syntax</title>
- </head>
-
- <body BGCOLOR="FFFFFF">
- <h1>The GHC Commentary - Just Syntax</h1>
- <p>
- The lexical and syntactic analysers for Haskell programs are located in
- <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/parser/"><code>fptools/ghc/compiler/parser/</code></a>.
- </p>
-
- <h2>The Lexer</h2>
- <p>
- The lexer is a rather tedious piece of Haskell code contained in the
- module <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/parser/Lex.lhs"><code>Lex</code></a>.
- Its complexity partially stems from covering not only Haskell 98 but
- also the whole range of GHC language extensions, plus its ability to
- analyse interface files in addition to normal Haskell source. The lexer
- defines a parser monad <code>P a</code>, where <code>a</code> is the
- type of the result expected from a successful parse. More precisely, a
- result of type
-<blockquote><pre>
-data ParseResult a = POk PState a
-                   | PFailed Message</pre>
-</blockquote>
- <p>
- is produced, where <code>Message</code> comes from <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/main/ErrUtils.lhs"><code>ErrUtils</code></a>
- (and currently is simply a synonym for <code>SDoc</code>).
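A parser monad built around a `ParseResult` of this shape can be sketched as a state-passing function; this is a hypothetical skeleton of the idea, not GHC's actual `P` (whose state is unboxed, as discussed below):

```haskell
type Message = String        -- stands in for ErrUtils.Message / SDoc

data PState = PState { psInput :: String, psLine :: Int }
  deriving (Eq, Show)

data ParseResult a = POk PState a | PFailed Message
  deriving (Eq, Show)

-- The parser monad: a function from the current state to a result.
newtype P a = P { runP :: PState -> ParseResult a }

instance Functor P where
  fmap f (P g) = P $ \s -> case g s of
    POk s' a  -> POk s' (f a)
    PFailed m -> PFailed m

instance Applicative P where
  pure a = P (\s -> POk s a)
  P mf <*> P ma = P $ \s -> case mf s of
    PFailed m -> PFailed m
    POk s' f  -> case ma s' of
      PFailed m -> PFailed m
      POk s'' a -> POk s'' (f a)

instance Monad P where
  P m >>= k = P $ \s -> case m s of
    PFailed msg -> PFailed msg
    POk s' a    -> runP (k a) s'

-- A tiny primitive action: consume one character or fail.
nextChar :: P Char
nextChar = P $ \s -> case psInput s of
  []     -> PFailed "unexpected end of input"
  c : cs -> POk s { psInput = cs } c
```

Failure short-circuits the whole computation, which matches the single-error `PFailed` alternative shown above.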
- <p>
- The record type <code>PState</code> contains information such as the
- current source location, buffer state, contexts for layout processing,
- and whether Glasgow extensions are accepted (either due to
- <code>-fglasgow-exts</code> or due to reading an interface file). Most
- of the fields of <code>PState</code> store unboxed values; in fact, even
- the flag indicating whether Glasgow extensions are enabled is
- represented by an unboxed integer instead of by a <code>Bool</code>. My
- (= chak's) guess is that this is to avoid having to perform a
- <code>case</code> on a boxed value in the inner loop of the lexer.
- <p>
- The same lexer is used by the Haskell source parser, the Haskell
- interface parser, and the package configuration parser.
-
- <h2>The Haskell Source Parser</h2>
- <p>
- The parser for Haskell source files is defined in the form of a parser
- specification for the parser generator <a
- href="http://haskell.org/happy/">Happy</a> in the file <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/parser/Parser.y"><code>Parser.y</code></a>.
- The parser exports three entry points, for parsing entire modules
- (<code>parseModule</code>), individual statements
- (<code>parseStmt</code>), and individual identifiers
- (<code>parseIdentifier</code>), respectively. The last two are needed
- for GHCi. All three require a parser state (of type
- <code>PState</code>) and are invoked from <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/main/HscMain.lhs"><code>HscMain</code></a>.
- <p>
- Parsing of Haskell is a rather involved process. The most challenging
- features are probably the treatment of layout and of expressions that
- contain infix operators. The latter may be user-defined, and so are not
- easily captured in a static syntax specification. Infix operators may
- also appear in the left hand sides of value definitions, and so GHC's
- parser treats those in the same way as expressions.
In other words, as
- general expressions are a syntactic superset of patterns - ok, they
- <em>nearly</em> are - the parser simply attempts to parse a general
- expression in such positions. Afterwards, the generated parse tree is
- inspected to ensure that the accepted phrase indeed forms a legal
- pattern. This and similar checks are performed by the routines from <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/parser/ParseUtil.lhs"><code>ParseUtil</code></a>. In
- some cases, these routines not only check for well-formedness but also
- transform the parse tree so that it fits into the syntactic context in
- which it has been parsed; in fact, this happens
- for patterns, which are transformed from a representation of type
- <code>RdrNameHsExpr</code> into a representation of type
- <code>RdrNamePat</code>.
-
- <h2>The Haskell Interface Parser</h2>
- <p>
- The parser for interface files is also generated by Happy, from <a href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/rename/ParseIface.y"><code>ParseIface.y</code></a>.
- Its main routine <code>parseIface</code> is invoked from <a href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/rename/RnHiFiles.lhs"><code>RnHiFiles</code></a><code>.readIface</code>.
-
- <h2>The Package Configuration Parser</h2>
- <p>
- The parser for configuration files is by far the smallest of the three
- and is defined in <a href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/main/ParsePkgConf.y"><code>ParsePkgConf.y</code></a>.
- It exports <code>loadPackageConfig</code>, which is used by <a href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/main/DriverState.hs"><code>DriverState</code></a><code>.readPackageConf</code>.
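The parse-as-expression-then-validate strategy used by `ParseUtil` can be sketched with a hypothetical miniature; the `HsExp`/`Pat` types below are invented stand-ins for `RdrNameHsExpr` and `RdrNamePat`:

```haskell
-- Ambiguous phrases are parsed as general expressions first; a
-- post-pass then converts the tree into a pattern or rejects it.
data HsExp
  = EVar String
  | ELit Int
  | ECon String [HsExp]    -- constructor application
  | EApp HsExp HsExp       -- general application
  deriving (Eq, Show)

data Pat
  = PVar String
  | PLit Int
  | PCon String [Pat]
  deriving (Eq, Show)

-- Accepts exactly the expression subset that forms a legal pattern;
-- an application of a non-constructor is rejected after the fact.
checkPattern :: HsExp -> Either String Pat
checkPattern (EVar v)    = Right (PVar v)
checkPattern (ELit n)    = Right (PLit n)
checkPattern (ECon c es) = PCon c <$> traverse checkPattern es
checkPattern e@EApp{}    = Left ("not a valid pattern: " ++ show e)
```

This mirrors the design choice described above: the grammar stays static and unambiguous, and the context-sensitive distinction between expressions and patterns is deferred to a semantic check.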
-
- <p><small>
-<!-- hhmts start -->
-Last modified: Wed Jan 16 00:30:14 EST 2002
-<!-- hhmts end -->
- </small>
- </body>
-</html>
diff --git a/docs/comm/the-beast/typecheck.html b/docs/comm/the-beast/typecheck.html
deleted file mode 100644
index 482a447628..0000000000
--- a/docs/comm/the-beast/typecheck.html
+++ /dev/null
@@ -1,316 +0,0 @@
-<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
-<html>
- <head>
- <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
- <title>The GHC Commentary - Checking Types</title>
- </head>
-
- <body BGCOLOR="FFFFFF">
- <h1>The GHC Commentary - Checking Types</h1>
- <p>
- Probably the most important phase in the frontend is the type checker,
- which is located at <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/typecheck/"><code>fptools/ghc/compiler/typecheck/</code>.</a>
- GHC type checks programs in their original Haskell form before the
- desugarer converts them into Core code. This complicates the type
- checker as it has to handle the much more verbose Haskell AST, but it
- improves error messages, as those messages are based on the same
- structure that the user sees.
- </p>
- <p>
- GHC defines the abstract syntax of Haskell programs in <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/hsSyn/HsSyn.lhs"><code>HsSyn</code></a>
- using a structure that abstracts over the concrete representation of
- bound occurrences of identifiers and patterns. The module <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/typecheck/TcHsSyn.lhs"><code>TcHsSyn</code></a>
- defines a number of helper functions required by the type checker.
Note
- that the type <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/typecheck/TcRnTypes.lhs"><code>TcRnTypes</code></a>.<code>TcId</code>,
- used to represent identifiers in some signatures during type checking,
- is, in fact, nothing but a synonym for a <a href="vars.html">plain
- <code>Id</code>.</a>
- </p>
- <p>
- It is also noteworthy that the representation of types changes during
- type checking from <code>HsType</code> to <code>TypeRep.Type</code>.
- The latter is a <a href="types.html">hybrid type representation</a> that
- is used to type Core, but still contains sufficient information to
- recover source types. In particular, the type checker maintains and
- compares types in their <code>Type</code> form.
- </p>
-
- <h2>The Overall Flow of Things</h2>
-
- <h4>Entry Points Into the Type Checker</h4>
- <p>
- The interface of the type checker (and <a
- href="renamer.html">renamer</a>) to the rest of the compiler is provided
- by <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/typecheck/TcRnDriver.lhs"><code>TcRnDriver</code></a>.
- Entire modules are processed by calling <code>tcRnModule</code>, and GHCi
- uses <code>tcRnStmt</code>, <code>tcRnExpr</code>, and
- <code>tcRnType</code> to typecheck statements and expressions, and to
- kind check types, respectively. Moreover, <code>tcRnExtCore</code> is
- provided to typecheck external Core code. In addition,
- <code>tcTopSrcDecls</code> is used by Template Haskell - more
- specifically by <code>TcSplice.tc_bracket</code> -
- to type check the contents of declaration brackets.
- </p>
-
- <h4>Renaming and Type Checking a Module</h4>
- <p>
- The function <code>tcRnModule</code> controls the complete static
- analysis of a Haskell module. It sets up the combined renamer and type
- checker monad, resolves all import statements, initiates the actual
- renaming and type checking process, and finally, wraps up by processing
- the export list.
- </p>
- <p>
- The actual type checking and renaming process is initiated via
- <code>TcRnDriver.tcRnSrcDecls</code>, which uses a helper called
- <code>tc_rn_src_decls</code> to implement the iterative renaming and
- type checking process required by <a href="../exts/th.html">Template
- Haskell</a>. However, before it invokes <code>tc_rn_src_decls</code>,
- it takes care of hi-boot files; afterwards, it simplifies type
- constraints and performs zonking (see below regarding the latter).
- </p>
- <p>
- The function <code>tc_rn_src_decls</code> partitions static analysis of
- a whole module into multiple rounds, where the initial round is followed
- by an additional one for each toplevel splice. It collects all
- declarations up to the next splice into an <code>HsDecl.HsGroup</code>
- and renames and type checks that <em>declaration group</em> by calling
- <code>TcRnDriver.tcRnGroup</code>. Afterwards, it executes the
- splice (if any are left) and proceeds to the next group, which
- includes the declarations produced by the splice.
- </p>
- <p>
- The function <code>tcRnGroup</code>, finally, invokes the
- actual renaming and type checking via
- <code>TcRnDriver.rnTopSrcDecls</code> and
- <code>TcRnDriver.tcTopSrcDecls</code>, respectively. The renamer, apart
- from renaming, computes the global type checking environment, of type
- <code>TcRnTypes.TcGblEnv</code>, which is stored in the type checking
- monad before type checking commences.
- </p>
-
- <h2>Type Checking a Declaration Group</h2>
- <p>
- The type checking of a declaration group, performed by
- <code>tcTopSrcDecls</code>, starts by processing the type and class
- declarations of the current module, using the function
- <code>TcTyClsDecls.tcTyAndClassDecls</code>. This is followed by a
- first round over instance declarations using
- <code>TcInstDcls.tcInstDecls1</code>, which in particular generates all
- additional bindings due to the deriving process.
Then come foreign
- import declarations (<code>TcForeign.tcForeignImports</code>) and
- default declarations (<code>TcDefaults.tcDefaults</code>).
- </p>
- <p>
- Now, finally, toplevel value declarations (including derived ones) are
- type checked using <code>TcBinds.tcTopBinds</code>. Afterwards,
- <code>TcInstDcls.tcInstDecls2</code> traverses instances for the second
- time. Type checking concludes with processing foreign exports
- (<code>TcForeign.tcForeignExports</code>) and rewrite rules
- (<code>TcRules.tcRules</code>). Finally, the global environment is
- extended with the new bindings.
- </p>
-
- <h2>Type checking Type and Class Declarations</h2>
- <p>
- Type and class declarations are type checked in a couple of phases that
- contain recursive dependencies - aka <em>knots.</em> The first knot
- encompasses almost the whole type checking of these declarations and
- forms the main piece of <code>TcTyClsDecls.tcTyAndClassDecls</code>.
- </p>
- <p>
- Inside this big knot, the first main operation is kind checking, which
- again involves a knot. It is implemented by <code>kcTyClDecls</code>,
- which performs kind checking of potentially recursively-dependent type
- and class declarations using kind variables for initially unknown kinds.
- While processing the individual declarations, some of these variables
- are instantiated depending on the context; the rest are defaulted to
- kind <code>*</code> (during <em>zonking</em> of the kind signatures).
- Type synonyms are treated specially in this process, because they can
- have an unboxed type, but they cannot be recursive. Hence, their kinds
- are inferred in dependency order. Moreover, in contrast to class
- declarations and other type declarations, synonyms are not entered into
- the global environment as a global <code>TyThing</code>.
- (<code>TypeRep.TyThing</code> is a sum type that combines the various
- flavours of type-level entities, so that they can be stored in type
- environments and similar structures.)
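The defaulting of leftover kind variables can be illustrated with a small, pure sketch. GHC's kind variables are genuinely mutable cells (see the zonking discussion below); here the instantiations live in an explicit, assumed-acyclic substitution instead, to keep the example self-contained:

```haskell
-- Kinds with unification variables, identified by an Int for the sketch.
data Kind = Star | KFun Kind Kind | KVar Int
  deriving (Eq, Show)

type KindSubst = [(Int, Kind)]

-- Chase instantiated kind variables; any variable still unbound when
-- the declaration group has been processed is defaulted to kind *.
zonkKind :: KindSubst -> Kind -> Kind
zonkKind _ Star       = Star
zonkKind s (KFun a b) = KFun (zonkKind s a) (zonkKind s b)
zonkKind s (KVar n)   =
  case lookup n s of
    Just k  -> zonkKind s k
    Nothing -> Star      -- the default kind for leftover variables
```

So a type constructor whose argument kind was constrained to `* -> *` keeps that kind, while an unconstrained one comes out as plain `*`.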
- </p>
-
- <h2>More Details</h2>
-
- <h4>Type Variables and Zonking</h4>
- <p>
- During type checking, type variables are represented by mutable variables
- - cf. the <a href="vars.html#TyVar">variable story.</a> Consequently,
- unification can instantiate type variables by updating those mutable
- variables. This process of instantiation is (for reasons that elude me)
- called <a
- href="http://www.dictionary.com/cgi-bin/dict.pl?term=zonk&db=*">zonking</a>
- in GHC's sources. The zonking routines for the various forms of Haskell
- constructs are responsible for most of the code in the module <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/typecheck/TcHsSyn.lhs"><code>TcHsSyn</code>,</a>
- whereas the routines that actually operate on mutable types are defined
- in <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/typecheck/TcMType.lhs"><code>TcMType</code></a>;
- this includes the zonking of type variables and type terms, routines to
- create mutable structures and update them, as well as routines that check
- constraints, such as the constraint that type variables in function
- signatures have not been instantiated during type checking. The actual
- type unification routine is <code>uTys</code> in the module <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/typecheck/TcUnify.lhs"><code>TcUnify</code></a>.
- </p>
- <p>
- All type variables that may be instantiated (those in signatures
- may not), but haven't been instantiated during type checking, are zonked
- to <code>()</code>, so that after type checking all mutable variables
- have been eliminated.
- </p>
-
- <h4>Type Representation</h4>
- <p>
- The representation of types is fixed in the module <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/typecheck/TcRep.lhs"><code>TcRep</code></a>
- and exported as the data type <code>Type</code>.
As explained in <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/typecheck/TcType.lhs"><code>TcType</code></a>,
- GHC supports rank-N types, but, in the type checker, maintains the
- restriction that type variables cannot be instantiated to quantified
- types (i.e., the type system is predicative). The type checker floats
- universal quantifiers outside and maintains types in prenex form.
- (However, quantifiers cannot, of course, float out of negative
- positions.) Overall, we have
- </p>
- <blockquote>
- <pre>
-sigma -> forall tyvars. phi
-phi   -> theta => rho
-rho   -> sigma -> rho
-       | tau
-tau   -> tyvar
-       | tycon tau_1 .. tau_n
-       | tau_1 tau_2
-       | tau_1 -> tau_2</pre>
- </blockquote>
- <p>
- where <code>sigma</code> is in prenex form; i.e., there is never a
- forall to the right of an arrow in a <code>phi</code> type. Moreover, a
- type of the form <code>tau</code> never contains a quantifier (which
- includes arguments to type constructors).
- </p>
- <p>
- Of particular interest are the variants <code>SourceTy</code> and
- <code>NoteTy</code> of <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/typecheck/TypeRep.lhs"><code>TypeRep</code></a>.<code>Type</code>.
- The constructor <code>SourceTy :: SourceType -> Type</code> represents a
- type constraint; that is, a predicate over types represented by a
- dictionary. The type checker treats a <code>SourceTy</code> as opaque,
- but during the translation to core it will be expanded into its concrete
- representation (i.e., a dictionary type) by the function <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/types/Type.lhs"><code>Type</code></a>.<code>sourceTypeRep</code>.
- Note that newtypes are not covered by <code>SourceType</code>s anymore,
- even if some comments in GHC still suggest this.
Instead, all newtype
- applications are initially represented as a <code>NewTcApp</code>, until
- they are eliminated by calls to <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/types/Type.lhs"><code>Type</code></a>.<code>newTypeRep</code>.
- </p>
- <p>
- The <code>NoteTy</code> constructor is used to add non-essential
- information to a type term. Such information has the type
- <code>TypeRep.TyNote</code> and is either the set of free type variables
- of the annotated expression or the unexpanded version of a type synonym.
- Free variable sets are cached as notes to save the overhead of
- repeatedly computing the same set for a given term. Unexpanded type
- synonyms are useful for generating comprehensible error messages, but
- have no influence on the process of type checking.
- </p>
-
- <h4>Type Checking Environment</h4>
- <p>
- During type checking, GHC maintains a <em>type environment</em> whose
- type definitions are fixed in the module <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/typecheck/TcRnTypes.lhs"><code>TcRnTypes</code></a>, with the operations defined in
-<a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/typecheck/TcEnv.lhs"><code>TcEnv</code></a>.
- Among other things, the environment contains all imported and local
- instances as well as a list of <em>global</em> entities (imported and
- local types and classes together with imported identifiers) and
- <em>local</em> entities (locally defined identifiers).
This environment
- is threaded through the type checking monad, whose support functions,
- including initialisation, can be found in the module <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/typecheck/TcRnMonad.lhs"><code>TcRnMonad</code>.</a>
-
- <h4>Expressions</h4>
- <p>
- Expressions are type checked by <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/typecheck/TcExpr.lhs"><code>TcExpr</code>.</a>
- <p>
- Usage occurrences of identifiers are processed by the function
- <code>tcId</code>, whose main purpose is to <a href="#inst">instantiate
- overloaded identifiers.</a> It essentially calls
- <code>TcInst.instOverloadedFun</code> once for each universally
- quantified set of type constraints. It should be noted that overloaded
- identifiers are replaced by new names that are first defined in the LIE
- (Local Instance Environment?) and later promoted into top-level
- bindings.
-
- <h4><a name="inst">Handling of Dictionaries and Method Instances</a></h4>
- <p>
- GHC implements overloading using so-called <em>dictionaries.</em> A
- dictionary is a tuple of functions -- one function for each method in
- the class of which the dictionary implements an instance. During type
- checking, GHC replaces each type constraint of a function with one
- additional argument. At runtime, the extended function gets passed a
- matching class dictionary by way of these additional arguments.
- Whenever the function needs to call a method of such a class, it simply
- extracts it from the dictionary.
- <p>
- This sounds simple enough; however, the actual implementation is a bit
- more tricky, as it wants to keep track of all the instances at which
- overloaded functions are used in a module. This information is useful
- to optimise the code.
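The dictionary translation itself can be sketched in plain Haskell; the names below are invented, and real dictionaries are built by the compiler rather than written by hand. A class becomes a record of methods, each constraint becomes an extra argument, and a "method" in GHC's sense is the overloaded function partially applied to a concrete dictionary:

```haskell
-- Invented dictionary type for a cut-down Num-like class.
data NumDict a = NumDict
  { dAdd  :: a -> a -> a
  , dZero :: a
  }

-- source:  total :: Num a => [a] -> a
-- after translation, the constraint is an explicit dictionary argument:
total :: NumDict a -> [a] -> a
total d = foldr (dAdd d) (dZero d)

-- the dictionary for the Int instance
numDictInt :: NumDict Int
numDictInt = NumDict { dAdd = (+), dZero = 0 }

-- the "method": 'total' applied to a concrete dictionary once, so this
-- instantiation can be shared by all use sites at Int
totalInt :: [Int] -> Int
totalInt = total numDictInt
```

Binding `totalInt` once and reusing it is exactly the sharing of instantiation stubs that the note below says can interfere with RULES.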
The implementation is in the module <a
- href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/typecheck/Inst.lhs"><code>Inst.lhs</code>.</a>
- <p>
- The function <code>instOverloadedFun</code> is invoked for each
- overloaded usage occurrence of an identifier, where overloaded means that
- the type of the identifier contains a non-trivial type constraint. It
- proceeds in two steps: (1) allocation of a method instance
- (<code>newMethodWithGivenTy</code>) and (2) instantiation of functional
- dependencies. The former implies allocating a new unique identifier,
- which replaces the original (overloaded) identifier at the currently
- type-checked usage occurrence.
- <p>
- The new identifier (after being threaded through the LIE) eventually
- will be bound by a top-level binding whose rhs contains a partial
- application of the original overloaded identifier. This partial
- application applies the overloaded function to the dictionaries needed
- for the current instance. In GHC lingo, this is called a
- <em>method.</em> Before becoming a top-level binding, the method is
- first represented as a value of type <code>Inst.Inst</code>, which makes
- it easy to fold multiple instances of the same identifier at the same
- types into one global definition. (And probably other things, too,
- which I haven't investigated yet.)
-
- <p>
- <strong>Note:</strong> As of 13 January 2001 (wrt. the code in the
- CVS HEAD), the above mechanism interferes badly with RULES pragmas
- defined over overloaded functions. During instantiation, a new name is
- created for an overloaded function partially applied to the dictionaries
- needed in a usage position of that function. As the rewrite rule,
- however, mentions the original overloaded name, it won't fire anymore
- -- unless later phases remove the intermediate definition again. The
- latest CVS version of GHC has an option
- <code>-fno-method-sharing</code>, which avoids sharing instantiation
- stubs.
This is usually/often/sometimes sufficient to make the rules - fire again. - - <p><small> -<!-- hhmts start --> -Last modified: Thu May 12 22:52:46 EST 2005 -<!-- hhmts end --> - </small> - </body> -</html> diff --git a/docs/comm/the-beast/types.html b/docs/comm/the-beast/types.html deleted file mode 100644 index 383b71f054..0000000000 --- a/docs/comm/the-beast/types.html +++ /dev/null @@ -1,179 +0,0 @@ -<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN"> -<html> - <head> - <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1"> - <title>The GHC Commentary - Hybrid Types</title> - </head> - - <body BGCOLOR="FFFFFF"> - <h1>The GHC Commentary - Hybrid Types</h1> - <p> - GHC essentially supports two type systems: (1) the <em>source type - system</em> (which is a heavily extended version of the type system of - Haskell 98) and (2) the <em>Core type system,</em> which is the type system - used by the intermediate language (see also <a - href="desugar.html">Sugar Free: From Haskell To Core</a>). - </p> - <p> - During parsing and renaming, type information is represented in a form - that is very close to Haskell's concrete syntax; it is defined by - <code>HsTypes.HsType</code>. In addition, type, class, and instance - declarations are maintained in their source form as defined in the - module <code>HsDecl</code>. The situation changes during type checking, - where types are translated into a second representation, which is - defined in the module <code>types/TypeRep.lhs</code>, as type - <code>Type</code>. This second representation is peculiar in that it is - a hybrid between the source representation of types and the Core - representation of types. Using functions, such as - <code>Type.coreView</code> and <code>Type.deepCoreView</code>, a value - of type <code>Type</code> exhibits its Core representation. On the - other hand, pretty printing a <code>Type</code> with - <code>TypeRep.pprType</code> yields the type's source representation. 
- </p> - <p> - In fact, the <a href="typecheck.html">type checker</a> maintains type - environments based on <code>Type</code>, but needs to perform type - checking on source-level types. As a result, we have functions - <code>Type.tcEqType</code> and <code>Type.tcCmpType</code>, which - compare types based on their source representation, as well as the - function <code>coreEqType</code>, which compares them based on their - core representation. The latter is needed during type checking of Core - (as performed by the functions in the module - <code>coreSyn/CoreLint.lhs</code>). - </p> - - <h2>Type Synonyms</h2> - <p> - Type synonyms in Haskell are essentially a form of macro definitions on - the type level. For example, when the type checker compares two type - terms, synonyms are always compared in their expanded form. However, to - produce good error messages, we like to avoid expanding type synonyms - during pretty printing. Hence, <code>Type</code> has a variant - <code>NoteTy TyNote Type</code>, where - </p> - <blockquote> - <pre> -data TyNote - = FTVNote TyVarSet -- The free type variables of the noted expression - - | SynNote Type -- Used for type synonyms - -- The Type is always a TyConApp, and is the un-expanded form. - -- The type to which the note is attached is the expanded form.</pre> - </blockquote> - <p> - In other words, a <code>NoteTy</code> represents the expanded form of a - type synonym together with a note stating its source form. - </p> - - <h3>Creating Representation Types of Synonyms</h3> - <p> - During translation from <code>HsType</code> to <code>Type</code> the - function <code>Type.mkSynTy</code> is used to construct representations - of applications of type synonyms. It creates a <code>NoteTy</code> node - if the synonym is applied to a sufficient number of arguments; - otherwise, it builds a simple <code>TyConApp</code> and leaves it to - <code>TcMType.checkValidType</code> to pick up invalid unsaturated - synonym applications. 
While creating a <code>NoteTy</code>, - <code>mkSynTy</code> also expands the synonym by substituting the type - arguments for the parameters of the synonym definition, using - <code>Type.substTyWith</code>. - </p> - <p> - The function <code>mkSynTy</code> is used indirectly via - <code>mkGenTyConApp</code>, <code>mkAppTy</code>, and - <code>mkAppTys</code>, which construct type representations involving - type applications. The function <code>mkSynTy</code> is also used - directly during type checking interface files; this is for tedious - reasons to do with forall hoisting - see the comment at - <code>TcIface.mkIfTcApp</code>. - </p> - - <h2>Newtypes</h2> - <p> - Data types declared by <code>newtype</code> declarations constitute new - type constructors---i.e., they are not just type macros, but introduce - new type names. However, provided that a newtype is not recursive, we - still want to implement it by its representation type. GHC realises this - by providing two flavours of type equality: (1) <code>tcEqType</code> is - source-level type equality, which compares newtypes and - <code>PredType</code>s by name, and (2) <code>coreEqType</code> compares - them structurally (by using <code>deepCoreView</code> to expand the - representation before comparing). The function - <code>deepCoreView</code> (via <code>coreView</code>) invokes - <code>expandNewTcApp</code> for every type constructor application - (<code>TyConApp</code>) to determine whether we are looking at a newtype - application that needs to be expanded to its representation type. - </p> - - <h2>Predicates</h2> - <p> - The dictionary translation of type classes translates each predicate in - a type context of a type signature into an additional argument, which - carries a dictionary with the functions overloaded by the corresponding - class.
The <code>Type</code> data type has a special variant - <code>PredTy PredType</code> for predicates, where - </p> - <blockquote> - <pre> -data PredType - = ClassP Class [Type] -- Class predicate - | IParam (IPName Name) Type -- Implicit parameter</pre> - </blockquote> - <p> - These types need to be handled as source types during type checking, but - turn into their representations when inspected through - <code>coreView</code>. The representation is determined by - <code>Type.predTypeRep</code>. - </p> - - <h2>Representation of Type Constructors</h2> - <p> - Type constructor applications are represented in <code>Type</code> by - the variant <code>TyConApp :: TyCon -> [Type] -> Type</code>. The first - argument to <code>TyConApp</code>, namely <code>TyCon.TyCon</code>, - distinguishes between function type constructors (variant - <code>FunTyCon</code>) and algebraic type constructors (variant - <code>AlgTyCon</code>), which arise from data and newtype declarations. - The variant <code>AlgTyCon</code> contains all the information available - from the data/newtype declaration as well as derived information, such - as the <code>Unique</code> and argument variance information. This - includes a field <code>algTcRhs :: AlgTyConRhs</code>, where - <code>AlgTyConRhs</code> distinguishes three kinds of algebraic data - type declarations: (1) declarations that have been exported abstractly, - (2) <code>data</code> declarations, and (3) <code>newtype</code> - declarations. The last two both include their original right hand side; - in addition, the third variant also caches the "ultimate" representation - type, which is the right hand side after expanding all type synonyms and - non-recursive newtypes.
- </p> - <p> - Both data and newtype declarations refer to their data constructors - represented as <code>DataCon.DataCon</code>, which include all details - of their signature (as derived from the original declaration) as well as - information for code generation, such as their tag value. - </p> - - <h2>Representation of Classes and Instances</h2> - <p> - Class declarations turn into values of type <code>Class.Class</code>. - They represent methods as the <code>Id</code>s of the dictionary - selector functions. Similar selector functions are available for - superclass dictionaries. - </p> - <p> - Instance declarations turn into values of type - <code>InstEnv.Instance</code>, which in interface files are represented - as <code>IfaceSyn.IfaceInst</code>. Moreover, the type - <code>InstEnv.InstEnv</code>, which is a synonym for <code>UniqFM - ClsInstEnv</code>, provides a mapping of classes to their - instances---<code>ClsInstEnv</code> is essentially a list of instance - declarations. - </p> - - <p><small> -<!-- hhmts start --> -Last modified: Sun Jun 19 13:07:22 EST 2005 -<!-- hhmts end --> - </small></p> - </body> -</html> diff --git a/docs/comm/the-beast/vars.html b/docs/comm/the-beast/vars.html deleted file mode 100644 index 9bbd310c60..0000000000 --- a/docs/comm/the-beast/vars.html +++ /dev/null @@ -1,235 +0,0 @@ -<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN"> -<html> - <head> - <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1"> - <title>The GHC Commentary - The Real Story about Variables, Ids, TyVars, and the like</title> - </head> - - <body BGCOLOR="FFFFFF"> - <h1>The GHC Commentary - The Real Story about Variables, Ids, TyVars, and the like</h1> - <p> - - -<h2>Variables</h2> - -The <code>Var</code> type, defined in <code>basicTypes/Var.lhs</code>, -represents variables, both term variables and type variables: -<pre> - data Var - = Var { - varName :: Name, - realUnique :: FastInt, - varType :: Type, - varDetails :: VarDetails, - varInfo ::
IdInfo - } -</pre> -<ul> -<li> The <code>varName</code> field contains the identity of the variable: -its unique number, and its print-name. See "<a href="names.html">The truth about names</a>". - -<p><li> The <code>realUnique</code> field caches the unique number in the -<code>varName</code> field, just to make comparison of <code>Var</code>s a little faster. - -<p><li> The <code>varType</code> field gives the type of a term variable, or the kind of a -type variable. (Types and kinds are both represented by a <code>Type</code>.) - -<p><li> The <code>varDetails</code> field distinguishes term variables from type variables, -and makes some further distinctions (see below). - -<p><li> For term variables (only) the <code>varInfo</code> field contains lots of useful -information: strictness, unfolding, etc. However, this information is all optional; -you can always throw away the <code>IdInfo</code>. In contrast, you can't safely throw away -the <code>VarDetails</code> of a <code>Var</code>. -</ul> -<p> -It's often fantastically convenient to have term variables and type variables -share a single data type. For example, -<pre> - exprFreeVars :: CoreExpr -> VarSet -</pre> -If there were two types, we'd need to return two sets. Similarly, big lambdas and -little lambdas use the same constructor in Core, which is extremely convenient. -<p> -We define a couple of type synonyms: -<pre> - type Id = Var -- Term variables - type TyVar = Var -- Type variables -</pre> -just to help us document the occasions when we are expecting only term variables, -or only type variables.
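The payoff of that shared representation can be sketched with a toy version. This is a hypothetical, heavily simplified model (the `Mini`-prefixed names are invented for illustration, not GHC's): one variable type serves as both term and type variable, so a single free-variable traversal returns one set covering both.

```haskell
import qualified Data.Set as Set

-- Toy stand-in for Var: one type for both term and type variables.
data MiniVar = MiniVar { mvName :: String, mvIsTyVar :: Bool }
  deriving (Eq, Ord, Show)

type MiniId    = MiniVar  -- term variables
type MiniTyVar = MiniVar  -- type variables

-- A tiny Core-like expression; ELam covers big and little lambdas alike.
data MiniExpr
  = EVar MiniVar
  | ELam MiniVar MiniExpr
  | EApp MiniExpr MiniExpr

-- One traversal, one result set, for term and type variables together.
exprFreeVars :: MiniExpr -> Set.Set MiniVar
exprFreeVars (EVar v)   = Set.singleton v
exprFreeVars (ELam v b) = Set.delete v (exprFreeVars b)
exprFreeVars (EApp f a) = exprFreeVars f `Set.union` exprFreeVars a
```

With two distinct variable types, `exprFreeVars` would instead have to return a pair of sets, and every caller would need to thread both around.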
- - -<h2> The <code>VarDetails</code> field </h2> - -The <code>VarDetails</code> field tells what kind of variable this is: -<pre> -data VarDetails - = LocalId -- Used for locally-defined Ids (see NOTE below) - LocalIdDetails - - | GlobalId -- Used for imported Ids, dict selectors etc - GlobalIdDetails - - | TyVar - | MutTyVar (IORef (Maybe Type)) -- Used during unification; - TyVarDetails -</pre> - -<a name="TyVar"> -<h2>Type variables (<code>TyVar</code>)</h2> -</a> -<p> -The <code>TyVar</code> case is self-explanatory. The <code>MutTyVar</code> -case is used only during type checking. Then a type variable can be unified, -using an imperative update, with a type, and that is what the -<code>IORef</code> is for. The <code>TcType.TyVarDetails</code> field records -the sort of type variable we are dealing with. It is defined as -<pre> -data TyVarDetails = SigTv | ClsTv | InstTv | VanillaTv -</pre> -<code>SigTv</code> marks type variables that were introduced when -instantiating a type signature prior to matching it against the inferred type -of a definition. The variants <code>ClsTv</code> and <code>InstTv</code> mark -scoped type variables introduced by class and instance heads, respectively. -These first three sorts of type variables are skolem variables (tested by the -predicate <code>isSkolemTyVar</code>); i.e., they must <em>not</em> be -instantiated. All other type variables are marked as <code>VanillaTv</code>. -<p> -For a long time I tried to keep mutable Vars statically type-distinct -from immutable Vars, but I've finally given up. It's just too painful. -After type checking there are no MutTyVars left, but there's no static check -of that fact. - -<h2>Term variables (<code>Id</code>)</h2> - -A term variable (of type <code>Id</code>) is represented either by a -<code>LocalId</code> or a <code>GlobalId</code>: -<p> -A <code>GlobalId</code> is -<ul> -<li> Always bound at top-level. 
-<li> Always has a <code>GlobalName</code>, and hence has - a <code>Unique</code> that is globally unique across the whole - GHC invocation (a single invocation may compile multiple modules). -<li> Has <code>IdInfo</code> that is absolutely fixed, forever. -</ul> - -<p> -A <code>LocalId</code> is: -<ul> -<li> Always bound in the module being compiled: -<ul> -<li> <em>either</em> bound within an expression (lambda, case, local let(rec)) -<li> <em>or</em> defined at top level in the module being compiled. -</ul> -<li> Has IdInfo that changes as the simplifier bashes repeatedly on it. -</ul> -<p> -The key thing about <code>LocalId</code>s is that the free-variable finder -typically treats them as candidate free variables. That is, it ignores -<code>GlobalId</code>s such as imported constants, data constructors, etc. -<p> -An important invariant is this: <em>All the bindings in the module -being compiled (whether top level or not) are <code>LocalId</code>s -until the CoreTidy phase.</em> In the CoreTidy phase, all -externally-visible top-level bindings are made into GlobalIds. This -is the point when a <code>LocalId</code> becomes "frozen" and becomes -a fixed, immutable <code>GlobalId</code>. -<p> -(A binding is <em>"externally-visible"</em> if it is exported, or -mentioned in the unfolding of an externally-visible Id. An -externally-visible Id may not have an unfolding, either because it is -too big, or because it is the loop-breaker of a recursive group.) - -<h3>Global Ids and implicit Ids</h3> - -<code>GlobalId</code>s are further categorised by their <code>GlobalIdDetails</code>. -This type is defined in <code>basicTypes/IdInfo</code>, because it mentions other -structured types like <code>DataCon</code>. Unfortunately it is *used* in <code>Var.lhs</code> -so there's a <code>hi-boot</code> knot to get it there.
Anyway, here's the declaration: -<pre> -data GlobalIdDetails - = NotGlobalId -- Used as a convenient extra return value - -- from globalIdDetails - - | VanillaGlobal -- Imported from elsewhere - - | PrimOpId PrimOp -- The Id for a primitive operator - | FCallId ForeignCall -- The Id for a foreign call - - -- These next ones are all "implicit Ids" - | RecordSelId FieldLabel -- The Id for a record selector - | DataConId DataCon -- The Id for a data constructor *worker* - | DataConWrapId DataCon -- The Id for a data constructor *wrapper* - -- [the only reason we need to know is so that - -- a) we can suppress printing a definition in the interface file - -- b) when typechecking a pattern we can get from the - -- Id back to the data con] -</pre> -The <code>GlobalIdDetails</code> allows us to go from the <code>Id</code> for -a record selector, say, to its field name; or the <code>Id</code> for a primitive -operator to the <code>PrimOp</code> itself. -<p> -Certain <code>GlobalId</code>s are called <em>"implicit"</em> Ids. An implicit -Id is derived by implication from some other declaration. So a record selector is -derived from its data type declaration, for example. An implicit Id is always -a <code>GlobalId</code>. For most of the compilation, the implicit Ids are just -that: implicit. If you do -ddump-simpl you won't see their definition. (That's -why it's true to say that until CoreTidy all Ids in this compilation unit are -LocalIds.) But at CorePrep, a binding is added for each implicit Id defined in -this module, so that the code generator will generate code for the (curried) function. -<p> -Implicit Ids carry their unfolding inside them, of course, so they may well have -been inlined much earlier; but we generate the curried top-level defn just in -case it's ever needed.
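The "go from the Id back to what it came from" idea can be sketched in miniature. This is a hypothetical, simplified rendering of the `GlobalIdDetails` pattern (the `Mini`-prefixed names and `Details` constructors are invented for illustration, not GHC's actual definitions):

```haskell
-- Simplified, hypothetical stand-in for GlobalIdDetails.
data Details
  = Vanilla               -- an ordinary imported Id
  | RecordSel String      -- a record selector, carrying its field name
  | DataConWorker String  -- a data constructor worker, carrying the con's name

-- A global Id pairs a name with its details.
data MiniGlobalId = MiniGlobalId { giName :: String, giDetails :: Details }

-- Recover the field name from a record-selector Id, if it is one.
fieldNameOf :: MiniGlobalId -> Maybe String
fieldNameOf (MiniGlobalId _ (RecordSel f)) = Just f
fieldNameOf _                              = Nothing

-- Is this an "implicit" Id, i.e. one derived from some other declaration?
isImplicit :: MiniGlobalId -> Bool
isImplicit gid = case giDetails gid of
  RecordSel _     -> True
  DataConWorker _ -> True
  Vanilla         -> False
```

The point of the detail field is exactly this kind of pattern match: a consumer that only has the Id in hand can still recover the declaration it was derived from.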
- -<h3>LocalIds</h3> - -The <code>LocalIdDetails</code> gives more info about a <code>LocalId</code>: -<pre> -data LocalIdDetails - = NotExported -- Not exported - | Exported -- Exported - | SpecPragma -- Not exported, but not to be discarded either - -- It's unclean that this is so deeply built in -</pre> -From this we can tell whether the <code>LocalId</code> is exported, and that -tells us whether we can drop an unused binding as dead code. -<p> -The <code>SpecPragma</code> thing is a HACK. Suppose you write a SPECIALIZE pragma: -<pre> - foo :: Num a => a -> a - {-# SPECIALIZE foo :: Int -> Int #-} - foo = ... -</pre> -The type checker generates a dummy call to <code>foo</code> at the right types: -<pre> - $dummy = foo Int dNumInt -</pre> -The Id <code>$dummy</code> is marked <code>SpecPragma</code>. Its role is to hang -onto that call to <code>foo</code> so that the specialiser can see it, but there -are no calls to <code>$dummy</code>. -The simplifier is careful not to discard <code>SpecPragma</code> Ids, so that it -reaches the specialiser. The specialiser processes the right hand side of a <code>SpecPragma</code> Id -to find calls to overloaded functions, <em>and then discards the <code>SpecPragma</code> Id</em>. -So <code>SpecPragma</code> behaves like <code>Exported</code>, at least until the specialiser. - - -<h3> ExternalNames and InternalNames </h3> - -Notice that whether an Id is a <code>LocalId</code> or <code>GlobalId</code> is -not the same as whether the Id has an <code>ExternalName</code> or an <code>InternalName</code> -(see "<a href="names.html#sort">The truth about Names</a>"): -<ul> -<li> Every <code>GlobalId</code> has an <code>ExternalName</code>. -<li> A <code>LocalId</code> might have either kind of <code>Name</code>.
-</ul> - -<!-- hhmts start --> -Last modified: Fri Sep 12 15:17:18 BST 2003 -<!-- hhmts end --> - </small> - </body> -</html> - diff --git a/docs/users_guide/7.10.1-notes.xml b/docs/users_guide/7.10.1-notes.xml new file mode 100644 index 0000000000..6d9b9378a1 --- /dev/null +++ b/docs/users_guide/7.10.1-notes.xml @@ -0,0 +1,376 @@ +<?xml version="1.0" encoding="iso-8859-1"?> +<sect1 id="release-7-10-1"> + <title>Release notes for version 7.10.1</title> + + <para> + The significant changes to the various parts of the compiler are listed + in the following sections. There have also been numerous bug fixes and + performance improvements over the 7.8 branch. + </para> + + <sect2> + <title>Highlights</title> + + <para> + The highlights, since the 7.8 branch, are: + </para> + + <itemizedlist> + <listitem> + <para> + TODO FIXME + </para> + </listitem> + </itemizedlist> + </sect2> + + <sect2> + <title>Full details</title> + <sect3> + <title>Language</title> + <itemizedlist> + <listitem> + <para> + Added support for <link linkend="binary-literals">binary integer literals</link> + </para> + </listitem> + </itemizedlist> + </sect3> + + <sect3> + <title>Compiler</title> + <itemizedlist> + <listitem> + <para> + GHC now checks that all the language extensions required for + the inferred type signatures are explicitly enabled. This + means that if any of the type signatures inferred in your + program requires some language extension you will need to + enable it. The motivation is that adding a missing type + signature inferred by GHC should yield a program that + typechecks. Previously this was not the case. + </para> + <para> + This is a breaking change. Code that used to compile in the + past might fail with an error message requiring some + particular language extension (most likely + <option>-XTypeFamilies</option>, <option>-XGADTs</option> or + <option>-XFlexibleContexts</option>). 
+ </para> + </listitem> + </itemizedlist> + </sect3> + + <sect3> + <title>GHCi</title> + <itemizedlist> + <listitem> + <para> + TODO FIXME + </para> + </listitem> + </itemizedlist> + </sect3> + + <sect3> + <title>Template Haskell</title> + <itemizedlist> + <listitem> + <para> + TODO FIXME + </para> + </listitem> + </itemizedlist> + </sect3> + + <sect3> + <title>Runtime system</title> + <itemizedlist> + <listitem> + <para> + TODO FIXME + </para> + </listitem> + </itemizedlist> + </sect3> + + <sect3> + <title>Build system</title> + <itemizedlist> + <listitem> + <para> + TODO FIXME + </para> + </listitem> + </itemizedlist> + </sect3> + </sect2> + + <sect2> + <title>Libraries</title> + + <sect3> + <title>array</title> + <itemizedlist> + <listitem> + <para> + Version number XXXXX (was 0.5.0.0) + </para> + </listitem> + </itemizedlist> + </sect3> + + <sect3> + <title>base</title> + <itemizedlist> + <listitem> + <para> + Version number XXXXX (was 4.7.0.0) + </para> + </listitem> + </itemizedlist> + </sect3> + + <sect3> + <title>bin-package-db</title> + <itemizedlist> + <listitem> + <para> + This is an internal package, and should not be used. 
+ </para> + </listitem> + </itemizedlist> + </sect3> + + <sect3> + <title>binary</title> + <itemizedlist> + <listitem> + <para> + Version number XXXXX (was 0.7.1.0) + </para> + </listitem> + </itemizedlist> + </sect3> + + <sect3> + <title>bytestring</title> + <itemizedlist> + <listitem> + <para> + Version number XXXXX (was 0.10.4.0) + </para> + </listitem> + </itemizedlist> + </sect3> + + <sect3> + <title>Cabal</title> + <itemizedlist> + <listitem> + <para> + Version number XXXXX (was 1.18.1.3) + </para> + </listitem> + </itemizedlist> + </sect3> + + <sect3> + <title>containers</title> + <itemizedlist> + <listitem> + <para> + Version number XXXXX (was 0.5.4.0) + </para> + </listitem> + </itemizedlist> + </sect3> + + <sect3> + <title>deepseq</title> + <itemizedlist> + <listitem> + <para> + Version number XXXXX (was 1.3.0.2) + </para> + </listitem> + </itemizedlist> + </sect3> + + <sect3> + <title>directory</title> + <itemizedlist> + <listitem> + <para> + Version number XXXXX (was 1.2.0.2) + </para> + </listitem> + </itemizedlist> + </sect3> + + <sect3> + <title>filepath</title> + <itemizedlist> + <listitem> + <para> + Version number XXXXX (was 1.3.0.2) + </para> + </listitem> + </itemizedlist> + </sect3> + + <sect3> + <title>ghc-prim</title> + <itemizedlist> + <listitem> + <para> + Version number XXXXX (was 0.3.1.0) + </para> + </listitem> + </itemizedlist> + </sect3> + + <sect3> + <title>haskell98</title> + <itemizedlist> + <listitem> + <para> + Version number XXXXX (was 2.0.0.3) + </para> + </listitem> + </itemizedlist> + </sect3> + + <sect3> + <title>haskell2010</title> + <itemizedlist> + <listitem> + <para> + Version number XXXXX (was 1.1.1.1) + </para> + </listitem> + </itemizedlist> + </sect3> + + <sect3> + <title>hoopl</title> + <itemizedlist> + <listitem> + <para> + Version number XXXXX (was 3.10.0.0) + </para> + </listitem> + </itemizedlist> + </sect3> + + <sect3> + <title>hpc</title> + <itemizedlist> + <listitem> + <para> + Version number XXXXX (was 
0.6.0.1) + </para> + </listitem> + </itemizedlist> + </sect3> + + <sect3> + <title>integer-gmp</title> + <itemizedlist> + <listitem> + <para> + Version number XXXXX (was 0.5.1.0) + </para> + </listitem> + </itemizedlist> + </sect3> + + <sect3> + <title>old-locale</title> + <itemizedlist> + <listitem> + <para> + Version number XXXXX (was 1.0.0.6) + </para> + </listitem> + </itemizedlist> + </sect3> + + <sect3> + <title>old-time</title> + <itemizedlist> + <listitem> + <para> + Version number XXXXX (was 1.1.0.2) + </para> + </listitem> + </itemizedlist> + </sect3> + + <sect3> + <title>process</title> + <itemizedlist> + <listitem> + <para> + Version number XXXXX (was 1.2.0.0) + </para> + </listitem> + </itemizedlist> + </sect3> + + <sect3> + <title>template-haskell</title> + <itemizedlist> + <listitem> + <para> + Version number XXXXX (was 2.9.0.0) + </para> + </listitem> + </itemizedlist> + </sect3> + + <sect3> + <title>time</title> + <itemizedlist> + <listitem> + <para> + Version number XXXXX (was 1.4.1) + </para> + </listitem> + </itemizedlist> + </sect3> + + <sect3> + <title>unix</title> + <itemizedlist> + <listitem> + <para> + Version number XXXXX (was 2.7.0.0) + </para> + </listitem> + </itemizedlist> + </sect3> + + <sect3> + <title>Win32</title> + <itemizedlist> + <listitem> + <para> + Version number XXXXX (was 2.3.0.1) + </para> + </listitem> + </itemizedlist> + </sect3> + </sect2> + + <sect2> + <title>Known bugs</title> + <itemizedlist> + <listitem> + <para> + TODO FIXME + </para> + </listitem> + </itemizedlist> + </sect2> +</sect1> diff --git a/docs/users_guide/7.8.1-notes.xml b/docs/users_guide/7.8.1-notes.xml deleted file mode 100644 index 36b0ad52a7..0000000000 --- a/docs/users_guide/7.8.1-notes.xml +++ /dev/null @@ -1,1251 +0,0 @@ -<?xml version="1.0" encoding="iso-8859-1"?> -<sect1 id="release-7-8-1"> - <title>Release notes for version 7.8.1</title> - - <para> - The significant changes to the various parts of the compiler are listed - in the following 
sections. There have also been numerous bug fixes and - performance improvements over the 7.6 branch. - </para> - - <sect2> - <title>Highlights</title> - - <para> - The highlights, since the 7.6 branch, are: - </para> - - <itemizedlist> - <listitem> - <para> - OS X Mavericks with XCode 5 is now properly supported - by GHC. As a result of this, GHC now uses Clang to - preprocess Haskell code by default for Mavericks - builds. - </para> - - <para> - Note that normally, GHC used <literal>gcc</literal> as - the preprocessor for Haskell code (as it was the - default everywhere,) which implements - <literal>-traditional</literal> behavior. However, - Clang is not 100% compatible with GCC's - <literal>-traditional</literal> as its behavior is rather - implementation-specific and does not match any - specification. Clang is also more strict. - </para> - - <para> - As a result of this, when using Clang as the - preprocessor, some programs which previously used - <literal>-XCPP</literal> and the preprocessor will now - fail to compile. Users who wish to retain the previous - behavior are better off using cpphs as an external - preprocessor for the time being. - </para> - - <para> - In the future, we hope to fix this by adopting a - better preprocessor implementation independent of the - C compiler (perhaps cpphs itself,) and ship that - instead. - </para> - </listitem> - - <listitem> - <para> - By default, GHC has a new warning enabled, - <literal>-fwarn-typed-holes</literal>, which causes the - compiler to respond with the types of unbound - variables it encounters in the source code. (It is - reminiscent of the "holes" feature in languages such - as Agda.) - - For more information, see <xref linkend="typed-holes"/>. - </para> - </listitem> - - <listitem> - <para> - GHC can now perform simple evaluation of type-level - natural numbers, when using the - <literal>DataKinds</literal> extension.
For example, - given a type-level constraint such as <literal>(x + 3) - ~ 5</literal>, GHC is able to infer that - <literal>x</literal> is 2. Similarly, GHC can now - understand type-level identities such as <literal>x + - 0 ~ x</literal>. - </para> - - <para> - Note that the solving of these equations is only used - to resolve unification variables - it does not - generate new facts in the type checker. This is - similar to how functional dependencies work. - </para> - </listitem> - - <listitem> - <para> - It is now possible to declare a 'closed' <literal>type - family</literal> when using the - <literal>TypeFamilies</literal> extension. A closed - <literal>type family</literal> cannot have any - instances created other than the ones in its - definition. - - For more information, see <xref linkend="closed-type-families"/>. - </para> - </listitem> - - <listitem> - <para> - Use of the <literal>GeneralizedNewtypeDeriving</literal> - extension is now subject to <emphasis>role checking</emphasis>, - to ensure type safety of the derived instances. As this change - increases the type safety of GHC, it is possible that some code - that previously compiled will no longer work. - - For more information, see <xref linkend="roles"/>. - </para> - </listitem> - - <listitem> - <para> - GHC now supports overloading list literals using the new - <literal>OverloadedLists</literal> extension. - - For more information, see <xref linkend="overloaded-lists"/>. - </para> - </listitem> - - <listitem> - <para> - GHC now supports pattern synonyms, enabled by the - <literal>-XPatternSynonyms</literal> extension, - allowing you to name and abstract over patterns more - easily. - - For more information, see <xref linkend="pattern-synonyms"/>. - </para> - <para> - Note: For the GHC 7.8.1 version, this language feature - should be regarded as a preview. 
- </para> - </listitem> - - <listitem> - <para> - There has been a significant overhaul of the type - inference engine and constraint solver, meaning it - should be faster and use less memory. - </para> - </listitem> - - <listitem> - <para> - By default, GHC will now unbox all "small" strict - fields in a data type. A "small" data type is one - whose size is equal to or smaller than the native - word size of the machine. This means you no longer - have to specify <literal>UNPACK</literal> pragmas for - e.g. strict <literal>Int</literal> fields. This also - applies to floating-point values. - </para> - </listitem> - - <listitem> - <para> - GHC now has a brand-new I/O manager that scales significantly - better for larger workloads compared to the previous one. It - should scale linearly up to approximately 32 cores. - </para> - </listitem> - - <listitem> - <para> - The LLVM backend now supports 128- and 256-bit SIMD - operations. - </para> - <para> - Note carefully: this is <emphasis>only</emphasis> available with - the LLVM backend, and should be considered - experimental. - </para> - </listitem> - - <listitem> - <para> - The new code generator, after significant work by many - individuals over the past several years, is now enabled by - default. This is a complete rewrite of the STG to Cmm - transformation. In general, your programs may get slightly - faster. - </para> - - <para> - The old code generator has been removed completely. - </para> - </listitem> - - <listitem> - <para> - GHC now has substantially better support for cross - compilation. In particular, GHC now has all the - necessary patches to support cross compilation to - Apple iOS, using the LLVM backend. - </para> - </listitem> - - <listitem> - <para> - PrimOps for comparing unboxed values now return - <literal>Int#</literal> instead of <literal>Bool</literal>. - This change is backwards incompatible.
See - <ulink url="http://ghc.haskell.org/trac/ghc/wiki/NewPrimopsInGHC7.8"> - this GHC wiki page</ulink> for instructions how to update your - existing code. See <ulink url="http://ghc.haskell.org/trac/ghc/wiki/PrimBool"> - here</ulink> for motivation and discussion of implementation details. - </para> - </listitem> - - <listitem> - <para> - New PrimOps for atomic memory operations. - The <literal>casMutVar#</literal> PrimOp was introduced in - GHC 7.2 (debugged in 7.4). This release also includes additional - PrimOps for compare-and-swap (<literal>casArray#</literal> and - <literal>casIntArray#</literal>) and one for fetch-and-add - (<literal>fetchAddIntArray#</literal>). - </para> - </listitem> - - <listitem> - <para> - On Linux, FreeBSD and Mac OS X, GHCi now uses the - system dynamic linker by default, instead of its built - in (static) object linker. This is more robust - cross-platform, and fixes many long-standing bugs (for - example: constructors and destructors, weak symbols, - etc work correctly, and several edge cases in the RTS - are fixed.) - </para> - - <para> - As a result of this, GHCi (and Template Haskell) must - now load <emphasis>dynamic</emphasis> object files, not static - ones. To assist this, there is a new compilation flag, - <literal>-dynamic-too</literal>, which when used - during compilation causes GHC to emit both static and - dynamic object files at the same time. GHC itself - still defaults to static linking. - </para> - - <para> - Note that Cabal will correctly handle - <literal>-dynamic-too</literal> for you automatically, - especially when <literal>-XTemplateHaskell</literal> - is needed - but you <emphasis>must</emphasis> tell Cabal you are - using the <literal>TemplateHaskell</literal> - extension. - </para> - - <para> - Note that you must be using Cabal and Cabal-install - 1.18 for it to correctly build dynamic shared libraries - for you. 
- </para> - - <para> - Currently, Dynamic GHCi and - <literal>-dynamic-too</literal> are not supported on - Windows (32bit or 64bit.) - </para> - </listitem> - - <listitem> - <para> - <literal>Typeable</literal> is now poly-kinded, making - <literal>Typeable1</literal>, <literal>Typeable2</literal>, - etc., obsolete, deprecated, and relegated to - <literal>Data.OldTypeable</literal>. Furthermore, user-written - instances of <literal>Typeable</literal> are now disallowed: - use <literal>deriving</literal> or the new extension - <literal>-XAutoDeriveTypeable</literal>, which will create - <literal>Typeable</literal> instances for every datatype - declared in the module. - </para> - </listitem> - - <listitem> - <para> - GHC now has a parallel compilation driver. When - compiling with <literal>--make</literal> (which is on - by default,) you may also specify - <literal>-jN</literal> in order to compile - <replaceable>N</replaceable> modules in - parallel. (Note: this will automatically scale on - multicore machines without specifying <literal>+RTS - -N</literal> to the compiler.) - </para> - </listitem> - - <listitem> - <para> - GHC now has support for a new pragma, - <literal>{-# MINIMAL #-}</literal>, allowing you to - explicitly declare the minimal complete definition of - a class. Should an instance not provide the minimal - required definitions, a warning will be emitted. - See <xref linkend="minimal-pragma"/> for details. - </para> - </listitem> - - <listitem> - <para> - In GHC 7.10, <literal>Applicative</literal> will - become a superclass of <literal>Monad</literal>, - potentially breaking a lot of user code. To ease this - transition, GHC now generates warnings when - definitions conflict with the Applicative-Monad - Proposal (AMP). 
- </para> - - <para> - A warning is emitted if a type is an instance of - <literal>Monad</literal> but not of - <literal>Applicative</literal>, - <literal>MonadPlus</literal> but not - <literal>Alternative</literal>, and when a local - function named <literal>join</literal>, - <literal>&lt;*&gt;</literal> or <literal>pure</literal> is - defined. - </para> - - <para> - The warnings are enabled by default, and can be controlled - using the new flag <literal>-f[no-]warn-amp</literal>. - </para> - </listitem> - - <listitem> - <para> - Using the new <literal>InterruptibleFFI</literal> - extension, it is now possible to declare a foreign - import as <literal>interruptible</literal>, as opposed - to only <literal>safe</literal> or - <literal>unsafe</literal>. An - <literal>interruptible</literal> foreign call is the - same as a <literal>safe</literal> call, but may be - interrupted by asynchronous <emphasis>Haskell - exceptions</emphasis>, such as those generated by - <literal>throwTo</literal> or - <literal>timeout</literal>. - </para> - - <para> - For more information (including the exact details on - how the foreign thread is interrupted,) see <xref - linkend="ffi-interruptible"/>. - </para> - </listitem> - - <listitem> - <para> - GHC's internal compiler pipeline is now exposed - through a <literal>Hooks</literal> module inside the - GHC API. These hooks allow you to control most of the - internal compiler phase machinery, including compiling - expressions, phase control, and linking. - </para> - - <para> - Note: this interface will likely see continuous - refinement and API changes in future releases, so it - should be considered a preview. - </para> - </listitem> - <listitem> - <para> - The LLVM code generator has been fixed to support - dynamic linking. This enables runtime-linking - (e.g. GHCi) support for architectures without support in - GHC's own runtime linker (e.g. ARM).
- </para>
- <para>
- Note: Tables-next-to-code is disabled when building on
- ARM with binutils' ld, due to a
- <ulink url="https://sourceware.org/bugzilla/show_bug.cgi?id=16177">
- bug</ulink> in ld.
- </para>
- </listitem>
- </itemizedlist>
- </sect2>
-
- <sect2>
- <title>Full details</title>
- <sect3>
- <title>Language</title>
- <itemizedlist>
- <listitem>
- <para>
- There is a new extension,
- <literal>NullaryTypeClasses</literal>, which
- allows you to declare a type class without any
- parameters.
- </para>
- </listitem>
- </itemizedlist>
-
- <itemizedlist>
- <listitem>
- <para>
- There is a new extension,
- <literal>NumDecimals</literal>, which allows you
- to specify an integer using compact "floating
- literal" syntax. This lets you write
- <literal>1.2e6 :: Integer</literal> instead of
- <literal>1200000</literal>.
- </para>
- </listitem>
- </itemizedlist>
-
- <itemizedlist>
- <listitem>
- <para>
- There is a new extension,
- <literal>NegativeLiterals</literal>, which will
- cause GHC to interpret the expression
- <literal>-123</literal> as <literal>fromIntegral
- (-123)</literal>. Haskell 98 and Haskell 2010 both
- specify that it should instead desugar to
- <literal>negate (fromIntegral 123)</literal>.
- </para>
- </listitem>
- </itemizedlist>
-
- <itemizedlist>
- <listitem>
- <para>
- There is a new extension,
- <literal>EmptyCase</literal>, which allows you
- to write a case expression with no alternatives,
- <literal>case ... of {}</literal>.
- </para>
- </listitem>
- </itemizedlist>
-
- <itemizedlist>
- <listitem>
- <para>
- The <literal>IncoherentInstances</literal>
- extension has seen a behavioral change, and is
- now 'liberated' and less conservative during
- instance resolution. This allows more programs to
- compile than before.
- </para>
- <para>
- Now, <literal>IncoherentInstances</literal> will
- always pick an arbitrary matching instance if
- multiple ones exist.
- </para>
- </listitem>
- </itemizedlist>
-
- <itemizedlist>
- <listitem>
- <para>
- A new built-in function <literal>coerce</literal> is
- provided that allows you to safely coerce values between
- types that have the same run-time representation, such as
- newtypes, but also newtypes inside containers. See the
- haddock documentation of
- <ulink url="&libraryBaseLocation;/Data-Coerce.html#v%3Acoerce">coerce</ulink>
- and of the class
- <ulink url="&libraryBaseLocation;/Data-Coerce.html#t%3ACoercible">Coercible</ulink>
- for more details.
- </para>
- <para>
- This feature is included in this release as a technology
- preview, and may change its syntax and/or semantics in the
- next release.
- </para>
- </listitem>
- </itemizedlist>
-
- <itemizedlist>
- <listitem>
- <para>
- The new pragma, <literal>{-# MINIMAL #-}</literal>,
- allows you to explicitly declare the minimal complete
- definition of a class. Should an instance not provide
- the minimal required definitions, a warning will be
- emitted.
- </para>
-
- <para>
- See <xref linkend="minimal-pragma"/> for more details.
- </para>
- </listitem>
- </itemizedlist>
- </sect3>
-
- <sect3>
- <title>Compiler</title>
- <itemizedlist>
- <listitem>
- <para>
- GHC can now build both static and dynamic object
- files at the same time in a single compilation
- pass, when given the
- <literal>-dynamic-too</literal> flag. This will
- produce both a statically-linkable
- <literal>.o</literal> object file, and a
- dynamically-linkable <literal>.dyn_o</literal>
- file. The output suffix of the dynamic objects can
- be controlled by the flag
- <literal>-dynosuf</literal>.
- </para>
-
- <para>
- Note that GHC still builds statically by default.
- </para>
- </listitem>
- <listitem>
- <para>
- GHC now supports a
- <literal>--show-options</literal> flag, which will
- dump all of the flags it supports to standard output.
- </para>
- </listitem>
- <listitem>
- <para>
- GHC now supports warning about overflow of integer
- literals, enabled by
- <literal>-fwarn-overflowed-literals</literal>. It
- is enabled by default.
- </para>
- </listitem>
- <listitem>
- <para>
- It's now possible to switch the system linker on Linux
- (between GNU gold and GNU ld) at runtime without problems.
- </para>
- </listitem>
- <listitem>
- <para>
- The <literal>-fwarn-dodgy-imports</literal> flag now warns
- when an <literal>import</literal> statement hides an
- entity that is not exported.
- </para>
- </listitem>
- <listitem>
- <para>
- The LLVM backend was overhauled and rewritten, and
- should hopefully be easier to maintain and work on
- in the future.
- </para>
- </listitem>
- <listitem>
- <para>
- GHC now detects annotation changes during
- recompilation, and correctly persists new
- annotations.
- </para>
- </listitem>
- <listitem>
- <para>
- There is a new set of primops for utilizing
- hardware-based prefetch instructions, to help
- guide the processor's caching decisions.
- </para>
- <para>
- Currently, the primops get translated into
- the associated hardware-supported prefetch
- instructions only with the LLVM and
- x86/amd64 backends. On all other backends,
- the prefetch primops are currently erased
- at code generation time.
- </para>
- </listitem>
- </itemizedlist>
- </sect3>
-
- <sect3>
- <title>GHCi</title>
- <itemizedlist>
- <listitem>
- <para>
- The monomorphism restriction is now turned off
- by default in GHCi.
- </para>
- </listitem>
-
- <listitem>
- <para>
- GHCi now supports a <literal>prompt2</literal>
- setting, which allows you to customize the
- continuation prompt of multi-line input.
- For more information, see <xref linkend="ghci-commands"/>.
- </para>
- </listitem>
- <listitem>
- <para>
- The new <literal>:show paths</literal> command
- shows the current working directory and the
- current search path for Haskell modules.
- </para>
- </listitem>
-
- <listitem>
- <para>
- On Linux, the static GHCi linker now supports weak symbols.
- </para>
- </listitem>
-
- <listitem>
- <para>
- The (static) GHCi linker (except on 64-bit Windows) now runs
- constructors for linked libraries. This means, for example,
- that C code using
- <literal>__attribute__((constructor))</literal>
- can now properly be loaded into GHCi.
- </para>
-
- <para>
- Note: destructors are not supported.
- </para>
- </listitem>
- </itemizedlist>
- </sect3>
-
- <sect3>
- <title>Template Haskell</title>
- <itemizedlist>
- <listitem>
- <para>
- Template Haskell now supports roles.
- </para>
- </listitem>
- <listitem>
- <para>
- Template Haskell now supports annotation pragmas.
- </para>
- </listitem>
- <listitem>
- <para>
- Typed Template Haskell expressions are now supported. See
- <xref linkend="template-haskell"/> for more details.
- </para>
- </listitem>
- <listitem>
- <para>
- Template Haskell declarations, types, patterns, and
- <emphasis>untyped</emphasis> expressions are no longer
- typechecked at all. This is a backwards-compatible change,
- since it allows strictly more programs to be typed.
- </para>
- </listitem>
- </itemizedlist>
- </sect3>
-
- <sect3>
- <title>Runtime system</title>
- <itemizedlist>
- <listitem>
- <para>
- The RTS linker can now unload object code at
- runtime (when using the GHC API
- <literal>ObjLink</literal> module). Previously,
- GHC would not unload the old object file, causing
- a gradual memory leak as more objects were loaded
- over time.
- </para>
-
- <para>
- Note that this change in unloading behavior
- <emphasis>only</emphasis> affects statically
- linked binaries, and not dynamic ones.
- </para>
- </listitem>
-
- <listitem>
- <para>
- The performance of <literal>StablePtr</literal>s and
- <literal>StableName</literal>s has been improved.
- </para>
- </listitem>
-
- <listitem>
- <para>
- The default maximum stack size has increased.
Previously, it defaulted to 8m
- (equivalent to passing <literal>+RTS
- -K8m</literal>). Now, GHC will use up to 80% of the
- <emphasis>physical memory</emphasis> available at
- runtime.
- </para>
- </listitem>
- </itemizedlist>
- </sect3>
-
- <sect3>
- <title>Build system</title>
- <itemizedlist>
- <listitem>
- <para>
- GHC >= 7.4 is now required for bootstrapping.
- </para>
- </listitem>
- <listitem>
- <para>
- GHC can now be built with Clang, and use Clang as
- the preprocessor for Haskell code. Only Clang
- version 3.4 (or Apple LLVM Clang 5.0) or later is
- reliably supported.
- </para>
-
- <para>
- Note that normally, GHC uses
- <literal>gcc</literal> as the preprocessor for
- Haskell code, which implements
- <literal>-traditional</literal> behavior. However,
- Clang is not 100% compatible with GCC's
- <literal>-traditional</literal>, as that mode's behavior
- is largely implementation-defined, and Clang is stricter.
- </para>
-
- <para>
- As a result, when using Clang as the
- preprocessor, some programs which previously used
- <literal>-XCPP</literal> and the preprocessor will
- now fail to compile. Users who wish to retain the
- previous behavior are better off using cpphs.
- </para>
- </listitem>
- </itemizedlist>
- </sect3>
- </sect2>
-
- <sect2>
- <title>Libraries</title>
-
- <sect3>
- <title>array</title>
- <itemizedlist>
- <listitem>
- <para>
- Version number 0.5.0.0 (was 0.4.0.1)
- </para>
- </listitem>
- </itemizedlist>
- </sect3>
-
- <sect3>
- <title>base</title>
- <itemizedlist>
- <listitem>
- <para>
- Version number 4.7.0.0 (was 4.6.0.1)
- </para>
- </listitem>
- <listitem>
- <para>
- The <literal>Control.Category</literal> module now has the
- <literal>PolyKinds</literal> extension enabled, meaning
- that instances of <literal>Category</literal> no longer
- need be of kind <literal>* -> * -> *</literal>.
- </para>
- </listitem>
- <listitem>
- <para>
- There are now <literal>Foldable</literal> and <literal>Traversable</literal>
- instances for <literal>Either a</literal>, <literal>Const r</literal>, and <literal>(,) a</literal>.
- </para>
- </listitem>
- <listitem>
- <para>
- There is now a <literal>Monoid</literal> instance for <literal>Const</literal>.
- </para>
- </listitem>
- <listitem>
- <para>
- There is now a <literal>Data</literal> instance for <literal>Data.Version</literal>.
- </para>
- </listitem>
- <listitem>
- <para>
- There are now <literal>Data</literal>,
- <literal>Typeable</literal>, and
- <literal>Generic</literal> instances for the types
- in <literal>Data.Monoid</literal> and
- <literal>Control.Applicative</literal>.
- </para>
- </listitem>
- <listitem>
- <para>
- There are now <literal>Num</literal> instances for <literal>Data.Monoid.Product</literal> and <literal>Data.Monoid.Sum</literal>.
- </para>
- </listitem>
- <listitem>
- <para>
- There are now <literal>Eq</literal>, <literal>Ord</literal>, <literal>Show</literal> and <literal>Read</literal> instances for <literal>ZipList</literal>.
- </para>
- </listitem>
- <listitem>
- <para>
- There are now <literal>Eq</literal>, <literal>Ord</literal>, <literal>Show</literal> and <literal>Read</literal> instances for <literal>Down</literal>.
- </para>
- </listitem>
- <listitem>
- <para>
- There are now <literal>Eq</literal>, <literal>Ord</literal>, <literal>Show</literal>, <literal>Read</literal> and <literal>Generic</literal> instances for types in GHC.Generics (<literal>U1</literal>, <literal>Par1</literal>, <literal>Rec1</literal>, <literal>K1</literal>, <literal>M1</literal>, <literal>(:+:)</literal>, <literal>(:*:)</literal>, <literal>(:.:)</literal>).
- </para>
- </listitem>
- <listitem>
- <para>
- A zero-width unboxed poly-kinded <literal>Proxy#</literal>
- was added to <literal>GHC.Prim</literal>.
It can be used to avoid
- the operational overhead of passing around proxy
- arguments to model type application.
- </para>
- </listitem>
- <listitem>
- <para>
- <literal>Control.Concurrent.MVar</literal> has a new
- implementation of <literal>readMVar</literal>, which
- fixes a long-standing bug where
- <literal>readMVar</literal> was only atomic if there
- were no other threads running
- <literal>putMVar</literal>.
- <literal>readMVar</literal> is now atomic, and is
- guaranteed to return the value from the first
- <literal>putMVar</literal>. There is also a new <literal>tryReadMVar</literal>,
- which is a non-blocking version.
- </para>
- </listitem>
- <listitem>
- <para>
- There are now byte endian-swapping primitives
- available in <literal>Data.Word</literal>, which
- use optimized machine instructions when available.
- </para>
- </listitem>
- <listitem>
- <para>
- <literal>Data.Bool</literal> now exports
- <literal>bool :: a -> a -> Bool -> a</literal>, analogously
- to <literal>maybe</literal> and <literal>either</literal>
- in their respective modules.
- </para>
- </listitem>
- <listitem>
- <para>
- Rewrote portions of <literal>Text.Printf</literal>, and
- made changes to <literal>Numeric</literal> (added
- <literal>Numeric.showFFloatAlt</literal> and
- <literal>Numeric.showGFloatAlt</literal>) and
- <literal>GHC.Float</literal> (added
- <literal>formatRealFloatAlt</literal>) to support it.
- The rewritten version is extensible to user types, adds a
- "generic" format specifier "<literal>%v</literal>",
- extends the <literal>printf</literal> spec
- to support much of C's <literal>printf(3)</literal>
- functionality, and fixes the spurious warnings about
- using <literal>Text.Printf.printf</literal> at
- <literal>(IO a)</literal> while ignoring the return value.
- These changes were contributed by Bart Massey.
- </para>
- </listitem>
- <listitem>
- <para>
- The minimal complete definitions for all
- type-classes with cyclic default implementations
- have been explicitly annotated with the new
- <literal>{-# MINIMAL #-}</literal> pragma.
- </para>
- </listitem>
- <listitem>
- <para>
- <literal>Control.Applicative.WrappedMonad</literal>,
- which can be used to convert a <literal>Monad</literal>
- to an <literal>Applicative</literal>, now has
- a <literal>Monad m => Monad (WrappedMonad m)</literal>
- instance.
- </para>
- </listitem>
- </itemizedlist>
- </sect3>
-
- <sect3>
- <title>bin-package-db</title>
- <itemizedlist>
- <listitem>
- <para>
- This is an internal package, and should not be used.
- </para>
- </listitem>
- </itemizedlist>
- </sect3>
-
- <sect3>
- <title>binary</title>
- <itemizedlist>
- <listitem>
- <para>
- Version number 0.7.1.0 (was 0.5.1.1)
- </para>
- </listitem>
- </itemizedlist>
- </sect3>
-
- <sect3>
- <title>bytestring</title>
- <itemizedlist>
- <listitem>
- <para>
- Version number 0.10.4.0 (was 0.10.0.0)
- </para>
- </listitem>
- </itemizedlist>
- </sect3>
-
- <sect3>
- <title>Cabal</title>
- <itemizedlist>
- <listitem>
- <para>
- Version number 1.18.1.3 (was 1.16.0)
- </para>
- </listitem>
- </itemizedlist>
- </sect3>
-
- <sect3>
- <title>containers</title>
- <itemizedlist>
- <listitem>
- <para>
- Version number 0.5.4.0 (was 0.5.0.0)
- </para>
- </listitem>
- </itemizedlist>
- </sect3>
-
- <sect3>
- <title>deepseq</title>
- <itemizedlist>
- <listitem>
- <para>
- Version number 1.3.0.2 (was 1.3.0.1)
- </para>
- </listitem>
- </itemizedlist>
- </sect3>
-
- <sect3>
- <title>directory</title>
- <itemizedlist>
- <listitem>
- <para>
- Version number 1.2.0.2 (was 1.2.0.1)
- </para>
- </listitem>
- <listitem>
- <para>
- The function <literal>findExecutables</literal>
- now correctly checks to see if the execute bit is
- set on Linux, rather than just looking in
- <literal>$PATH</literal>.
- </para> - </listitem> - <listitem> - <para> - There are several new functions for finding files, - including <literal>findFiles</literal> and - <literal>findFilesWith</literal>, which allow you - to search for a file given a set of filepaths, and - run a predicate over them. - </para> - </listitem> - </itemizedlist> - </sect3> - - <sect3> - <title>filepath</title> - <itemizedlist> - <listitem> - <para> - Version number 1.3.0.2 (was 1.3.0.1) - </para> - </listitem> - </itemizedlist> - </sect3> - - <sect3> - <title>ghc-prim</title> - <itemizedlist> - <listitem> - <para> - Version number 0.3.1.0 (was 0.3.0.0) - </para> - </listitem> - <listitem> - <para> - The type-classes <literal>Eq</literal> and - <literal>Ord</literal> have been annotated with - the new <literal>{-# MINIMAL #-}</literal> - pragma. - </para> - </listitem> - <listitem> - <para> - There is a new type exposed by - <literal>GHC.Types</literal>, called - <literal>SPEC</literal>, which can be used to - inform GHC to perform call-pattern specialisation - extremely aggressively. See <xref - linkend="options-optimise"/> for more details - concerning <literal>-fspec-constr</literal>. 
- </para> - </listitem> - </itemizedlist> - </sect3> - - <sect3> - <title>haskell98</title> - <itemizedlist> - <listitem> - <para> - Version number 2.0.0.3 (was 2.0.0.2) - </para> - </listitem> - </itemizedlist> - </sect3> - - <sect3> - <title>haskell2010</title> - <itemizedlist> - <listitem> - <para> - Version number 1.1.1.1 (was 1.1.1.0) - </para> - </listitem> - </itemizedlist> - </sect3> - - <sect3> - <title>hoopl</title> - <itemizedlist> - <listitem> - <para> - Version number 3.10.0.0 (was 3.9.0.0) - </para> - </listitem> - </itemizedlist> - </sect3> - - <sect3> - <title>hpc</title> - <itemizedlist> - <listitem> - <para> - Version number 0.6.0.1 (was 0.6.0.0) - </para> - </listitem> - </itemizedlist> - </sect3> - - <sect3> - <title>integer-gmp</title> - <itemizedlist> - <listitem> - <para> - Version number 0.5.1.0 (was 0.5.0.0) - </para> - </listitem> - </itemizedlist> - </sect3> - - <sect3> - <title>old-locale</title> - <itemizedlist> - <listitem> - <para> - Version number 1.0.0.6 (was 1.0.0.5) - </para> - </listitem> - </itemizedlist> - </sect3> - - <sect3> - <title>old-time</title> - <itemizedlist> - <listitem> - <para> - Version number 1.1.0.2 (was 1.1.0.1) - </para> - </listitem> - </itemizedlist> - </sect3> - - <sect3> - <title>process</title> - <itemizedlist> - <listitem> - <para> - Version number 1.2.0.0 (was 1.1.0.2) - </para> - </listitem> - <listitem> - <para> - Several bugs have been fixed, including deadlocks - in <literal>readProcess</literal> and - <literal>readProcessWithExitCode</literal>. - </para> - </listitem> - </itemizedlist> - </sect3> - - <sect3> - <title>template-haskell</title> - <itemizedlist> - <listitem> - <para> - Version number 2.9.0.0 (was 2.8.0.0) - </para> - </listitem> - <listitem> - <para> - Typed Template Haskell expressions are now - supported. See <xref linkend="template-haskell"/> - for more details. - </para> - </listitem> - <listitem> - <para> - There is now support for roles. 
- </para>
- </listitem>
- <listitem>
- <para>
- There is now support for annotation pragmas.
- </para>
- </listitem>
- </itemizedlist>
- </sect3>
-
- <sect3>
- <title>time</title>
- <itemizedlist>
- <listitem>
- <para>
- Version number 1.4.1 (was 1.4.1)
- </para>
- </listitem>
- </itemizedlist>
- </sect3>
-
- <sect3>
- <title>unix</title>
- <itemizedlist>
- <listitem>
- <para>
- Version number 2.7.0.0 (was 2.6.0.0)
- </para>
- </listitem>
- <listitem>
- <para>
- A crash in <literal>getGroupEntryForID</literal>
- (and related functions like
- <literal>getUserEntryForID</literal> and
- <literal>getUserEntryForName</literal>) in
- multi-threaded applications has been fixed.
- </para>
- </listitem>
- <listitem>
- <para>
- The functions <literal>getGroupEntryForID</literal>
- and <literal>getUserEntryForID</literal> now fail
- with an <literal>isDoesNotExist</literal> error when
- the specified ID cannot be found.
- </para>
- </listitem>
- </itemizedlist>
- </sect3>
-
- <sect3>
- <title>Win32</title>
- <itemizedlist>
- <listitem>
- <para>
- Version number 2.3.0.1 (was 2.3.0.0)
- </para>
- </listitem>
- </itemizedlist>
- </sect3>
- </sect2>
-
- <sect2>
- <title>Known bugs</title>
- <itemizedlist>
- <listitem>
- <para>
- On OS X Mavericks, when using Clang as the C
- preprocessor, Haddock has a bug that causes it to fail
- to generate documentation, with an error similar to
- the following:
-
-<programlisting>
-&lt;no location info&gt;:
-    module 'xhtml-3000.2.1:Main' is defined in multiple files: dist-bindist/build/tmp-72252/Text/XHtml.hs
-                                                               dist-bindist/build/tmp-72252/Text/XHtml/Frameset.hs
-                                                               dist-bindist/build/tmp-72252/Text/XHtml/Strict.hs
-                                                               dist-bindist/build/tmp-72252/Text/XHtml/Transitional.hs
-...
-</programlisting>
-
- </para>
- <para>
- This only affects certain packages. This is due to a
- bad interaction with Clang, which we hope to resolve
- soon.
- </para>
- <para>
- Note that when using <literal>cabal-install</literal>,
- this only affects the package documentation, not
- installation or building.
- </para>
- </listitem>
- <listitem>
- <para>
- On OS X 10.7 and beyond, with default build settings,
- the runtime system currently suffers from a fairly
- large (approx. 30%) performance regression in the
- parallel garbage collector when using
- <literal>-threaded</literal>.
- </para>
- <para>
- This is due to the fact that the OS X 10.7+ toolchain
- does not (by default) support register variables, or a
- fast <literal>__thread</literal> implementation. Note
- that this can be worked around by building GHC using
- GCC on OS X platforms, but the resulting binary
- distribution then also requires GCC.
- </para>
- </listitem>
-
- <listitem>
- <para>
- On Windows, <literal>-dynamic-too</literal> is unsupported.
- </para>
- </listitem>
-
- <listitem>
- <para>
- On Windows, we currently don't ship dynamic libraries
- or use a dynamic GHCi, unlike Linux, FreeBSD or OS X.
- </para>
- </listitem>
- </itemizedlist>
- </sect2>
-</sect1>
diff --git a/docs/users_guide/codegens.xml b/docs/users_guide/codegens.xml
index 2eb9408c6c..d2a805a3ee 100644
--- a/docs/users_guide/codegens.xml
+++ b/docs/users_guide/codegens.xml
@@ -38,7 +38,7 @@
     <para>You must install and have LLVM available on your PATH for the
     LLVM code generator to work. Specifically GHC needs to be able to call the
-    <command>opt</command>and <command>llc</command> tools. Secondly, if you
+    <command>opt</command> and <command>llc</command> tools. Secondly, if you
     are running Mac OS X with LLVM 3.0 or greater then you also need the
     <ulink url="http://clang.llvm.org">Clang c compiler</ulink> compiler
     available on your PATH.
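As a small illustration of two of the additions described in the release notes above — the new <literal>{-# MINIMAL #-}</literal> pragma and the new <literal>Data.Bool.bool</literal> combinator — here is a self-contained sketch. The <literal>Describe</literal> class and its instance are hypothetical examples invented for this sketch; they are not part of GHC or <literal>base</literal>.

```haskell
-- Sketch: a class with cyclic default methods, annotated with the new
-- {-# MINIMAL #-} pragma so GHC warns when an instance defines neither
-- method; also uses the new Data.Bool.bool (bool :: a -> a -> Bool -> a).
module Main where

import Data.Bool (bool)  -- new in base 4.7

-- Hypothetical example class, not from base.
class Describe a where
  describe     :: a -> String
  describeList :: [a] -> String
  -- Cyclic defaults: defining either method suffices.
  describe x   = describeList [x]
  describeList = unwords . map describe
  {-# MINIMAL describe | describeList #-}

instance Describe Bool where
  -- bool takes the False result first, then the True result.
  describe = bool "no" "yes"

main :: IO ()
main = putStrLn (describeList [True, False])  -- prints "yes no"
```

An instance that defined neither <literal>describe</literal> nor <literal>describeList</literal> would compile, but GHC 7.8 would emit a warning that the minimal complete definition is not satisfied.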
diff --git a/docs/users_guide/external_core.xml b/docs/users_guide/external_core.xml deleted file mode 100644 index e4354410ef..0000000000 --- a/docs/users_guide/external_core.xml +++ /dev/null @@ -1,1804 +0,0 @@ -<?xml version="1.0" encoding="utf-8"?> - -<!-- -This document is a semi-automatic conversion of docs/ext-core/core.tex to DocBook using - -1. `htlatex` to convert LaTeX to HTML -2. `pandoc` to convert HTML to DocBook -3. extensive manual work by James H. Fisher (jameshfisher@gmail.com) ---> - -<!-- -TODO: - -* Replace "java" programlisting with "ghccore" -("ghccore" is not recognized by highlighters, -causing some generators to fail). - -* Complete bibliography entries with journal titles; -I am unsure of the proper DocBook elements. - -* Integrate this file with the rest of the Users' Guide. ---> - - -<chapter id="an-external-representation-for-the-ghc-core-language-for-ghc-6.10"> - <title>An External Representation for the GHC Core Language (For GHC 6.10)</title> - - <para>Andrew Tolmach, Tim Chevalier ({apt,tjc}@cs.pdx.edu) and The GHC Team</para> - - <para>This chapter provides a precise definition for the GHC Core - language, so that it can be used to communicate between GHC and new - stand-alone compilation tools such as back-ends or - optimizers.<footnote> - <para>This is a draft document, which attempts - to describe GHC’s current behavior as precisely as possible. Working - notes scattered throughout indicate areas where further work is - needed. Constructive comments are very welcome, both on the - presentation, and on ways in which GHC could be improved in order to - simplify the Core story.</para> - - <para>Support for generating external Core (post-optimization) was - originally introduced in GHC 5.02. The definition of external Core in - this document reflects the version of external Core generated by the - HEAD (unstable) branch of GHC as of May 3, 2008 (version 6.9), using - the compiler flag <code>-fext-core</code>. 
We expect that GHC 6.10 will be - consistent with this definition.</para> - </footnote> - The definition includes a formal grammar and an informal semantics. - An executable typechecker and interpreter (in Haskell), which - formally embody the static and dynamic semantics, are available - separately.</para> - - <section id="introduction"> - <title>Introduction</title> - - <para>The Glasgow Haskell Compiler (GHC) uses an intermediate language, - called <quote>Core,</quote> as its internal program representation within the - compiler’s simplification phase. Core resembles a subset of - Haskell, but with explicit type annotations in the style of the - polymorphic lambda calculus (F<subscript>ω</subscript>).</para> - - <para>GHC’s front-end translates full Haskell 98 (plus some extensions) - into Core. The GHC optimizer then repeatedly transforms Core - programs while preserving their meaning. A <quote>Core Lint</quote> pass in GHC - typechecks Core in between transformation passes (at least when - the user enables linting by setting a compiler flag), verifying - that transformations preserve type-correctness. Finally, GHC’s - back-end translates Core into STG-machine code <citation>stg-machine</citation> and then into C - or native code.</para> - - <para>Two existing papers discuss the original rationale for the design - and use of Core <citation>ghc-inliner,comp-by-trans-scp</citation>, although the (two different) idealized - versions of Core described therein differ in significant ways from - the actual Core language in current GHC. 
In particular, with the - advent of GHC support for generalized algebraic datatypes (GADTs) - <citation>gadts</citation> Core was extended beyond its previous - F<subscript>ω</subscript>-style incarnation to support type - equality constraints and safe coercions, and is now based on a - system known as F<subscript>C</subscript> <citation>system-fc</citation>.</para> - - <para>Researchers interested in writing just <emphasis>part</emphasis> of a Haskell compiler, - such as a new back-end or a new optimizer pass, might like to use - GHC to provide the other parts of the compiler. For example, they - might like to use GHC’s front-end to parse, desugar, and - type-check source Haskell, then feeding the resulting code to - their own back-end tool. As another example, they might like to - use Core as the target language for a front-end compiler of their - own design, feeding externally synthesized Core into GHC in order - to take advantage of GHC’s optimizer, code generator, and run-time - system. Without external Core, there are two ways for compiler - writers to do this: they can link their code into the GHC - executable, which is an arduous process, or they can use the GHC - API <citation>ghc-api</citation> to do the same task more cleanly. Both ways require new - code to be written in Haskell.</para> - - <para>We present a precisely specified external format for Core files. - The external format is text-based and human-readable, to promote - interoperability and ease of use. We hope this format will make it - easier for external developers to use GHC in a modular way.</para> - - <para>It has long been true that GHC prints an ad-hoc textual - representation of Core if you set certain compiler flags. But this - representation is intended to be read by people who are debugging - the compiler, not by other programs. 
Making Core into a
- machine-readable, bi-directional communication format requires:
-
- <orderedlist>
- <listitem>
- precisely specifying the external format of Core;
- </listitem>
- <listitem>
- modifying GHC to generate external Core files
- (post-simplification; as always, users can control the exact
- transformations GHC does with command-line flags);
- </listitem>
- <listitem>
- modifying GHC to accept external Core files in place of
- Haskell source files (users will also be able to control what
- GHC does to those files with command-line flags).
- </listitem>
- </orderedlist>
- </para>
-
- <para>The first two facilities will let developers couple GHC’s
- front-end (parser, type-checker, desugarer), and optionally its
- optimizer, with new back-end tools. The last facility will let
- developers write new Core-to-Core transformations as an external
- tool and integrate them into GHC. It will also allow new
- front-ends to generate Core that can be fed into GHC’s optimizer
- or back-end.</para>
-
- <para>However, because there are many (undocumented) idiosyncrasies in
- the way GHC produces Core from source Haskell, it will be hard for
- an external tool to produce Core that can be integrated with
- GHC-produced Core (e.g., for the Prelude), and we don’t aim to
- support this.
Indeed, for the time being, we aim to support only - the first two facilities and not the third: we define and - implement Core as an external format that GHC can use to - communicate with external back-end tools, and defer the larger - task of extending GHC to support reading this external format back - in.</para> - - <para>This document addresses the first requirement, a formal Core - definition, by proposing a formal grammar for an - <link linkend="external-grammar-of-core">external representation of Core</link>, - and an <link linkend="informal-semantics">informal semantics</link>.</para> - - <para>GHC supports many type system extensions; the External Core - printer built into GHC only supports some of them. However, - External Core should be capable of representing any Haskell 98 - program, and may be able to represent programs that require - certain type system extensions as well. If a program uses - unsupported features, GHC may fail to compile it to Core when the - -fext-core flag is set, or GHC may successfully compile it to - Core, but the external tools will not be able to typecheck or - interpret it.</para> - - <para>Formal static and dynamic semantics in the form of an executable - typechecker and interpreter are available separately in the GHC - source tree - <footnote><ulink url="http://git.haskell.org/ghc.git/tree">http://git.haskell.org/ghc.git</ulink></footnote> - under <code>utils/ext-core</code>.</para> - - </section> - <section id="external-grammar-of-core"> - <title>External Grammar of Core</title> - - <para>In designing the external grammar, we have tried to strike a - balance among a number of competing goals, including easy - parseability by machines, easy readability by humans, and adequate - structural simplicity to allow straightforward presentations of - the semantics. Thus, we had to make some compromises. 
- Specifically:</para> - - <itemizedlist> - <listitem>In order to avoid explosion of parentheses, we support - standard precedences and short-cuts for expressions, types, - and kinds. Thus we had to introduce multiple non-terminals for - each of these syntactic categories, and as a result, the - concrete grammar is longer and more complex than the - underlying abstract syntax.</listitem> - - <listitem>On the other hand, we have kept the grammar simpler by - avoiding special syntax for tuple types and terms. Tuples - (both boxed and unboxed) are treated as ordinary constructors.</listitem> - - <listitem>All type abstractions and applications are given in full, even - though some of them (e.g., for tuples) could be reconstructed; - this means a parser for Core does not have to reconstruct - types.<footnote> - These choices are certainly debatable. In - particular, keeping type applications on tuples and case arms - considerably increases the size of Core files and makes them less - human-readable, though it allows a Core parser to be simpler. 
- </footnote></listitem> - - <listitem>The syntax of identifiers is heavily restricted (to just - alphanumerics and underscores); this again makes Core easier - to parse but harder to read.</listitem> - </itemizedlist> - - <para>We use the following notational conventions for syntax: - - <informaltable frame="none"> - <tgroup cols='2' align='left' colsep="0" rowsep="0"> - <tbody> - <row> - <entry>[ pat ]</entry> - <entry>optional</entry> - </row> - - <row> - <entry>{ pat }</entry> - <entry>zero or more repetitions</entry> - </row> - - <row> - <entry> - { pat }<superscript>+</superscript> - </entry> - <entry>one or more repetitions</entry> - </row> - - <row> - <entry> - pat<subscript>1</subscript> ∣ pat<subscript>2</subscript> - </entry> - <entry>choice</entry> - </row> - - <row> - <entry> - <code>fibonacci</code> - </entry> - <entry>terminal syntax in typewriter font</entry> - </row> - </tbody> - </tgroup> - </informaltable> - </para> - - <informaltable frame="none" colsep="0" rowsep="0"> - <tgroup cols='5'> - <colspec colname="cat" align="left" colwidth="3*" /> - <colspec colname="lhs" align="right" colwidth="2*" /> - <colspec align="center" colwidth="*" /> - <colspec colname="rhs" align="left" colwidth="10*" /> - <colspec colname="name" align="right" colwidth="6*" /> - <tbody> - <row rowsep="1"> - <entry>Module</entry> - <entry>module</entry> - <entry>→</entry> - <entry> - <code>%module</code> mident { tdef ; }{ vdefg ; } - </entry> - <entry></entry> - </row> - - <row> - <entry morerows="1" valign="top">Type defn.</entry> - <entry morerows="1" valign="top">tdef</entry> - <entry>→</entry> - <entry> - <code>%data</code> qtycon { tbind } <code>= {</code> [ cdef {<code>;</code> cdef } ] <code>}</code> - </entry> - <entry>algebraic type</entry> - </row> - <row rowsep="1"> - <entry>∣</entry> - <entry> - <code>%newtype</code> qtycon qtycon { tbind } <code>=</code> ty - </entry> - <entry>newtype</entry> - </row> - - <row rowsep="1"> - <entry>Constr. 
defn.</entry> - <entry>cdef</entry> - <entry>→</entry> - <entry> - qdcon { <code>@</code> tbind }{ aty }<superscript>+</superscript> - </entry> - </row> - - <row> - <entry morerows="2" valign="top">Value defn.</entry> - <entry morerows="1" valign="top">vdefg</entry> - <entry>→</entry> - <entry><code>%rec {</code> vdef { <code>;</code> vdef } <code>}</code></entry> - <entry>recursive</entry> - </row> - - <row> - <entry>∣</entry> - <entry>vdef</entry> - <entry>non-recursive</entry> - </row> - - <row rowsep="1"> - <entry>vdef</entry> - <entry>→</entry> - <entry>qvar <code>::</code> ty <code>=</code> exp</entry> - <entry></entry> - </row> - - <row> - <entry morerows="3" valign="top">Atomic expr.</entry> - <entry morerows="3" valign="top">aexp</entry> - <entry>→</entry> - <entry>qvar</entry> - <entry>variable</entry> - </row> - - <row> - <entry>∣</entry> - <entry>qdcon</entry> - <entry>data constructor</entry> - </row> - - <row> - <entry>∣</entry> - <entry>lit</entry> - <entry>literal</entry> - </row> - - <row rowsep="1"> - <entry>∣</entry> - <entry><code>(</code> exp <code>)</code></entry> - <entry>nested expr.</entry> - </row> - - <row> - <entry morerows="9" valign="top">Expression</entry> - <entry morerows="9" valign="top">exp</entry> - <entry>→</entry> - <entry>aexp</entry> - <entry>atomic expression</entry> - </row> - - <row> - <entry>∣</entry> - <entry>aexp { arg }<superscript>+</superscript></entry> - <entry>application</entry> - </row> - - <row> - <entry>∣</entry> - <entry><code>\</code> { binder }<superscript>+</superscript> &arw; exp</entry> - <entry>abstraction</entry> - </row> - - <row> - <entry>∣</entry> - <entry><code>%let</code> vdefg <code>%in</code> exp</entry> - <entry>local definition</entry> - </row> - - <row> - <entry>∣</entry> - <entry><code>%case (</code> aty <code>)</code> exp <code>%of</code> vbind <code>{</code> alt { <code>;</code> alt } <code>}</code></entry> - <entry>case expression</entry> - </row> - - <row> - <entry>∣</entry> -
<entry><code>%cast</code> exp aty</entry> - <entry>type coercion</entry> - </row> - - <row> - <entry>∣</entry> - <entry><code>%note</code> " { char } " exp</entry> - <entry>expression note</entry> - </row> - - <row> - <entry>∣</entry> - <entry><code>%external ccall "</code> { char } <code>"</code> aty</entry> - <entry>external reference</entry> - </row> - - <row> - <entry>∣</entry> - <entry><code>%dynexternal ccall</code> aty</entry> - <entry>external reference (dynamic)</entry> - </row> - - <row rowsep="1"> - <entry>∣</entry> - <entry><code>%label "</code> { char } <code>"</code></entry> - <entry>external label</entry> - </row> - - <row> - <entry morerows="1" valign="top">Argument</entry> - <entry morerows="1" valign="top">arg</entry> - <entry>→</entry> - <entry><code>@</code> aty</entry> - <entry>type argument</entry> - </row> - - <row rowsep="1"> - <entry>∣</entry> - <entry>aexp</entry> - <entry>value argument</entry> - </row> - - <row> - <entry morerows="2" valign="top">Case alt.</entry> - <entry morerows="2" valign="top">alt</entry> - <entry>→</entry> - <entry>qdcon { <code>@</code> tbind }{ vbind } <code>&arw;</code> exp</entry> - <entry>constructor alternative</entry> - </row> - - <row> - <entry>∣</entry> - <entry>lit <code>&arw;</code> exp</entry> - <entry>literal alternative</entry> - </row> - - <row rowsep="1"> - <entry>∣</entry> - <entry><code>%_ &arw;</code> exp</entry> - <entry>default alternative</entry> - </row> - - <row> - <entry morerows="1" valign="top">Binder</entry> - <entry morerows="1" valign="top">binder</entry> - <entry>→</entry> - <entry><code>@</code> tbind</entry> - <entry>type binder</entry> - </row> - - <row rowsep="1"> - <entry>∣</entry> - <entry>vbind</entry> - <entry>value binder</entry> - </row> - - <row> - <entry morerows="1" valign="top">Type binder</entry> - <entry morerows="1" valign="top">tbind</entry> - <entry>→</entry> - <entry>tyvar</entry> - <entry>implicitly of kind *</entry> - </row> - - <row rowsep="1"> - 
<entry>∣</entry> - <entry><code>(</code> tyvar <code>::</code> kind <code>)</code></entry> - <entry>explicitly kinded</entry> - </row> - - <row rowsep="1"> - <entry>Value binder</entry> - <entry>vbind</entry> - <entry>→</entry> - <entry><code>(</code> var <code>::</code> ty <code>)</code></entry> - <entry></entry> - </row> - - <row> - <entry morerows="3" valign="top">Literal</entry> - <entry morerows="3" valign="top">lit</entry> - <entry>→</entry> - <entry><code>(</code> [<code>-</code>] { digit }<superscript>+</superscript> <code>::</code> ty <code>)</code></entry> - <entry>integer</entry> - </row> - - <row> - <entry>∣</entry> - <entry><code>(</code> [<code>-</code>] { digit }<superscript>+</superscript> <code>%</code> { digit }<superscript>+</superscript> <code>::</code> ty <code>)</code></entry> - <entry>rational</entry> - </row> - - <row> - <entry>∣</entry> - <entry><code>( '</code> char <code>' ::</code> ty <code>)</code></entry> - <entry>character</entry> - </row> - - <row rowsep="1"> - <entry>∣</entry> - <entry><code>( "</code> { char } <code>" ::</code> ty <code>)</code></entry> - <entry>string</entry> - </row> - - <row> - <entry morerows="2" valign="top">Character</entry> - <entry morerows="1" valign="top">char</entry> - <entry>→</entry> - <entry namest="rhs" nameend="name"><emphasis>any ASCII character in range 0x20-0x7E except 0x22,0x27,0x5c</emphasis></entry> - </row> - - <row> - <entry>∣</entry> - <entry><code>\x</code> hex hex</entry> - <entry>ASCII code escape sequence</entry> - </row> - - <row rowsep="1"> - <entry>hex</entry> - <entry>→</entry> - <entry>0∣…∣9 ∣a ∣…∣f</entry> - <entry></entry> - </row> - - <row> - <entry morerows="2" valign="top">Atomic type</entry> - <entry morerows="2" valign="top">aty</entry> - <entry>→</entry> - <entry>tyvar</entry> - <entry>type variable</entry> - </row> - - <row> - <entry>∣</entry> - <entry>qtycon</entry> - <entry>type constructor</entry> - </row> - - <row rowsep="1"> - <entry>∣</entry> - <entry><code>(</code> 
ty <code>)</code></entry> - <entry>nested type</entry> - </row> - - <row> - <entry morerows="7" valign="top">Basic type</entry> - <entry morerows="7" valign="top">bty</entry> - <entry>→</entry> - <entry>aty</entry> - <entry>atomic type</entry> - </row> - - <row> - <entry>∣</entry> - <entry>bty aty</entry> - <entry>type application</entry> - </row> - - <row> - <entry>∣</entry> - <entry><code>%trans</code> aty aty</entry> - <entry>transitive coercion</entry> - </row> - - <row> - <entry>∣</entry> - <entry><code>%sym</code> aty</entry> - <entry>symmetric coercion</entry> - </row> - - <row> - <entry>∣</entry> - <entry><code>%unsafe</code> aty aty</entry> - <entry>unsafe coercion</entry> - </row> - - <row> - <entry>∣</entry> - <entry><code>%left</code> aty</entry> - <entry>left coercion</entry> - </row> - - <row> - <entry>∣</entry> - <entry><code>%right</code> aty</entry> - <entry>right coercion</entry> - </row> - - <row rowsep="1"> - <entry>∣</entry> - <entry><code>%inst</code> aty aty</entry> - <entry>instantiation coercion</entry> - </row> - - <row> - <entry morerows="2" valign="top">Type</entry> - <entry morerows="2" valign="top">ty</entry> - <entry>→</entry> - <entry>bty</entry> - <entry>basic type</entry> - </row> - - <row> - <entry>∣</entry> - <entry><code>%forall</code> { tbind }<superscript>+</superscript> <code>.</code> ty</entry> - <entry>type abstraction</entry> - </row> - - <row rowsep="1"> - <entry>∣</entry> - <entry>bty <code>&arw;</code> ty</entry> - <entry>arrow type construction</entry> - </row> - - <row> - <entry morerows="4" valign="top">Atomic kind</entry> - <entry morerows="4" valign="top">akind</entry> - <entry>→</entry> - <entry><code>*</code></entry> - <entry>lifted kind</entry> - </row> - - <row> - <entry>∣</entry> - <entry><code>#</code></entry> - <entry>unlifted kind</entry> - </row> - - <row> - <entry>∣</entry> - <entry><code>?</code></entry> - <entry>open kind</entry> - </row> - - <row> - <entry>∣</entry> - <entry>bty <code>:=:</code> 
bty</entry> - <entry>equality kind</entry> - </row> - - <row rowsep="1"> - <entry>∣</entry> - <entry><code>(</code> kind <code>)</code></entry> - <entry>nested kind</entry> - </row> - - <row> - <entry morerows="1" valign="top">Kind</entry> - <entry morerows="1" valign="top">kind</entry> - <entry>→</entry> - <entry>akind</entry> - <entry>atomic kind</entry> - </row> - - <row rowsep="1"> - <entry>∣</entry> - <entry>akind <code>&arw;</code> kind</entry> - <entry>arrow kind</entry> - </row> - - <row> - <entry morerows="7" valign="top">Identifier</entry> - <entry>mident</entry> - <entry>→</entry> - <entry>pname <code>:</code> uname</entry> - <entry>module</entry> - </row> - - <row> - <entry>tycon</entry> - <entry>→</entry> - <entry>uname</entry> - <entry>type constr.</entry> - </row> - - <row> - <entry>qtycon</entry> - <entry>→</entry> - <entry>mident <code>.</code> tycon</entry> - <entry>qualified type constr.</entry> - </row> - - <row> - <entry>tyvar</entry> - <entry>→</entry> - <entry>lname</entry> - <entry>type variable</entry> - </row> - - <row> - <entry>dcon</entry> - <entry>→</entry> - <entry>uname</entry> - <entry>data constr.</entry> - </row> - - <row> - <entry>qdcon</entry> - <entry>→</entry> - <entry>mident <code>.</code> dcon</entry> - <entry>qualified data constr.</entry> - </row> - - <row> - <entry>var</entry> - <entry>→</entry> - <entry>lname</entry> - <entry>variable</entry> - </row> - - <row rowsep="1"> - <entry>qvar</entry> - <entry>→</entry> - <entry>[ mident <code>.</code> ] var</entry> - <entry>optionally qualified variable</entry> - </row> - - <row> - <entry morerows="6" valign="top">Name</entry> - <entry>lname</entry> - <entry>→</entry> - <entry>lower { namechar }</entry> - <entry></entry> - </row> - - <row> - <entry>uname</entry> - <entry>→</entry> - <entry>upper { namechar }</entry> - <entry></entry> - </row> - - <row> - <entry>pname</entry> - <entry>→</entry> - <entry>{ namechar }<superscript>+</superscript></entry> - <entry></entry> - </row> - 
- <row> - <entry>namechar</entry> - <entry>→</entry> - <entry>lower ∣ upper ∣ digit</entry> - <entry></entry> - </row> - - <row> - <entry>lower</entry> - <entry>→</entry> - <entry><code>a</code> ∣ <code>b</code> ∣ … ∣ <code>z</code> ∣ <code>_</code></entry> - <entry></entry> - </row> - - <row> - <entry>upper</entry> - <entry>→</entry> - <entry><code>A</code> ∣ <code>B</code> ∣ … ∣ <code>Z</code></entry> - <entry></entry> - </row> - - <row> - <entry>digit</entry> - <entry>→</entry> - <entry><code>0</code> ∣ <code>1</code> ∣ … ∣ <code>9</code></entry> - <entry></entry> - </row> - </tbody> - </tgroup> - </informaltable> - </section> - - <section id="informal-semantics"> - <title>Informal Semantics</title> - - <para>At the term level, Core resembles an explicitly-typed polymorphic - lambda calculus (F<subscript>ω</subscript>), with the addition of - local <code>let</code> bindings, algebraic type definitions, constructors, - <code>case</code> expressions, and primitive types, literals, and operators. Its - type system is richer than that of System F, supporting explicit - type equality coercions and type functions.<citation>system-fc</citation></para> - - <para>In this section we concentrate on the less obvious points about - Core.</para> - - <section id="program-organization-and-modules"> - <title>Program Organization and Modules</title> - - <para>Core programs are organized into <emphasis>modules</emphasis>, corresponding directly - to source-level Haskell modules. Each module has an identifying - name <emphasis>mident</emphasis>. A module identifier consists of a <emphasis>package name</emphasis> - followed by a module name, which may be hierarchical: for - example, <code>base:GHC.Base</code> is the module identifier for GHC’s Base - module. Its name is <code>Base</code>, and it lives in the GHC hierarchy - within the <code>base</code> package. Section 5.8 of the GHC users’ guide - explains package names <citation>ghc-user-guide</citation>.
In particular, note that a Core - program may contain multiple modules with the same (possibly - hierarchical) module name that differ in their package names. In - some of the code examples that follow, we will omit package - names and possibly full hierarchical module names from - identifiers for brevity, but be aware that they are always - required.<footnote> - A possible improvement to the Core syntax - would be to add explicit import lists to Core modules, which could be - used to specify abbreviations for long qualified names. This would make - the code more human-readable. - </footnote></para> - - <para>Each module may contain the following kinds of top-level - declarations: - - <itemizedlist> - <listitem> - Algebraic data type declarations, each defining a type - constructor and one or more data constructors; - </listitem> - <listitem> - Newtype declarations, corresponding to Haskell <code>newtype</code> - declarations, each defining a type constructor and a - coercion name; and - </listitem> - <listitem> - Value declarations, defining the types and values of - top-level variables. - </listitem> - </itemizedlist> - </para> - - <para>No type constructor, data constructor, or top-level value may be - declared more than once within a given module. All the type - declarations are (potentially) mutually recursive. Value - declarations must be in dependency order, with explicit grouping - of potentially mutually recursive declarations.</para> - - <para>Identifiers defined in top-level declarations may be <emphasis>external</emphasis> or - <emphasis>internal</emphasis>. External identifiers can be referenced from any other - module in the program, using conventional dot notation (e.g., - <code>base:GHC.Base.Bool</code>, <code>base:GHC.Base.True</code>). Internal identifiers - are visible only within the defining module.
All type and data - constructors are external, and are always defined and referenced - using fully qualified names (with dots).</para> - - <para>A top-level value is external if it is defined and referenced - using a fully qualified name with a dot (e.g., <code>main:MyModule.foo = ...</code>); - otherwise, it is internal (e.g., <code>bar = ...</code>). Note that - Core’s notion of an external identifier does not necessarily - coincide with that of an <quote>exported</quote> identifier in a Haskell source - module. An identifier can be an external identifier in Core, but - not be exported by the original Haskell source - module.<footnote> - Two examples of such identifiers are data - constructors and values that potentially appear in an unfolding. For an - example of the latter, consider <code>Main.foo = ... Main.bar ...</code>, where - <code>Main.foo</code> is inlineable. Since <code>bar</code> appears in <code>foo</code>’s unfolding, it is - defined and referenced with an external name, even if <code>bar</code> was not - exported by the original source module. - </footnote> - However, if an identifier was exported by the Haskell source - module, it will appear as an external name in Core.</para> - - <para>Core modules have no explicit import or export lists. Modules - may be mutually recursive. Note that because of the latter fact, - GHC currently prints out the top-level bindings for every module - as a single recursive group, in order to avoid keeping track of - dependencies between top-level values within a module. An - external Core tool could reconstruct dependencies later, of - course.</para> - - <para>There is also an implicitly-defined module <code>ghc-prim:GHC.Prim</code>, - which exports the <quote>built-in</quote> types and values that must be - provided by any implementation of Core (including GHC).
Details - of this module are in the <link linkend="primitive-module">Primitive Module section</link>.</para> - - <para>A Core <emphasis>program</emphasis> is a collection of distinctly-named modules that - includes a module called <code>main:Main</code> having an exported value - called <code>main:ZCMain.main</code> of type <code>base:GHC.IOBase.IO a</code> (for some - type <code>a</code>). (Note that the strangely named wrapper for <code>main</code> is the - one exception to the rule that qualified names defined within a - module <code>m</code> must have module name <code>m</code>.)</para> - - <para>Many Core programs will contain library modules, such as - <code>base:GHC.Base</code>, which implement parts of the Haskell standard - library. In principle, these modules are ordinary Haskell - modules, with no special status. In practice, the requirement on - the type of <code>main:Main.main</code> implies that every program will - contain a large subset of the standard library modules.</para> - - </section> - <section id="namespaces"> - <title>Namespaces</title> - - <para>There are five distinct namespaces: - <orderedlist> - <listitem>module identifiers (<code>mident</code>),</listitem> - <listitem>type constructors (<code>tycon</code>),</listitem> - <listitem>type variables (<code>tyvar</code>),</listitem> - <listitem>data constructors (<code>dcon</code>),</listitem> - <listitem>term variables (<code>var</code>).</listitem> - </orderedlist> - </para> - - <para>Spaces (1), (2+3), and (4+5) can be distinguished from each - other by context. To distinguish (2) from (3) and (4) from (5), - we require that data and type constructors begin with an - upper-case character, and that term and type variables begin - with a lower-case character.</para> - - <para>Primitive types and operators are not syntactically - distinguished.</para> - - <para>Primitive <emphasis>coercion</emphasis> operators, of which there are six, <emphasis>are</emphasis> - syntactically distinguished in the grammar.
This is because - these coercions must be fully applied, and because - distinguishing their applications in the syntax makes - typechecking easier.</para> - - <para>A given variable (type or term) may have multiple definitions - within a module. However, definitions of term variables never - <quote>shadow</quote> one another: the scope of the definition of a given - variable never contains a redefinition of the same variable. - Type variables may be shadowed. Thus, if a term variable has - multiple definitions within a module, all those definitions must - be local (let-bound). The only exception to this rule is that - (necessarily closed) types labelling <code>%external</code> expressions may - contain <code>tyvar</code> bindings that shadow outer bindings.</para> - - <para>Core generated by GHC makes heavy use of encoded names, in which - the characters <code>Z</code> and <code>z</code> are used to introduce escape sequences - for non-alphabetic characters such as dollar sign <code>$</code> (<code>zd</code>), hash <code>#</code> - (<code>zh</code>), plus <code>+</code> (<code>zp</code>), etc. This is the same encoding used in <code>.hi</code> - files and in the back-end of GHC itself, except that we - sometimes change an initial <code>z</code> to <code>Z</code>, or vice versa, in order to - maintain case distinctions.</para> - - <para>Finally, note that hierarchical module names are z-encoded in - Core: for example, <code>base:GHC.Base.foo</code> is rendered as - <code>base:GHCziBase.foo</code>. A parser may reconstruct the module - hierarchy, or regard <code>GHCziBase</code> as a flat name.</para> - - </section> - <section id="types-and-kinds"> - <title>Types and Kinds</title> - - <para>In Core, all type abstractions and applications are explicit. - This makes it easy to typecheck any (closed) fragment of Core - code.
A full executable typechecker is available separately.</para> - - <section id="types"> - <title>Types</title> - - <para>Types are described by type expressions, which are built from - named type constructors and type variables using type - application and universal quantification. Each type - constructor has a fixed arity ≥ 0. Because it is so widely - used, there is special infix syntax for the fully-applied - function type constructor (<code>&arw;</code>). (The prefix identifier for - this constructor is <code>ghc-prim:GHC.Prim.ZLzmzgZR</code>; this should - only appear in unapplied or partially applied form.)</para> - - <para>There are also a number of other primitive type constructors - (e.g., <code>Intzh</code>) that are predefined in the <code>GHC.Prim</code> module, but - have no special syntax. <code>%data</code> and <code>%newtype</code> declarations - introduce additional type constructors, as described below. - Type constructors are distinguished solely by name.</para> - - </section> - <section id="coercions"> - <title>Coercions</title> - - <para>A type may also be built using one of the primitive coercion - operators, as described in <link linkend="namespaces">the Namespaces section</link>. For details on the - meanings of these operators, see the System FC paper <citation>system-fc</citation>. Also - see <link linkend="newtypes">the Newtypes section</link> for - examples of how GHC uses coercions in Core code.</para> - - </section> - <section id="kinds"> - <title>Kinds</title> - <para>As described in the Haskell definition, it is necessary to - distinguish well-formed type-expressions by classifying them - into different <emphasis>kinds</emphasis> <citation>haskell98, p. 41</citation><!-- TODO -->. In particular, Core - explicitly records the kind of every bound type variable.</para> - - <para>In addition, Core’s kind system includes equality kinds, as in - System FC <citation>system-fc</citation>.
An application of a built-in coercion, or of a - user-defined coercion as introduced by a <code>newtype</code> declaration, - has an equality kind.</para> - - </section> - <section id="lifted-and-unlifted-types"> - <title>Lifted and Unlifted Types</title> - - <para>Semantically, a type is <emphasis>lifted</emphasis> if and only if it has bottom as - an element. We need to distinguish lifted from unlifted types because operationally, - terms with lifted types may be represented by closures; terms - with unlifted types must not be represented by closures, which - implies that any unboxed value is necessarily unlifted. We - distinguish between lifted and unlifted types by ascribing - them different kinds.</para> - - <para>Currently, all the primitive types are unlifted (including a - few boxed primitive types such as <code>ByteArrayzh</code>). Peyton Jones - and Launchbury <citation>pj:unboxed</citation> described the ideas behind unboxed and - unlifted types.</para> - - </section> - <section id="type-constructors-base-kinds-and-higher-kinds"> - <title>Type Constructors; Base Kinds and Higher Kinds</title> - - <para>Every type constructor has a kind, depending on its arity and - whether it or its arguments are lifted.</para> - - <para>Term variables can only be assigned types that have base - kinds: the base kinds are <code>*</code>, <code>#</code>, and <code>?</code>. The three base kinds - distinguish the liftedness of the types they classify: <code>*</code> - represents lifted types; <code>#</code> represents unlifted types; and <code>?</code> is - the <quote>open</quote> kind, representing a type that may be either lifted - or unlifted.
Of these, only <code>*</code> ever appears in Core type - declarations generated from user code; the other two are - needed to describe certain types in primitive (or otherwise - specially-generated) code (which, after optimization, could - potentially appear anywhere).</para> - - <para>In particular, no top-level identifier (except in - <code>ghc-prim:GHC.Prim</code>) has a type of kind <code>#</code> or <code>?</code>.</para> - - <para>Nullary type constructors have base kinds: for example, the - type <code>Int</code> has kind <code>*</code>, and <code>Int#</code> has kind <code>#</code>.</para> - - <para>Non-nullary type constructors have higher kinds: kinds that - have the form - k<subscript>1</subscript><code>&arw;</code>k<subscript>2</subscript>, where - k<subscript>1</subscript> and k<subscript>2</subscript> are - kinds. For example, the function type constructor <code>&arw;</code> has - kind <code>* &arw; (* &arw; *)</code>. Since Haskell allows abstracting - over type constructors, type variables may have higher kinds; - however, much more commonly they have kind <code>*</code>, so that is the - default if a type binder omits a kind.</para> - - </section> - - <section id="type-synonyms-and-type-equivalence"> - <title>Type Synonyms and Type Equivalence</title> - - <para>There is no mechanism for defining type synonyms - (corresponding to Haskell <code>type</code> declarations).</para> - - <para>Type equivalence is just syntactic equivalence on type - expressions (of base kinds) modulo:</para> - - <itemizedlist> - <listitem>alpha-renaming of variables bound in <code>%forall</code> types;</listitem> - <listitem>the identity a <code>&arw;</code> b ≡ <code>ghc-prim:GHC.Prim.ZLzmzgZR</code> a b</listitem> - </itemizedlist> - - </section> - </section> - <section id="algebraic-data-types"> - <title>Algebraic data types</title> - - <para>Each data declaration introduces a new type constructor and a - set of one or more data constructors, normally corresponding - 
directly to a source Haskell <code>data</code> declaration. For example, the - source declaration - - <programlisting language="haskell"> -data Bintree a = - Fork (Bintree a) (Bintree a) - | Leaf a - </programlisting> - - might induce the following Core declaration - - <programlisting language="java"> -%data Bintree a = { - Fork (Bintree a) (Bintree a); - Leaf a} - </programlisting> - - which introduces the unary type constructor Bintree of kind - <code>*&arw;*</code> and two data constructors with types - - <programlisting language="java"> -Fork :: %forall a . Bintree a &arw; Bintree a &arw; Bintree a -Leaf :: %forall a . a &arw; Bintree a - </programlisting> - - We define the <emphasis>arity</emphasis> of each data constructor to be the number of - value arguments it takes; e.g. <code>Fork</code> has arity 2 and <code>Leaf</code> has - arity 1.</para> - - <para>For a less conventional example illustrating the possibility of - higher-order kinds, the Haskell source declaration - - <programlisting language="haskell"> -data A f a = MkA (f a) - </programlisting> - - might induce the Core declaration - - <programlisting language="java"> -%data A (f::*&arw;*) a = { MkA (f a) } - </programlisting> - - which introduces the constructor - - <programlisting language="java"> -MkA :: %forall (f::*&arw;*) a . (f a) &arw; (A f) a - </programlisting></para> - - <para>GHC (like some other Haskell implementations) supports an - extension to Haskell98 for existential types such as - - <programlisting language="haskell"> -data T = forall a . MkT a (a &arw; Bool) - </programlisting> - - This is represented by the Core declaration - - <programlisting language="java"> -%data T = {MkT @a a (a &arw; Bool)} - </programlisting> - - which introduces the nullary type constructor T and the data - constructor - - <programlisting language="java"> -MkT :: %forall a .
a &arw; (a &arw; Bool) &arw; T - </programlisting> - - In general, existentially quantified variables appear as extra - universally quantified variables in the data constructor types. An - example of how to construct and deconstruct values of type <code>T</code> is - shown in <link linkend="expression-forms">the Expression Forms section</link>.</para> - - </section> - <section id="newtypes"> - <title>Newtypes</title> - - <para>Each Core <code>%newtype</code> declaration introduces a new type constructor - and an associated representation type, corresponding to a source - Haskell <code>newtype</code> declaration. However, unlike in source Haskell, - a <code>%newtype</code> declaration does not introduce any data constructors.</para> - - <para>Each <code>%newtype</code> declaration also introduces a new coercion - (syntactically, just another type constructor) that implies an - axiom equating the type constructor, applied to any type - variables bound by the <code>%newtype</code>, to the representation type.</para> - - <para>For example, the Haskell fragment - - <programlisting language="haskell"> -newtype U = MkU Bool -u = MkU True -v = case u of - MkU b &arw; not b - </programlisting> - - might induce the Core fragment - - <programlisting language="java"> -%newtype U ZCCoU = Bool; -u :: U = %cast (True) - ((%sym ZCCoU)); -v :: Bool = not (%cast (u) ZCCoU); - </programlisting></para> - - <para>The <code>newtype</code> declaration implies that the types <code>U</code> and <code>Bool</code> have - equivalent representations, and the coercion axiom <code>ZCCoU</code> - provides evidence that <code>U</code> is equivalent to <code>Bool</code>. Notice that in - the body of <code>u</code>, the boolean value <code>True</code> is cast to type <code>U</code> using - the primitive symmetry rule applied to <code>ZCCoU</code>: that is, using a - coercion of kind <code>Bool :=: U</code>.
And in the body of <code>v</code>, <code>u</code> is cast - back to type <code>Bool</code> using the axiom <code>ZCCoU</code>.</para> - - <para>Notice that the <code>case</code> in the Haskell source code above translates - to a <code>cast</code> in the corresponding Core code. That is because - operationally, a <code>case</code> on a value whose type is declared by a - <code>newtype</code> declaration is a no-op. Unlike a <code>case</code> on any other - value, such a <code>case</code> does no evaluation: its only function is to - coerce its scrutinee’s type.</para> - - <para>Also notice that unlike in a previous draft version of External - Core, there is no need to handle recursive newtypes specially.</para> - - </section> - - <section id="expression-forms"> - <title>Expression Forms</title> - - <para>Variables and data constructors are straightforward.</para> - - <para>Literal (<emphasis role="variable">lit</emphasis>) expressions consist of a literal value, in one of - four different formats, and a (primitive) type annotation. Only - certain combinations of format and type are permitted; - see <link linkend="primitive-module">the Primitive Module section</link>. - The character and string formats can describe only 8-bit ASCII characters.</para> - - <para>Moreover, because the operational semantics for Core interprets - strings as C-style null-terminated strings, strings should not - contain embedded nulls.</para> - - <para>In Core, value applications, type applications, value - abstractions, and type abstractions are all explicit. To tell - them apart, type arguments in applications and formal type - arguments in abstractions are preceded by an <code>@</code> symbol. (In - abstractions, the <code>@</code> plays essentially the same role as the more - usual Λ symbol.)
For example, the Haskell source declaration - - <programlisting language="haskell"> -f x = Leaf (Leaf x) - </programlisting> - - might induce the Core declaration - - <programlisting language="java"> -f :: %forall a . a &arw; Bintree (Bintree a) = - \ @a (x::a) &arw; Leaf @(Bintree a) (Leaf @a x) - </programlisting></para> - - <para>Value applications may be of user-defined functions, data - constructors, or primitives. None of these sorts of applications - are necessarily saturated.</para> - - <para>Note that the arguments of type applications are not always of - kind <code>*</code>. For example, given our previous definition of type <code>A</code>: - - <programlisting language="haskell"> -data A f a = MkA (f a) - </programlisting> - - the source code - - <programlisting language="haskell"> -MkA (Leaf True) - </programlisting> - - becomes - - <programlisting language="java"> -(MkA @Bintree @Bool) (Leaf @Bool True) - </programlisting></para> - - <para>Local bindings, of a single variable or of a set of mutually - recursive variables, are represented by <code>%let</code> expressions in the - usual way.</para> - - <para>By far the most complicated expression form is <code>%case</code>. <code>%case</code> - expressions are permitted over values of any type, although they - will normally be algebraic or primitive types (with literal - values). Evaluating a <code>%case</code> forces the evaluation of the - expression being tested (the <quote>scrutinee</quote>). The value of the - scrutinee is bound to the variable following the <code>%of</code> keyword, - which is in scope in all alternatives; this is useful when the - scrutinee is a non-atomic expression (see next example).
The - scrutinee is preceded by the type of the entire <code>%case</code> - expression: that is, the result type that all of the <code>%case</code> - alternatives have (this is intended to make type reconstruction - easier in the presence of type equality coercions).</para> - - <para>In an algebraic <code>%case</code>, all the case alternatives must be labeled - with distinct data constructors from the algebraic type, - followed by any existential type variable bindings (see below), - and typed term variable bindings corresponding to the data - constructor’s arguments. The number of variables must match the - data constructor’s arity.</para> - - <para>For example, the following Haskell source expression - - <programlisting language="haskell"> -case g x of - Fork l r &arw; Fork r l - t@(Leaf v) &arw; Fork t t - </programlisting> - - might induce the Core expression - - <programlisting language="java"> -%case ((Bintree a)) g x %of (t::Bintree a) - Fork (l::Bintree a) (r::Bintree a) &arw; - Fork @a r l - Leaf (v::a) &arw; - Fork @a t t - </programlisting></para> - - <para>When performing a <code>%case</code> over a value of an - existentially-quantified algebraic type, the alternative must - include extra local type bindings for the - existentially-quantified variables. For example, given - - <programlisting language="haskell"> -data T = forall a . 
MkT a (a &arw; Bool) - </programlisting> - - the source - - <programlisting language="haskell"> -case x of - MkT w g &arw; g w - </programlisting> - - becomes - - <programlisting language="java"> -%case x %of (x'::T) - MkT @b (w::b) (g::b&arw;Bool) &arw; g w - </programlisting></para> - - <para>In a <code>%case</code> over literal alternatives, all the case alternatives - must be distinct literals of the same primitive type.</para> - - <para>The list of alternatives may begin with a default alternative - labeled with an underscore (<code>%_</code>), whose right-hand side will be - evaluated if none of the other alternatives match. The default - is optional, except in a case over a primitive type, or when - there are no other alternatives. If the case is over neither an - algebraic type nor a primitive type, then the list of - alternatives must contain a default alternative and nothing - else. For algebraic cases, the set of alternatives need not be - exhaustive, even if no default is given; if alternatives are - missing, this implies that GHC has deduced that they cannot - occur.</para> - - <para><code>%cast</code> is used to manipulate newtypes, as described in - <link linkend="newtypes">the Newtype section</link>. The <code>%cast</code> expression - takes an expression and a coercion: syntactically, the coercion - is an arbitrary type, but it must have an equality kind. In an - expression <code>(%cast e co)</code>, if <code>e :: T</code> and <code>co</code> has kind <code>T :=: U</code>, then - the overall expression has type <code>U</code> <citation>ghc-fc-commentary</citation>. Here, <code>co</code> must be a - coercion whose left-hand side is <code>T</code>.</para> - - <para>Note that unlike the <code>%coerce</code> expression that existed in previous - versions of Core, this means that <code>%cast</code> is (almost) type-safe: - the coercion argument provides evidence that can be verified by - a typechecker.
There are still unsafe <code>%cast</code>s, corresponding to - the unsafe <code>%coerce</code> construct that existed in old versions of - Core, because there is a primitive unsafe coercion type that can - be used to cast arbitrary types to each other. GHC uses this for - such purposes as coercing the return type of a function (such as - error) which is guaranteed to never return: - - <programlisting language="haskell"> -case (error "") of - True &arw; 1 - False &arw; 2 - </programlisting> - - becomes: - - <programlisting language="java"> -%cast (error @ Bool (ZMZN @ Char)) -(%unsafe Bool Integer); - </programlisting> - - <code>%cast</code> has no operational meaning and is only used in - typechecking.</para> - - <para>A <code>%note</code> expression carries arbitrary internal information that - GHC finds interesting. The information is encoded as a string. - Expression notes currently generated by GHC include the inlining - pragma (<code>InlineMe</code>) and cost-center labels for profiling.</para> - - <para>A <code>%external</code> expression denotes an external identifier, which has - the indicated type (always expressed in terms of Haskell - primitive types). External Core supports two kinds of external - calls: <code>%external</code> and <code>%dynexternal</code>. Only the former is supported - by the current set of stand-alone Core tools. In addition, there - is a <code>%label</code> construct which GHC may generate but which the Core - tools do not support.</para> - - <para>The present syntax for externals is sufficient for describing C - functions and labels. Interfacing to other languages may require - additional information or a different interpretation of the name - string.</para> - - </section> - - <section id="expression-evaluation"> - <title>Expression Evaluation</title> - <para>The dynamic semantics of Core are defined on the type-erasure of - the program: for example, we ignore all type abstractions and - applications. 
The denotational semantics of the resulting - type-free program are just the conventional ones for a - call-by-name language, in which expressions are only evaluated - on demand. But Core is intended to be a call-by-<emphasis>need</emphasis> language, - in which expressions are only evaluated once. To express the - sharing behavior of call-by-need, we give an operational model - in the style of Launchbury <citation>launchbury93natural</citation>.</para> - - <para>This section describes the model informally; a more formal - semantics is separately available as an executable interpreter.</para> - - <para>To simplify the semantics, we consider only <quote>well-behaved</quote> Core - programs in which constructor and primitive applications are - fully saturated, and in which non-trivial expressions of - unlifted kind (<code>#</code>) appear only as scrutinees in <code>%case</code> - expressions. Any program can easily be put into this form; a - separately available preprocessor illustrates how. In the - remainder of this section, we use <quote>Core</quote> to mean <quote>well-behaved</quote> - Core.</para> - - <para>Evaluating a Core expression means reducing it to <emphasis>weak-head normal form (WHNF)</emphasis>, - i.e., a primitive value, lambda abstraction, - or fully-applied data constructor. Evaluating a program means - evaluating the expression <code>main:ZCMain.main</code>.</para> - - <para>To make sure that expression evaluation is shared, we make use - of a <emphasis>heap</emphasis>, which contains <emphasis>heap entries</emphasis>. A heap entry can be: - - <itemizedlist> - <listitem> - A <emphasis>thunk</emphasis>, representing an unevaluated expression, also known - as a suspension. - </listitem> - <listitem> - A <emphasis>WHNF</emphasis>, representing an evaluated expression. The result of - evaluating a thunk is a WHNF.
A WHNF is always a closure - (corresponding to a lambda abstraction in the source - program) or a data constructor application: computations - over primitive types are never suspended. - </listitem> - </itemizedlist></para> - - <para><emphasis>Heap pointers</emphasis> point to heap entries: at different times, the - same heap pointer can point to either a thunk or a WHNF, because - the run-time system overwrites thunks with WHNFs as computation - proceeds.</para> - - <para>The suspended computation that a thunk represents may evaluate - one of three different kinds of expression. - The run-time system allocates a different kind of thunk - depending on what kind of expression it is: - - <itemizedlist> - <listitem> - A thunk for a value definition has a group of suspended - defining expressions, along with a list of bindings between - defined names and heap pointers to those suspensions. (A - value definition may be a recursive group of definitions or - a single non-recursive definition, and it may be top-level - (global) or <code>let</code>-bound (local)). - </listitem> - <listitem> - A thunk for a function application (where the function is - user-defined) has a suspended actual argument expression, - and a binding between the formal argument and a heap pointer - to that suspension. - </listitem> - <listitem> - A thunk for a constructor application has a suspended actual - argument expression; the entire constructed value has a heap - pointer to that suspension embedded in it. - </listitem> - </itemizedlist></para> - - <para>As computation proceeds, copies of the heap pointer for a given - thunk propagate through the executing program. When another - computation demands the result of that thunk, the thunk is - <emphasis>forced</emphasis>: the run-time system computes the thunk’s result, - yielding a WHNF, and overwrites the heap entry for the thunk - with the WHNF. Now, all copies of the heap pointer point to the - new heap entry: a WHNF.
Forcing occurs only in the context of - - <itemizedlist> - <listitem>evaluating the operator expression of an application;</listitem> - <listitem>evaluating the scrutinee of a <code>case</code> expression; or</listitem> - <listitem>evaluating an argument to a primitive or external function application</listitem> - </itemizedlist> - </para> - - <para>When no pointers to a heap entry (whether it is a thunk or WHNF) - remain, the garbage collector can reclaim the space it uses. We - assume this happens implicitly.</para> - - <para>With the exception of functions, arrays, and mutable variables, - we intend that values of all primitive types should be held - <emphasis>unboxed</emphasis>: they should not be heap-allocated. This does not - violate call-by-need semantics: all primitive types are - <emphasis>unlifted</emphasis>, which means that values of those types must be - evaluated strictly. Unboxed tuple types are not heap-allocated - either.</para> - - <para>Certain primitives and <code>%external</code> functions cause side-effects to - state threads or to the real world. Where the ordering of these - side-effects matters, Core already forces this order with data - dependencies on the pseudo-values representing the threads.</para> - - <para>An implementation must specially support the <code>raisezh</code> and - <code>handlezh</code> primitives: for example, by using a handler stack. - Again, real-world threading guarantees that they will execute in - the correct order.</para> - - </section> - </section> - <section id="primitive-module"> - <title>Primitive Module</title> - - <para>The semantics of External Core rely on the contents and informal - semantics of the primitive module <code>ghc-prim:GHC.Prim</code>. Nearly all - the primitives are required in order to cover GHC’s implementation - of the Haskell98 standard prelude; the only operators that can be - completely omitted are those supporting the byte-code interpreter, - parallelism, and foreign objects. 
Some of the concurrency - primitives are needed, but can be given degenerate implementations - if it is desired to target a purely sequential backend (see - <link linkend="non-concurrent-back-end">the Non-concurrent Back End section</link>).</para> - - <para>In addition to these primitives, a large number of C library - functions are required to implement the full standard Prelude, - particularly to handle I/O and arithmetic on less usual types.</para> - - <para>For a full listing of the names and types of the primitive - operators, see the GHC library documentation <citation>ghcprim</citation>.</para> - - <section id="non-concurrent-back-end"> - <title>Non-concurrent Back End</title> - - <para>The Haskell98 standard prelude doesn’t include any concurrency - support, but GHC’s implementation of it relies on the existence - of some concurrency primitives. However, it never actually forks - multiple threads. Hence, the concurrency primitives can be given - degenerate implementations that will work in a non-concurrent - setting, as follows:</para> - - <itemizedlist> - <listitem> - <code>ThreadIdzh</code> can be represented by a singleton type, whose - (unique) value is returned by <code>myThreadIdzh</code>. - </listitem> - <listitem> - <code>forkzh</code> can just die with an <quote>unimplemented</quote> message. - </listitem> - <listitem> - <code>killThreadzh</code> and <code>yieldzh</code> can also just die <quote>unimplemented</quote> - since in a one-thread world, the only thread a thread can - kill is itself, and if a thread yields the program hangs. - </listitem> - <listitem> - <code>MVarzh a</code> can be represented by <code>MutVarzh (Maybe a)</code>; where a - concurrent implementation would block, the sequential - implementation can just die with a suitable message (since - no other thread exists to unblock it).
- </listitem> - <listitem> - <code>waitReadzh</code> and <code>waitWritezh</code> can be implemented using a <code>select</code> - with no timeout. - </listitem> - </itemizedlist> - </section> - - <section id="literals"> - <title>Literals</title> - - <para>Only the following combinations of literal forms and types are - permitted:</para> - - <informaltable frame="none" colsep="0" rowsep="0"> - <tgroup cols='3'> - <colspec colname="literal" align="left" colwidth="*" /> - <colspec colname="type" align="left" colwidth="*" /> - <colspec colname="description" align="left" colwidth="4*" /> - <thead> - <row> - <entry>Literal form</entry> - <entry>Type</entry> - <entry>Description</entry> - </row> - </thead> - <tbody> - <row> - <entry morerows="3" valign="top">integer</entry> - <entry><code>Intzh</code></entry> - <entry>Int</entry> - </row> - <row> - <entry><code>Wordzh</code></entry> - <entry>Word</entry> - </row> - <row> - <entry><code>Addrzh</code></entry> - <entry>Address</entry> - </row> - <row> - <entry><code>Charzh</code></entry> - <entry>Unicode character code</entry> - </row> - - <row> - <entry morerows="1" valign="top">rational</entry> - <entry><code>Floatzh</code></entry> - <entry>Float</entry> - </row> - <row> - <entry><code>Doublezh</code></entry> - <entry>Double</entry> - </row> - - <row> - <entry>character</entry> - <entry><code>Charzh</code></entry> - <entry>Unicode character specified by ASCII character</entry> - </row> - - <row> - <entry>string</entry> - <entry><code>Addrzh</code></entry> - <entry>Address of specified C-format string</entry> - </row> - </tbody> - </tgroup> - </informaltable> - </section> - </section> - - - <bibliolist> - <!-- This bibliography was semi-automatically converted by JabRef from core.bib.
--> - - <title>References</title> - - <biblioentry> - <abbrev>ghc-user-guide</abbrev> - <authorgroup> - <author><surname>The GHC Team</surname></author> - </authorgroup> - <citetitle pubwork="article">The Glorious Glasgow Haskell Compilation System User's Guide, Version 6.8.2</citetitle> - <pubdate>2008</pubdate> - <bibliomisc><ulink url="http://www.haskell.org/ghc/docs/latest/html/users_guide/index.html">http://www.haskell.org/ghc/docs/latest/html/users_guide/index.html</ulink></bibliomisc> - </biblioentry> - - <biblioentry> - <abbrev>ghc-fc-commentary</abbrev> - <authorgroup> - <author><surname>GHC Wiki</surname></author> - </authorgroup> - <citetitle pubwork="article">System FC: equality constraints and coercions</citetitle> - <pubdate>2006</pubdate> - <bibliomisc><ulink url="http://ghc.haskell.org/trac/ghc/wiki/Commentary/Compiler/FC">http://ghc.haskell.org/trac/ghc/wiki/Commentary/Compiler/FC</ulink></bibliomisc> - </biblioentry> - - <biblioentry> - <abbrev>ghc-api</abbrev> - <authorgroup> - <author><surname>Haskell Wiki</surname></author> - </authorgroup> - <citetitle pubwork="article">Using GHC as a library</citetitle> - <pubdate>2007</pubdate> - <bibliomisc><ulink url="http://haskell.org/haskellwiki/GHC/As_a_library">http://haskell.org/haskellwiki/GHC/As_a_library</ulink></bibliomisc> - </biblioentry> - - <biblioentry> - <abbrev>haskell98</abbrev> - <authorgroup> - <editor><firstname>Simon</firstname><surname>Peyton-Jones</surname></editor> - </authorgroup> - <citetitle pubwork="article">Haskell 98 Language and Libraries: The Revised Report</citetitle> - <publisher> - <publishername>Cambridge University Press</publishername> - <address> - <city>Cambridge</city> - <state>UK</state> - </address> - </publisher> - <pubdate>2003</pubdate> - </biblioentry> - - <biblioentry> - <abbrev>system-fc</abbrev> - <authorgroup> - <author><firstname>Martin</firstname><surname>Sulzmann</surname></author> - <author><firstname>Manuel
M.T.</firstname><surname>Chakravarty</surname></author> - <author><firstname>Simon</firstname><surname>Peyton-Jones</surname></author> - <author><firstname>Kevin</firstname><surname>Donnelly</surname></author> - </authorgroup> - <citetitle pubwork="article">System F with type equality coercions</citetitle> - <publisher> - <publishername>ACM</publishername> - <address> - <city>New York</city> - <state>NY</state> - <country>USA</country> - </address> - </publisher> - <artpagenums>53-66</artpagenums> - <pubdate>2007</pubdate> - <bibliomisc><ulink url="http://portal.acm.org/citation.cfm?id=1190324">http://portal.acm.org/citation.cfm?id=1190324</ulink></bibliomisc> - <!-- booktitle = {{TLDI '07: Proceedings of the 2007 ACM SIGPLAN International Workshop on Types in Language Design and Implementation}}, --> - </biblioentry> - - <biblioentry> - <abbrev>gadts</abbrev> - <authorgroup> - <author><firstname>Simon</firstname><surname>Peyton-Jones</surname></author> - <author><firstname>Dimitrios</firstname><surname>Vytiniotis</surname></author> - <author><firstname>Stephanie</firstname><surname>Weirich</surname></author> - <author><firstname>Geoffrey</firstname><surname>Washburn</surname></author> - </authorgroup> - <citetitle pubwork="article">Simple unification-based type inference for GADTs</citetitle> - <publisher> - <publishername>ACM</publishername> - <address> - <city>New York</city> - <state>NY</state> - <country>USA</country> - </address> - </publisher> - <artpagenums>50-61</artpagenums> - <pubdate>2006</pubdate> - <bibliomisc><ulink url="http://research.microsoft.com/Users/simonpj/papers/gadt/index.htm">http://research.microsoft.com/Users/simonpj/papers/gadt/index.htm</ulink></bibliomisc> - </biblioentry> - - <biblioentry> - <abbrev>Launchbury94</abbrev> - <authorgroup> - <author><firstname>John</firstname><surname>Launchbury</surname></author> - <author><firstname>Simon L.</firstname><surname>Peyton-Jones</surname></author> - </authorgroup> - <citetitle 
pubwork="article">Lazy Functional State Threads</citetitle> - <artpagenums>24-35</artpagenums> - <pubdate>1994</pubdate> - <bibliomisc><ulink url="http://citeseer.ist.psu.edu/article/launchbury93lazy.html">http://citeseer.ist.psu.edu/article/launchbury93lazy.html</ulink></bibliomisc> - <!-- booktitle = "{SIGPLAN} {Conference} on {Programming Language Design and Implementation}", --> - </biblioentry> - - <biblioentry> - <abbrev>pj:unboxed</abbrev> - <authorgroup> - <author><firstname>Simon L.</firstname><surname>Peyton-Jones</surname></author> - <author><firstname>John</firstname><surname>Launchbury</surname></author> - <editor><firstname>J.</firstname><surname>Hughes</surname></editor> - </authorgroup> - <citetitle pubwork="article">Unboxed Values as First Class Citizens in a Non-strict Functional Language</citetitle> - <publisher> - <publishername>Springer-Verlag LNCS523</publishername> - <address> - <city>Cambridge</city> - <state>Massachusetts</state> - <country>USA</country> - </address> - </publisher> - <artpagenums>636-666</artpagenums> - <pubdate>1991, August 26-28</pubdate> - <bibliomisc><ulink url="http://citeseer.ist.psu.edu/jones91unboxed.html">http://citeseer.ist.psu.edu/jones91unboxed.html</ulink></bibliomisc> - <!-- booktitle = "Proceedings of the Conference on Functional Programming and Computer Architecture", --> - </biblioentry> - - <biblioentry> - <abbrev>ghc-inliner</abbrev> - <authorgroup> - <author><firstname>Simon</firstname><surname>Peyton-Jones</surname></author> - <author><firstname>Simon</firstname><surname>Marlow</surname></author> - </authorgroup> - <citetitle pubwork="article">Secrets of the Glasgow Haskell Compiler inliner</citetitle> - <pubdate>1999</pubdate> - <address> - <city>Paris</city> - <country>France</country> - </address> - <bibliomisc><ulink url="http://research.microsoft.com/Users/simonpj/Papers/inlining/inline.pdf">http://research.microsoft.com/Users/simonpj/Papers/inlining/inline.pdf</ulink></bibliomisc> - <!--
booktitle = "Workshop on Implementing Declarative Languages", --> - </biblioentry> - - <biblioentry> - <abbrev>comp-by-trans-scp</abbrev> - <authorgroup> - <author><firstname>Simon L.</firstname><surname>Peyton-Jones</surname></author> - <author><firstname>A. L. M.</firstname><surname>Santos</surname></author> - </authorgroup> - <citetitle pubwork="article">A transformation-based optimiser for Haskell</citetitle> - <citetitle pubwork="journal">Science of Computer Programming</citetitle> - <volumenum>32</volumenum> - <issuenum>1-3</issuenum> - <artpagenums>3-47</artpagenums> - <pubdate>1998</pubdate> - <bibliomisc><ulink url="http://citeseer.ist.psu.edu/peytonjones98transformationbased.html">http://citeseer.ist.psu.edu/peytonjones98transformationbased.html</ulink></bibliomisc> - </biblioentry> - - <biblioentry> - <abbrev>stg-machine</abbrev> - <authorgroup> - <author><firstname>Simon L.</firstname><surname>Peyton-Jones</surname></author> - </authorgroup> - <citetitle pubwork="article">Implementing Lazy Functional Languages on Stock Hardware: The Spineless Tagless G-Machine</citetitle> - <citetitle pubwork="journal">Journal of Functional Programming</citetitle> - <volumenum>2</volumenum> - <issuenum>2</issuenum> - <artpagenums>127-202</artpagenums> - <pubdate>1992</pubdate> - <bibliomisc><ulink url="http://citeseer.ist.psu.edu/peytonjones92implementing.html">http://citeseer.ist.psu.edu/peytonjones92implementing.html</ulink></bibliomisc> - </biblioentry> - - <biblioentry> - <abbrev>launchbury93natural</abbrev> - <authorgroup> - <author><firstname>John</firstname><surname>Launchbury</surname></author> - </authorgroup> - <citetitle pubwork="article">A Natural Semantics for Lazy Evaluation</citetitle> - <artpagenums>144-154</artpagenums> - <address> - <city>Charleston</city> - <state>South Carolina</state> - </address> - <pubdate>1993</pubdate> - <bibliomisc><ulink 
url="http://citeseer.ist.psu.edu/launchbury93natural.html">http://citeseer.ist.psu.edu/launchbury93natural.html</ulink></bibliomisc> - <!-- booktitle = "Conference Record of the Twentieth Annual {ACM} {SIGPLAN}-{SIGACT} Symposium on Principles of Programming Languages", --> - </biblioentry> - - <biblioentry> - <abbrev>ghcprim</abbrev> - <authorgroup> - <author><surname>The GHC Team</surname></author> - </authorgroup> - <citetitle pubwork="article">Library documentation: GHC.Prim</citetitle> - <pubdate>2008</pubdate> - <bibliomisc><ulink url="http://www.haskell.org/ghc/docs/latest/html/libraries/base/GHC-Prim.html">http://www.haskell.org/ghc/docs/latest/html/libraries/base/GHC-Prim.html</ulink></bibliomisc> - </biblioentry> - </bibliolist> - -</chapter> diff --git a/docs/users_guide/flags.xml b/docs/users_guide/flags.xml index 593bf4b1ef..1dd224a611 100644 --- a/docs/users_guide/flags.xml +++ b/docs/users_guide/flags.xml @@ -705,6 +705,12 @@ </thead> <tbody> <row> + <entry><option>-fcontext-stack=N</option><replaceable>n</replaceable></entry> + <entry>set the <link linkend="undecidable-instances">limit for context reduction</link>. Default is 20.</entry> + <entry>dynamic</entry> + <entry></entry> + </row> + <row> <entry><option>-fglasgow-exts</option></entry> <entry>Deprecated. Enable most language extensions; see <xref linkend="options-language"/> for exactly which ones.</entry> <entry>dynamic</entry> @@ -717,10 +723,10 @@ <entry><option>-fno-irrefutable-tuples</option></entry> </row> <row> - <entry><option>-fcontext-stack=N</option><replaceable>n</replaceable></entry> - <entry>set the <link linkend="undecidable-instances">limit for context reduction</link>. 
Default is 20.</entry> + <entry><option>-fpackage-trust</option></entry> + <entry>Enable <link linkend="safe-haskell">Safe Haskell</link> trusted package requirement for trustworthy modules.</entry> <entry>dynamic</entry> - <entry></entry> + <entry><option>-</option></entry> </row> <row> <entry><option>-ftype-function-depth=N</option><replaceable>n</replaceable></entry> @@ -751,65 +757,168 @@ <entry><option>-XNoAutoDeriveTypeable</option></entry> </row> <row> + <entry><option>-XBangPatterns</option></entry> + <entry>Enable <link linkend="bang-patterns">bang patterns</link>.</entry> + <entry>dynamic</entry> + <entry><option>-XNoBangPatterns</option></entry> + </row> + <row> + <entry><option>-XBinaryLiterals</option></entry> + <entry>Enable support for <link linkend="binary-literals">binary literals</link>.</entry> + <entry>dynamic</entry> + <entry><option>-XNoBinaryLiterals</option></entry> + </row> + <row> + <entry><option>-XCApiFFI</option></entry> + <entry>Enable <link linkend="ffi-capi">the CAPI calling convention</link>.</entry> + <entry>dynamic</entry> + <entry><option>-XNoCAPIFFI</option></entry> + </row> + <row> + <entry><option>-XConstrainedClassMethods</option></entry> + <entry>Enable <link linkend="class-method-types">constrained class methods</link>.</entry> + <entry>dynamic</entry> + <entry><option>-XNoConstrainedClassMethods</option></entry> + </row> + <row> <entry><option>-XConstraintKinds</option></entry> <entry>Enable a <link linkend="constraint-kind">kind of constraints</link>.</entry> <entry>dynamic</entry> <entry><option>-XNoConstraintKinds</option></entry> </row> <row> + <entry><option>-XCPP</option></entry> + <entry>Enable the <link linkend="c-pre-processor">C preprocessor</link>.</entry> + <entry>dynamic</entry> + <entry><option>-XNoCPP</option></entry> + </row> + <row> <entry><option>-XDataKinds</option></entry> <entry>Enable <link linkend="promotion">datatype promotion</link>.</entry> <entry>dynamic</entry> 
<entry><option>-XNoDataKinds</option></entry> </row> <row> + <entry><option>-XDefaultSignatures</option></entry> + <entry>Enable <link linkend="class-default-signatures">default signatures</link>.</entry> + <entry>dynamic</entry> + <entry><option>-XNoDefaultSignatures</option></entry> + </row> + <row> <entry><option>-XDeriveDataTypeable</option></entry> - <entry>Enable <link linkend="deriving-typeable">deriving for the Data and Typeable classes</link>.</entry> + <entry>Enable <link linkend="deriving-typeable">deriving for the Data and Typeable classes</link>. + Implied by <option>-XAutoDeriveTypeable</option>.</entry> <entry>dynamic</entry> <entry><option>-XNoDeriveDataTypeable</option></entry> </row> <row> + <entry><option>-XDeriveFunctor</option></entry> + <entry>Enable <link linkend="deriving-extra">deriving for the Functor class</link>. + Implied by <option>-XDeriveTraversable</option>.</entry> + <entry>dynamic</entry> + <entry><option>-XNoDeriveFunctor</option></entry> + </row> + <row> + <entry><option>-XDeriveFoldable</option></entry> + <entry>Enable <link linkend="deriving-extra">deriving for the Foldable class</link>. + Implied by <option>-XDeriveTraversable</option>.</entry> + <entry>dynamic</entry> + <entry><option>-XNoDeriveFoldable</option></entry> + </row> + <row> <entry><option>-XDeriveGeneric</option></entry> <entry>Enable <link linkend="deriving-typeable">deriving for the Generic class</link>.</entry> <entry>dynamic</entry> <entry><option>-XNoDeriveGeneric</option></entry> </row> <row> - <entry><option>-XGeneralizedNewtypeDeriving</option></entry> - <entry>Enable <link linkend="newtype-deriving">newtype deriving</link>.</entry> + <entry><option>-XDeriveTraversable</option></entry> + <entry>Enable <link linkend="deriving-extra">deriving for the Traversable class</link>. 
+ Implies <option>-XDeriveFunctor</option> and <option>-XDeriveFoldable</option>.</entry> <entry>dynamic</entry> - <entry><option>-XNoGeneralizedNewtypeDeriving</option></entry> + <entry><option>-XNoDeriveTraversable</option></entry> </row> <row> <entry><option>-XDisambiguateRecordFields</option></entry> - <entry>Enable <link linkend="disambiguate-fields">record - field disambiguation</link></entry> + <entry>Enable <link linkend="disambiguate-fields">record field disambiguation</link>. + Implied by <option>-XRecordWildCards</option>.</entry> <entry>dynamic</entry> <entry><option>-XNoDisambiguateRecordFields</option></entry> </row> <row> <entry><option>-XEmptyCase</option></entry> - <entry>Allow <link linkend="empty-case">empty case alternatives</link> - </entry> + <entry>Allow <link linkend="empty-case">empty case alternatives</link>.</entry> <entry>dynamic</entry> <entry><option>-XNoEmptyCase</option></entry> </row> <row> + <entry><option>-XEmptyDataDecls</option></entry> + <entry>Enable empty data declarations.</entry> + <entry>dynamic</entry> + <entry><option>-XNoEmptyDataDecls</option></entry> + </row> + <row> + <entry><option>-XExistentialQuantification</option></entry> + <entry>Enable <link linkend="existential-quantification">existential quantification</link>.</entry> + <entry>dynamic</entry> + <entry><option>-XNoExistentialQuantification</option></entry> + </row> + <row> + <entry><option>-XExplicitForAll</option></entry> + <entry>Enable <link linkend="explicit-foralls">explicit universal quantification</link>. + Implied by <option>-XScopedTypeVariables</option>, + <option>-XLiberalTypeSynonyms</option>, + <option>-XRankNTypes</option> and + <option>-XExistentialQuantification</option>. 
+ </entry> + <entry>dynamic</entry> + <entry><option>-XNoExplicitForAll</option></entry> + </row> + <row> + <entry><option>-XExplicitNamespaces</option></entry> + <entry>Enable using the keyword <literal>type</literal> to specify the namespace of + entries in imports and exports (<xref linkend="explicit-namespaces"/>). + Implied by <option>-XTypeOperators</option> and <option>-XTypeFamilies</option>.</entry> + <entry>dynamic</entry> + <entry><option>-XNoExplicitNamespaces</option></entry> + </row> + <row> <entry><option>-XExtendedDefaultRules</option></entry> - <entry>Use GHCi's <link linkend="extended-default-rules">extended default rules</link> in a normal module</entry> + <entry>Use GHCi's <link linkend="extended-default-rules">extended default rules</link> in a normal module.</entry> <entry>dynamic</entry> <entry><option>-XNoExtendedDefaultRules</option></entry> </row> <row> + <entry><option>-XFlexibleContexts</option></entry> + <entry>Enable <link linkend="flexible-contexts">flexible contexts</link>. + Implied by <option>-XImplicitParams</option>.</entry> + <entry>dynamic</entry> + <entry><option>-XNoFlexibleContexts</option></entry> + </row> + <row> + <entry><option>-XFlexibleInstances</option></entry> + <entry>Enable <link linkend="instance-rules">flexible instances</link>. + Implies <option>-XTypeSynonymInstances</option>. Implied by <option>-XImplicitParams</option>.</entry> + <entry>dynamic</entry> + <entry><option>-XNoFlexibleInstances</option></entry> + </row> + <row> <entry><option>-XForeignFunctionInterface</option></entry> <entry>Enable <link linkend="ffi">foreign function interface</link>.</entry> <entry>dynamic</entry> <entry><option>-XNoForeignFunctionInterface</option></entry> </row> <row> + <entry><option>-XFunctionalDependencies</option></entry> + <entry>Enable <link linkend="functional-dependencies">functional dependencies</link>. 
+ Implies <option>-XMultiParamTypeClasses</option>.</entry> + <entry>dynamic</entry> + <entry><option>-XNoFunctionalDependencies</option></entry> + </row> + <row> <entry><option>-XGADTs</option></entry> <entry>Enable <link linkend="gadt">generalised algebraic data types</link>. - </entry> + Implies <option>-XGADTSyntax</option> and <option>-XMonoLocalBinds</option>.</entry> <entry>dynamic</entry> <entry><option>-XNoGADTs</option></entry> </row> @@ -821,6 +930,12 @@ <entry><option>-XNoGADTSyntax</option></entry> </row> <row> + <entry><option>-XGeneralizedNewtypeDeriving</option></entry> + <entry>Enable <link linkend="newtype-deriving">newtype deriving</link>.</entry> + <entry>dynamic</entry> + <entry><option>-XNoGeneralizedNewtypeDeriving</option></entry> + </row> + <row> <entry><option>-XGenerics</option></entry> <entry>Deprecated, does nothing. No longer enables <link linkend="generic-classes">generic classes</link>. See also GHC's support for @@ -830,103 +945,74 @@ </row> <row> <entry><option>-XImplicitParams</option></entry> - <entry>Enable <link linkend="implicit-parameters">Implicit Parameters</link>.</entry> + <entry>Enable <link linkend="implicit-parameters">Implicit Parameters</link>. + Implies <option>-XFlexibleContexts</option> and <option>-XFlexibleInstances</option>.</entry> <entry>dynamic</entry> <entry><option>-XNoImplicitParams</option></entry> </row> <row> <entry><option>-XNoImplicitPrelude</option></entry> - <entry>Don't implicitly <literal>import Prelude</literal></entry> + <entry>Don't implicitly <literal>import Prelude</literal>. + Implied by <option>-XRebindableSyntax</option>.</entry> <entry>dynamic</entry> <entry><option>-XImplicitPrelude</option></entry> </row> <row> - <entry><option>-XIncoherentInstances</option></entry> - <entry>Enable <link linkend="instance-overlap">incoherent instances</link>. 
- Implies <option>-XOverlappingInstances</option> </entry> - <entry>dynamic</entry> - <entry><option>-XNoIncoherentInstances</option></entry> - </row> - <row> - <entry><option>-XNoMonomorphismRestriction</option></entry> - <entry>Disable the <link linkend="monomorphism">monomorphism restriction</link></entry> - <entry>dynamic</entry> - <entry><option>-XMonomorphismRrestriction</option></entry> - </row> - <row> - <entry><option>-XNegativeLiterals</option></entry> - <entry>Enable support for <link linkend="negative-literals">negative literals</link></entry> - <entry>dynamic</entry> - <entry><option>-XNoNegativeLiterals</option></entry> - </row> - <row> - <entry><option>-XNoNPlusKPatterns</option></entry> - <entry>Disable support for <literal>n+k</literal> patterns</entry> - <entry>dynamic</entry> - <entry><option>-XNPlusKPatterns</option></entry> - </row> - <row> - <entry><option>-XNumDecimals</option></entry> - <entry>Enable support for 'fractional' integer literals</entry> - <entry>dynamic</entry> - <entry><option>-XNoNumDecimals</option></entry> - </row> - <row> - <entry><option>-XOverlappingInstances</option></entry> - <entry>Enable <link linkend="instance-overlap">overlapping instances</link></entry> + <entry><option>-XImpredicativeTypes</option></entry> + <entry>Enable <link linkend="impredicative-polymorphism">impredicative types</link>. + Implies <option>-XRankNTypes</option>.</entry> <entry>dynamic</entry> - <entry><option>-XNoOverlappingInstances</option></entry> + <entry><option>-XNoImpredicativeTypes</option></entry> </row> <row> - <entry><option>-XOverloadedLists</option></entry> - <entry>Enable <link linkend="overloaded-lists">overloaded lists</link>. - </entry> + <entry><option>-XIncoherentInstances</option></entry> + <entry>Enable <link linkend="instance-overlap">incoherent instances</link>. 
+ Implies <option>-XOverlappingInstances</option>.</entry> <entry>dynamic</entry> - <entry><option>-XNoOverloadedLists</option></entry> + <entry><option>-XNoIncoherentInstances</option></entry> </row> <row> - <entry><option>-XOverloadedStrings</option></entry> - <entry>Enable <link linkend="overloaded-strings">overloaded string literals</link>. - </entry> + <entry><option>-XInstanceSigs</option></entry> + <entry>Enable <link linkend="instance-sigs">instance signatures</link>.</entry> <entry>dynamic</entry> - <entry><option>-XNoOverloadedStrings</option></entry> + <entry><option>-XNoInstanceSigs</option></entry> </row> <row> - <entry><option>-XQuasiQuotes</option></entry> - <entry>Enable <link linkend="th-quasiquotation">quasiquotation</link>.</entry> + <entry><option>-XInterruptibleFFI</option></entry> + <entry>Enable interruptible FFI.</entry> <entry>dynamic</entry> - <entry><option>-XNoQuasiQuotes</option></entry> + <entry><option>-XNoInterruptibleFFI</option></entry> </row> <row> - <entry><option>-XRelaxedPolyRec</option></entry> - <entry>Relaxed checking for <link linkend="typing-binds">mutually-recursive polymorphic functions</link></entry> + <entry><option>-XKindSignatures</option></entry> + <entry>Enable <link linkend="kinding">kind signatures</link>. 
+ Implied by <option>-XTypeFamilies</option> and <option>-XPolyKinds</option>.</entry> <entry>dynamic</entry> - <entry><option>-XNoRelaxedPolyRec</option></entry> + <entry><option>-XNoKindSignatures</option></entry> </row> <row> - <entry><option>-XNoTraditionalRecordSyntax</option></entry> - <entry>Disable support for traditional record syntax (as supported by Haskell 98) <literal>C {f = x}</literal></entry> + <entry><option>-XLambdaCase</option></entry> + <entry>Enable <link linkend="lambda-case">lambda-case expressions</link>.</entry> <entry>dynamic</entry> - <entry><option>-XTraditionalRecordSyntax</option></entry> + <entry><option>-XNoLambdaCase</option></entry> </row> <row> - <entry><option>-XTypeFamilies</option></entry> - <entry>Enable <link linkend="type-families">type families</link>.</entry> + <entry><option>-XLiberalTypeSynonyms</option></entry> + <entry>Enable <link linkend="type-synonyms">liberalised type synonyms</link>.</entry> <entry>dynamic</entry> - <entry><option>-XNoTypeFamilies</option></entry> + <entry><option>-XNoLiberalTypeSynonyms</option></entry> </row> <row> - <entry><option>-XUndecidableInstances</option></entry> - <entry>Enable <link linkend="undecidable-instances">undecidable instances</link></entry> + <entry><option>-XMagicHash</option></entry> + <entry>Allow "#" as a <link linkend="magic-hash">postfix modifier on identifiers</link>.</entry> <entry>dynamic</entry> - <entry><option>-XNoUndecidableInstances</option></entry> + <entry><option>-XNoMagicHash</option></entry> </row> <row> - <entry><option>-XPolyKinds</option></entry> - <entry>Enable <link linkend="kind-polymorphism">kind polymorphism</link>. 
- Implies <option>-XKindSignatures</option>.</entry> + <entry><option>-XMonadComprehensions</option></entry> + <entry>Enable <link linkend="monad-comprehensions">monad comprehensions</link>.</entry> <entry>dynamic</entry> - <entry><option>-XNoPolyKinds</option></entry> + <entry><option>-XNoMonadComprehensions</option></entry> </row> <row> <entry><option>-XMonoLocalBinds</option></entry> @@ -935,164 +1021,159 @@ </entry> <entry>dynamic</entry> <entry><option>-XNoMonoLocalBinds</option></entry> - </row> + </row> <row> - <entry><option>-XRebindableSyntax</option></entry> - <entry>Employ <link linkend="rebindable-syntax">rebindable syntax</link></entry> + <entry><option>-XNoMonomorphismRestriction</option></entry> + <entry>Disable the <link linkend="monomorphism">monomorphism restriction</link>.</entry> <entry>dynamic</entry> - <entry><option>-XNoRebindableSyntax</option></entry> + <entry><option>-XMonomorphismRestriction</option></entry> </row> <row> - <entry><option>-XScopedTypeVariables</option></entry> - <entry>Enable <link linkend="scoped-type-variables">lexically-scoped type variables</link>. - </entry> + <entry><option>-XMultiParamTypeClasses</option></entry> + <entry>Enable <link linkend="multi-param-type-classes">multi parameter type classes</link>. 
+ Implied by <option>-XFunctionalDependencies</option>.</entry> <entry>dynamic</entry> - <entry><option>-XNoScopedTypeVariables</option></entry> + <entry><option>-XNoMultiParamTypeClasses</option></entry> </row> <row> - <entry><option>-XTemplateHaskell</option></entry> - <entry>Enable <link linkend="template-haskell">Template Haskell</link>.</entry> + <entry><option>-XMultiWayIf</option></entry> + <entry>Enable <link linkend="multi-way-if">multi-way if-expressions</link>.</entry> <entry>dynamic</entry> - <entry><option>-XNoTemplateHaskell</option></entry> + <entry><option>-XNoMultiWayIf</option></entry> </row> <row> - <entry><option>-XBangPatterns</option></entry> - <entry>Enable <link linkend="bang-patterns">bang patterns</link>.</entry> + <entry><option>-XNamedFieldPuns</option></entry> + <entry>Enable <link linkend="record-puns">record puns</link>.</entry> <entry>dynamic</entry> - <entry><option>-XNoBangPatterns</option></entry> + <entry><option>-XNoNamedFieldPuns</option></entry> </row> <row> - <entry><option>-XCPP</option></entry> - <entry>Enable the <link linkend="c-pre-processor">C preprocessor</link>.</entry> + <entry><option>-XNegativeLiterals</option></entry> + <entry>Enable support for <link linkend="negative-literals">negative literals</link>.</entry> <entry>dynamic</entry> - <entry><option>-XNoCPP</option></entry> + <entry><option>-XNoNegativeLiterals</option></entry> </row> <row> - <entry><option>-XPatternGuards</option></entry> - <entry>Enable <link linkend="pattern-guards">pattern guards</link>.</entry> + <entry><option>-XNoNPlusKPatterns</option></entry> + <entry>Disable support for <literal>n+k</literal> patterns.</entry> <entry>dynamic</entry> - <entry><option>-XNoPatternGuards</option></entry> + <entry><option>-XNPlusKPatterns</option></entry> </row> <row> - <entry><option>-XViewPatterns</option></entry> - <entry>Enable <link linkend="view-patterns">view patterns</link>.</entry> + <entry><option>-XNullaryTypeClasses</option></entry> + 
<entry>Deprecated, does nothing. <link linkend="nullary-type-classes">nullary (no parameter) type classes</link> are now enabled using <option>-XMultiParamTypeClasses</option>.</entry> <entry>dynamic</entry> - <entry><option>-XNoViewPatterns</option></entry> + <entry><option>-XNoNullaryTypeClasses</option></entry> </row> <row> - <entry><option>-XUnicodeSyntax</option></entry> - <entry>Enable <link linkend="unicode-syntax">unicode syntax</link>.</entry> + <entry><option>-XNumDecimals</option></entry> + <entry>Enable support for 'fractional' integer literals.</entry> <entry>dynamic</entry> - <entry><option>-XNoUnicodeSyntax</option></entry> + <entry><option>-XNoNumDecimals</option></entry> </row> <row> - <entry><option>-XMagicHash</option></entry> - <entry>Allow "#" as a <link linkend="magic-hash">postfix modifier on identifiers</link>.</entry> + <entry><option>-XOverlappingInstances</option></entry> + <entry>Enable <link linkend="instance-overlap">overlapping instances</link>.</entry> <entry>dynamic</entry> - <entry><option>-XNoMagicHash</option></entry> + <entry><option>-XNoOverlappingInstances</option></entry> </row> <row> - <entry><option>-XExplicitForAll</option></entry> - <entry>Enable <link linkend="explicit-foralls">explicit universal quantification</link>. - Implied by <option>-XScopedTypeVariables</option>, - <option>-XLiberalTypeSynonyms</option>, - <option>-XRankNTypes</option>, - <option>-XExistentialQuantification</option> + <entry><option>-XOverloadedLists</option></entry> + <entry>Enable <link linkend="overloaded-lists">overloaded lists</link>. 
</entry> <entry>dynamic</entry> - <entry><option>-XNoExplicitForAll</option></entry> - </row> - <row> - <entry><option>-XPolymorphicComponents</option></entry> - <entry>Enable <link linkend="universal-quantification">polymorphic components for data constructors</link>.</entry> - <entry>dynamic, synonym for <option>-XRankNTypes</option></entry> - <entry><option>-XNoPolymorphicComponents</option></entry> + <entry><option>-XNoOverloadedLists</option></entry> </row> <row> - <entry><option>-XRank2Types</option></entry> - <entry>Enable <link linkend="universal-quantification">rank-2 types</link>.</entry> - <entry>dynamic, synonym for <option>-XRankNTypes</option></entry> - <entry><option>-XNoRank2Types</option></entry> + <entry><option>-XOverloadedStrings</option></entry> + <entry>Enable <link linkend="overloaded-strings">overloaded string literals</link>. + </entry> + <entry>dynamic</entry> + <entry><option>-XNoOverloadedStrings</option></entry> </row> <row> - <entry><option>-XRankNTypes</option></entry> - <entry>Enable <link linkend="universal-quantification">rank-N types</link>.</entry> + <entry><option>-XPackageImports</option></entry> + <entry>Enable <link linkend="package-imports">package-qualified imports</link>.</entry> <entry>dynamic</entry> - <entry><option>-XNoRankNTypes</option></entry> + <entry><option>-XNoPackageImports</option></entry> </row> <row> - <entry><option>-XImpredicativeTypes</option></entry> - <entry>Enable <link linkend="impredicative-polymorphism">impredicative types</link>.</entry> + <entry><option>-XParallelArrays</option></entry> + <entry>Enable parallel arrays. 
+ Implies <option>-XParallelListComp</option>.</entry> <entry>dynamic</entry> - <entry><option>-XNoImpredicativeTypes</option></entry> + <entry><option>-XNoParallelArrays</option></entry> </row> <row> - <entry><option>-XExistentialQuantification</option></entry> - <entry>Enable <link linkend="existential-quantification">existential quantification</link>.</entry> + <entry><option>-XParallelListComp</option></entry> + <entry>Enable <link linkend="parallel-list-comprehensions">parallel list comprehensions</link>. + Implied by <option>-XParallelArrays</option>.</entry> <entry>dynamic</entry> - <entry><option>-XNoExistentialQuantification</option></entry> + <entry><option>-XNoParallelListComp</option></entry> </row> <row> - <entry><option>-XKindSignatures</option></entry> - <entry>Enable <link linkend="kinding">kind signatures</link>.</entry> + <entry><option>-XPatternGuards</option></entry> + <entry>Enable <link linkend="pattern-guards">pattern guards</link>.</entry> <entry>dynamic</entry> - <entry><option>-XNoKindSignatures</option></entry> + <entry><option>-XNoPatternGuards</option></entry> </row> <row> - <entry><option>-XEmptyDataDecls</option></entry> - <entry>Enable empty data declarations.</entry> + <entry><option>-XPatternSynonyms</option></entry> + <entry>Enable <link linkend="pattern-synonyms">pattern synonyms</link>.</entry> <entry>dynamic</entry> - <entry><option>-XNoEmptyDataDecls</option></entry> + <entry><option>-XNoPatternSynonyms</option></entry> </row> <row> - <entry><option>-XParallelListComp</option></entry> - <entry>Enable <link linkend="parallel-list-comprehensions">parallel list comprehensions</link>.</entry> + <entry><option>-XPolyKinds</option></entry> + <entry>Enable <link linkend="kind-polymorphism">kind polymorphism</link>. 
+ Implies <option>-XKindSignatures</option>.</entry> <entry>dynamic</entry> - <entry><option>-XNoParallelListComp</option></entry> + <entry><option>-XNoPolyKinds</option></entry> </row> <row> - <entry><option>-XTransformListComp</option></entry> - <entry>Enable <link linkend="generalised-list-comprehensions">generalised list comprehensions</link>.</entry> - <entry>dynamic</entry> - <entry><option>-XNoTransformListComp</option></entry> + <entry><option>-XPolymorphicComponents</option></entry> + <entry>Enable <link linkend="universal-quantification">polymorphic components for data constructors</link>.</entry> + <entry>dynamic, synonym for <option>-XRankNTypes</option></entry> + <entry><option>-XNoPolymorphicComponents</option></entry> </row> <row> - <entry><option>-XMonadComprehensions</option></entry> - <entry>Enable <link linkend="monad-comprehensions">monad comprehensions</link>.</entry> + <entry><option>-XPostfixOperators</option></entry> + <entry>Enable <link linkend="postfix-operators">postfix operators</link>.</entry> <entry>dynamic</entry> - <entry><option>-XNoMonadComprehensions</option></entry> + <entry><option>-XNoPostfixOperators</option></entry> </row> <row> - <entry><option>-XUnliftedFFITypes</option></entry> - <entry>Enable unlifted FFI types.</entry> + <entry><option>-XQuasiQuotes</option></entry> + <entry>Enable <link linkend="th-quasiquotation">quasiquotation</link>.</entry> <entry>dynamic</entry> - <entry><option>-XNoUnliftedFFITypes</option></entry> + <entry><option>-XNoQuasiQuotes</option></entry> </row> <row> - <entry><option>-XInterruptibleFFI</option></entry> - <entry>Enable interruptible FFI.</entry> - <entry>dynamic</entry> - <entry><option>-XNoInterruptibleFFI</option></entry> + <entry><option>-XRank2Types</option></entry> + <entry>Enable <link linkend="universal-quantification">rank-2 types</link>.</entry> + <entry>dynamic, synonym for <option>-XRankNTypes</option></entry> + <entry><option>-XNoRank2Types</option></entry> </row> <row> - 
<entry><option>-XLiberalTypeSynonyms</option></entry> - <entry>Enable <link linkend="type-synonyms">liberalised type synonyms</link>.</entry> + <entry><option>-XRankNTypes</option></entry> + <entry>Enable <link linkend="universal-quantification">rank-N types</link>. + Implied by <option>-XImpredicativeTypes</option>.</entry> <entry>dynamic</entry> - <entry><option>-XNoLiberalTypeSynonyms</option></entry> + <entry><option>-XNoRankNTypes</option></entry> </row> <row> - <entry><option>-XTypeOperators</option></entry> - <entry>Enable <link linkend="type-operators">type operators</link>.</entry> + <entry><option>-XRebindableSyntax</option></entry> + <entry>Employ <link linkend="rebindable-syntax">rebindable syntax</link>. + Implies <option>-XNoImplicitPrelude</option>.</entry> <entry>dynamic</entry> - <entry><option>-XNoTypeOperators</option></entry> + <entry><option>-XNoRebindableSyntax</option></entry> </row> <row> - <entry><option>-XExplicitNamespaces</option></entry> - <entry>Enable using the keyword <literal>type</literal> to specify the namespace of - entries in imports and exports (<xref linkend="explicit-namespaces"/>). - Implied by <option>-XTypeOperators</option> and <option>-XTypeFamilies</option>.</entry> + <entry><option>-XRecordWildCards</option></entry> + <entry>Enable <link linkend="record-wildcards">record wildcards</link>. 
+ Implies <option>-XDisambiguateRecordFields</option>.</entry> <entry>dynamic</entry> - <entry><option>-XNoExplicitNamespaces</option></entry> + <entry><option>-XNoRecordWildCards</option></entry> </row> <row> <entry><option>-XRecursiveDo</option></entry> @@ -1101,34 +1182,30 @@ <entry><option>-XNoRecursiveDo</option></entry> </row> <row> - <entry><option>-XParallelArrays</option></entry> - <entry>Enable parallel arrays.</entry> - <entry>dynamic</entry> - <entry><option>-XNoParallelArrays</option></entry> - </row> - <row> - <entry><option>-XRecordWildCards</option></entry> - <entry>Enable <link linkend="record-wildcards">record wildcards</link>.</entry> + <entry><option>-XRelaxedPolyRec</option></entry> + <entry><emphasis>(deprecated)</emphasis> Relaxed checking for + <link linkend="typing-binds">mutually-recursive polymorphic functions</link>.</entry> <entry>dynamic</entry> - <entry><option>-XNoRecordWildCards</option></entry> + <entry><option>-XNoRelaxedPolyRec</option></entry> </row> <row> - <entry><option>-XNamedFieldPuns</option></entry> - <entry>Enable <link linkend="record-puns">record puns</link>.</entry> + <entry><option>-XRoleAnnotations</option></entry> + <entry>Enable <link linkend="role-annotations">role annotations</link>.</entry> <entry>dynamic</entry> - <entry><option>-XNoNamedFieldPuns</option></entry> + <entry><option>-XNoRoleAnnotations</option></entry> </row> <row> - <entry><option>-XDisambiguateRecordFields</option></entry> - <entry>Enable <link linkend="disambiguate-fields">record field disambiguation</link>. 
</entry> + <entry><option>-XSafe</option></entry> + <entry>Enable the <link linkend="safe-haskell">Safe Haskell</link> Safe mode.</entry> <entry>dynamic</entry> - <entry><option>-XNoDisambiguateRecordFields</option></entry> + <entry><option>-</option></entry> </row> <row> - <entry><option>-XUnboxedTuples</option></entry> - <entry>Enable <link linkend="unboxed-tuples">unboxed tuples</link>.</entry> + <entry><option>-XScopedTypeVariables</option></entry> + <entry>Enable <link linkend="scoped-type-variables">lexically-scoped type variables</link>. + </entry> <entry>dynamic</entry> - <entry><option>-XNoUnboxedTuples</option></entry> + <entry><option>-XNoScopedTypeVariables</option></entry> </row> <row> <entry><option>-XStandaloneDeriving</option></entry> @@ -1137,83 +1214,80 @@ <entry><option>-XNoStandaloneDeriving</option></entry> </row> <row> - <entry><option>-XTypeSynonymInstances</option></entry> - <entry>Enable <link linkend="flexible-instance-head">type synonyms in instance heads</link>.</entry> - <entry>dynamic</entry> - <entry><option>-XNoTypeSynonymInstances</option></entry> - </row> - <row> - <entry><option>-XFlexibleContexts</option></entry> - <entry>Enable <link linkend="flexible-contexts">flexible contexts</link>.</entry> + <entry><option>-XTemplateHaskell</option></entry> + <entry>Enable <link linkend="template-haskell">Template Haskell</link>.</entry> <entry>dynamic</entry> - <entry><option>-XNoFlexibleContexts</option></entry> + <entry><option>-XNoTemplateHaskell</option></entry> </row> <row> - <entry><option>-XFlexibleInstances</option></entry> - <entry>Enable <link linkend="instance-rules">flexible instances</link>. 
- Implies <option>-XTypeSynonymInstances</option> </entry> + <entry><option>-XNoTraditionalRecordSyntax</option></entry> + <entry>Disable support for traditional record syntax (as supported by Haskell 98) <literal>C {f = x}</literal></entry> <entry>dynamic</entry> - <entry><option>-XNoFlexibleInstances</option></entry> + <entry><option>-XTraditionalRecordSyntax</option></entry> </row> <row> - <entry><option>-XConstrainedClassMethods</option></entry> - <entry>Enable <link linkend="class-method-types">constrained class methods</link>.</entry> + <entry><option>-XTransformListComp</option></entry> + <entry>Enable <link linkend="generalised-list-comprehensions">generalised list comprehensions</link>.</entry> <entry>dynamic</entry> - <entry><option>-XNoConstrainedClassMethods</option></entry> + <entry><option>-XNoTransformListComp</option></entry> </row> <row> - <entry><option>-XDefaultSignatures</option></entry> - <entry>Enable <link linkend="class-default-signatures">default signatures</link>.</entry> + <entry><option>-XTrustworthy</option></entry> + <entry>Enable the <link linkend="safe-haskell">Safe Haskell</link> Trustworthy mode.</entry> <entry>dynamic</entry> - <entry><option>-XNoDefaultSignatures</option></entry> + <entry><option>-</option></entry> </row> <row> - <entry><option>-XMultiParamTypeClasses</option></entry> - <entry>Enable <link linkend="multi-param-type-classes">multi parameter type classes</link>.</entry> + <entry><option>-XTupleSections</option></entry> + <entry>Enable <link linkend="tuple-sections">tuple sections</link>.</entry> <entry>dynamic</entry> - <entry><option>-XNoMultiParamTypeClasses</option></entry> + <entry><option>-XNoTupleSections</option></entry> </row> <row> - <entry><option>-XNullaryTypeClasses</option></entry> - <entry>Enable <link linkend="nullary-type-classes">nullary (no parameter) type classes</link>.</entry> + <entry><option>-XTypeFamilies</option></entry> + <entry>Enable <link linkend="type-families">type families</link>. 
+ Implies <option>-XExplicitNamespaces</option>, <option>-XKindSignatures</option> + and <option>-XMonoLocalBinds</option>.</entry> <entry>dynamic</entry> - <entry><option>-XNoNullaryTypeClasses</option></entry> + <entry><option>-XNoTypeFamilies</option></entry> </row> <row> - <entry><option>-XFunctionalDependencies</option></entry> - <entry>Enable <link linkend="functional-dependencies">functional dependencies</link>.</entry> + <entry><option>-XTypeOperators</option></entry> + <entry>Enable <link linkend="type-operators">type operators</link>. + Implies <option>-XExplicitNamespaces</option>.</entry> <entry>dynamic</entry> - <entry><option>-XNoFunctionalDependencies</option></entry> + <entry><option>-XNoTypeOperators</option></entry> </row> <row> - <entry><option>-XPackageImports</option></entry> - <entry>Enable <link linkend="package-imports">package-qualified imports</link>.</entry> + <entry><option>-XTypeSynonymInstances</option></entry> + <entry>Enable <link linkend="flexible-instance-head">type synonyms in instance heads</link>. 
+ Implied by <option>-XFlexibleInstances</option>.</entry> <entry>dynamic</entry> - <entry><option>-XNoPackageImports</option></entry> + <entry><option>-XNoTypeSynonymInstances</option></entry> </row> <row> - <entry><option>-XLambdaCase</option></entry> - <entry>Enable <link linkend="lambda-case">lambda-case expressions</link>.</entry> + <entry><option>-XUnboxedTuples</option></entry> + <entry>Enable <link linkend="unboxed-tuples">unboxed tuples</link>.</entry> <entry>dynamic</entry> - <entry><option>-XNoLambdaCase</option></entry> + <entry><option>-XNoUnboxedTuples</option></entry> </row> <row> - <entry><option>-XMultiWayIf</option></entry> - <entry>Enable <link linkend="multi-way-if">multi-way if-expressions</link>.</entry> + <entry><option>-XUndecidableInstances</option></entry> + <entry>Enable <link linkend="undecidable-instances">undecidable instances</link>.</entry> <entry>dynamic</entry> - <entry><option>-XNoMultiWayIf</option></entry> + <entry><option>-XNoUndecidableInstances</option></entry> </row> <row> - <entry><option>-XSafe</option></entry> - <entry>Enable the <link linkend="safe-haskell">Safe Haskell</link> Safe mode.</entry> + <entry><option>-XUnicodeSyntax</option></entry> + <entry>Enable <link linkend="unicode-syntax">unicode syntax</link>.</entry> <entry>dynamic</entry> - <entry><option>-</option></entry> + <entry><option>-XNoUnicodeSyntax</option></entry> </row> <row> - <entry><option>-XTrustworthy</option></entry> - <entry>Enable the <link linkend="safe-haskell">Safe Haskell</link> Trustworthy mode.</entry> + <entry><option>-XUnliftedFFITypes</option></entry> + <entry>Enable unlifted FFI types.</entry> <entry>dynamic</entry> - <entry><option>-</option></entry> + <entry><option>-XNoUnliftedFFITypes</option></entry> </row> <row> <entry><option>-XUnsafe</option></entry> @@ -1222,10 +1296,10 @@ <entry><option>-</option></entry> </row> <row> - <entry><option>-fpackage-trust</option></entry> - <entry>Enable <link linkend="safe-haskell">Safe 
Haskell</link> trusted package requirement for trustworthy modules.</entry> + <entry><option>-XViewPatterns</option></entry> + <entry>Enable <link linkend="view-patterns">view patterns</link>.</entry> <entry>dynamic</entry> - <entry><option>-</option></entry> + <entry><option>-XNoViewPatterns</option></entry> </row> </tbody> </tgroup> @@ -2141,6 +2215,12 @@ <entry>-</entry> </row> <row> + <entry><option>-fwrite-interface</option></entry> + <entry>Always write interface files</entry> + <entry>dynamic</entry> + <entry>-</entry> + </row> + <row> <entry><option>-fbyte-code</option></entry> <entry>Generate byte-code</entry> <entry>dynamic</entry> @@ -2607,34 +2687,6 @@ <sect2> - <title>External core file options</title> - - <para><xref linkend="ext-core"/></para> - - <informaltable> - <tgroup cols="4" align="left" colsep="1" rowsep="1"> - <thead> - <row> - <entry>Flag</entry> - <entry>Description</entry> - <entry>Static/Dynamic</entry> - <entry>Reverse</entry> - </row> - </thead> - <tbody> - <row> - <entry><option>-fext-core</option></entry> - <entry>Generate <filename>.hcr</filename> external Core files</entry> - <entry>dynamic</entry> - <entry>-</entry> - </row> - </tbody> - </tgroup> - </informaltable> - </sect2> - - - <sect2> <title>Compiler debugging options</title> <para><xref linkend="options-debugging"/></para> diff --git a/docs/users_guide/ghci.xml b/docs/users_guide/ghci.xml index 912ecb25ce..729f96f244 100644 --- a/docs/users_guide/ghci.xml +++ b/docs/users_guide/ghci.xml @@ -2432,7 +2432,9 @@ Prelude> :. cmds.ghci <listitem> <para>Opens an editor to edit the file <replaceable>file</replaceable>, or the most recently loaded - module if <replaceable>file</replaceable> is omitted. The + module if <replaceable>file</replaceable> is omitted. + If there were errors during the last loading, + the cursor will be positioned at the line of the first error. 
The editor to invoke is taken from the <literal>EDITOR</literal> environment variable, or a default editor on your system if <literal>EDITOR</literal> is not set. You can change the @@ -3294,12 +3296,38 @@ Prelude> :set -fno-warn-incomplete-patterns -XNoMultiParamTypeClasses <title>Setting options for interactive evaluation only</title> <para> - GHCi actually maintains two sets of options: one set that - applies when loading modules, and another set that applies for - expressions and commands typed at the prompt. The - <literal>:set</literal> command modifies both, but there is + GHCi actually maintains <emphasis>two</emphasis> sets of options: +<itemizedlist> +<listitem><para> + The <emphasis>loading options</emphasis> apply when loading modules. +</para></listitem> +<listitem><para> + The <emphasis>interactive options</emphasis> apply when evaluating expressions and commands typed at the GHCi prompt. +</para></listitem> +</itemizedlist> +The <literal>:set</literal> command modifies both, but there is also a <literal>:seti</literal> command (for "set - interactive") that affects only the second set. + interactive") that affects only the interactive options. + </para> + + <para> + It is often useful to change an interactive option + without having it apply to loaded modules + too. For example +<screen> +:seti -XMonoLocalBinds +</screen> + It would be undesirable if <option>-XMonoLocalBinds</option> were to + apply to loaded modules too: that might cause a compilation error, but + more commonly it will cause extra recompilation, because GHC will think + that it needs to recompile the module because the flags have changed. + </para> + + <para> + If you are setting language options in your <literal>.ghci</literal> file, it is good practice + to use <literal>:seti</literal> rather than <literal>:set</literal>, + unless you really do want them to apply to all modules you + load in GHCi. 
</para> <para> @@ -3307,8 +3335,6 @@ Prelude> :set -fno-warn-incomplete-patterns -XNoMultiParamTypeClasses <literal>:set</literal> and <literal>:seti</literal> commands respectively, with no arguments. For example, in a clean GHCi session we might see something like this: - </para> - <screen> Prelude> :seti base language is: Haskell2010 @@ -3322,38 +3348,24 @@ other dynamic, non-language, flag settings: -fimplicit-import-qualified warning settings: </screen> - <para> - Note that the option <option>-XExtendedDefaultRules</option> - is on, because we apply special defaulting rules to + </para> +<para> +The two sets of options are initialised as follows. First, both sets of options +are initialised as described in <xref linkend="ghci-dot-files"/>. +Then the interactive options are modified as follows: +<itemizedlist> +<listitem><para> + The option <option>-XExtendedDefaultRules</option> + is enabled, in order to apply special defaulting rules to expressions typed at the prompt (see <xref linkend="extended-default-rules" />). - </para> - - <para> - Furthermore, the Monomorphism Restriction is disabled by default in - GHCi (see <xref linkend="monomorphism" />). - </para> - - <para> - It is often useful to change the language options for expressions typed - at the prompt only, without having that option apply to loaded modules - too. For example -<screen> -:seti -XMonoLocalBinds -</screen> - It would be undesirable if <option>-XMonoLocalBinds</option> were to - apply to loaded modules too: that might cause a compilation error, but - more commonly it will cause extra recompilation, because GHC will think - that it needs to recompile the module because the flags have changed. - </para> + </para></listitem> - <para> - It is therefore good practice if you are setting language - options in your <literal>.ghci</literal> file, to use - <literal>:seti</literal> rather than <literal>:set</literal> - unless you really do want them to apply to all modules you - load in GHCi. 
- </para> +<listitem> <para> + The Monomorphism Restriction is disabled (see <xref linkend="monomorphism" />). + </para></listitem> +</itemizedlist> +</para> </sect2> </sect1> diff --git a/docs/users_guide/glasgow_exts.xml b/docs/users_guide/glasgow_exts.xml index acc796371a..9acb56fc29 100644 --- a/docs/users_guide/glasgow_exts.xml +++ b/docs/users_guide/glasgow_exts.xml @@ -480,6 +480,26 @@ Indeed, the bindings can even be recursive. </para> </sect2> + <sect2 id="binary-literals"> + <title>Binary integer literals</title> + <para> + Haskell 2010 and Haskell 98 allow integer literals to + be given in decimal, octal (prefixed by + <literal>0o</literal> or <literal>0O</literal>), or + hexadecimal notation (prefixed by <literal>0x</literal> or + <literal>0X</literal>). + </para> + + <para> + The language extension <option>-XBinaryLiterals</option> + adds support for expressing integer literals in binary + notation with the prefix <literal>0b</literal> or + <literal>0B</literal>. For instance, the binary integer + literal <literal>0b11001001</literal> will be desugared into + <literal>fromInteger 201</literal> when + <option>-XBinaryLiterals</option> is enabled. + </para> + </sect2> <!-- ====================== HIERARCHICAL MODULES ======================= --> @@ -971,25 +991,27 @@ right-hand side. </para> <para> -The semantics of a unidirectional pattern synonym declaration and -usage are as follows: - -<itemizedlist> +The syntax and semantics of pattern synonyms are elaborated in the +following subsections. +See the <ulink +url="http://ghc.haskell.org/trac/ghc/wiki/PatternSynonyms">Wiki +page</ulink> for more details. +</para> -<listitem> Syntax: +<sect3> <title>Syntax and scoping of pattern synonyms</title> <para> A pattern synonym declaration can be either unidirectional or bidirectional. 
The syntax for unidirectional pattern synonyms is: -</para> <programlisting> pattern Name args <- pat </programlisting> -<para> and the syntax for bidirectional pattern synonyms is: -</para> <programlisting> pattern Name args = pat </programlisting> + Either prefix or infix syntax can be + used. +</para> <para> Pattern synonym declarations can only occur in the top level of a module. In particular, they are not allowed as local @@ -997,20 +1019,6 @@ bidirectional. The syntax for unidirectional pattern synonyms is: technical restriction that will be lifted in later versions. </para> <para> - The name of the pattern synonym itself is in the same namespace as - proper data constructors. Either prefix or infix syntax can be - used. In export/import specifications, you have to prefix pattern - names with the <literal>pattern</literal> keyword, e.g.: -</para> -<programlisting> - module Example (pattern Single) where - pattern Single x = [x] -</programlisting> -</listitem> - -<listitem> Scoping: - -<para> The variables in the left-hand side of the definition are bound by the pattern on the right-hand side. For bidirectional pattern synonyms, all the variables of the right-hand side must also occur @@ -1022,10 +1030,35 @@ bidirectional. The syntax for unidirectional pattern synonyms is: <para> Pattern synonyms cannot be defined recursively. </para> +</sect3> -</listitem> +<sect3 id="patsyn-impexp"> <title>Import and export of pattern synonyms</title> + +<para> + The name of the pattern synonym itself is in the same namespace as + proper data constructors. In an export or import specification, + you must prefix pattern + names with the <literal>pattern</literal> keyword, e.g.: +<programlisting> + module Example (pattern Single) where + pattern Single x = [x] +</programlisting> +Without the <literal>pattern</literal> prefix, <literal>Single</literal> would +be interpreted as a type constructor in the export list. 
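The declaration and export forms above can be exercised in a small, self-contained sketch (the module layout and the helper `unwrap` are invented for illustration):

```haskell
{-# LANGUAGE PatternSynonyms #-}
-- The export list uses the 'pattern' keyword, as described above.
module Main (main, pattern Single) where

-- A bidirectional pattern synonym: usable both as a pattern and
-- as an expression (a constructor for singleton lists).
pattern Single x = [x]

-- A hypothetical helper that matches through the synonym.
unwrap :: [a] -> Maybe a
unwrap (Single x) = Just x
unwrap _          = Nothing

main :: IO ()
main = do
  print (unwrap (Single 'a'))        -- 'Single' used to construct
  print (unwrap ([1, 2] :: [Int]))   -- does not match the synonym
```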
+</para> +<para> +You may also use the <literal>pattern</literal> keyword in an import/export +specification to import or export an ordinary data constructor. For example: +<programlisting> + import Data.Maybe( pattern Just ) +</programlisting> +would bring into scope the data constructor <literal>Just</literal> from the +<literal>Maybe</literal> type, without also bringing the type constructor +<literal>Maybe</literal> into scope. +</para> +</sect3> -<listitem> Typing: +<sect3> <title>Typing of pattern synonyms</title> <para> Given a pattern synonym definition of the form @@ -1100,10 +1133,9 @@ pattern (Show b) => ExNumPat b :: (Num a, Eq a) => T a <programlisting> ExNumPat :: (Show b, Num a, Eq a) => b -> T t </programlisting> +</sect3> -</listitem> - -<listitem> Matching: +<sect3><title>Matching of pattern synonyms</title> <para> A pattern synonym occurrence in a pattern is evaluated by first @@ -1125,8 +1157,6 @@ f' _ = False <para> Note that the strictness of <literal>f</literal> differs from that of <literal>g</literal> defined below: -</para> - <programlisting> g [True, True] = True g _ = False @@ -1136,9 +1166,8 @@ g _ = False *Main> g (False:undefined) False </programlisting> -</listitem> -</itemizedlist> </para> +</sect3> </sect2> @@ -1883,7 +1912,8 @@ the comprehension being over an arbitrary monad. functions <literal>(>>=)</literal>, <literal>(>>)</literal>, and <literal>fail</literal>, are in scope (not the Prelude - versions). List comprehensions, mdo (<xref linkend="recursive-do-notation"/>), and parallel array + versions). List comprehensions, <literal>mdo</literal> + (<xref linkend="recursive-do-notation"/>), and parallel array comprehensions, are unaffected. </para></listitem> <listitem> @@ -2391,6 +2421,35 @@ necessary to enable them. 
</sect2> <sect2 id="package-imports"> +<title>Import and export extensions</title> + +<sect3> + <title>Hiding things the imported module doesn't export</title> + +<para> +Technically in Haskell 2010 this is illegal: +<programlisting> +module A( f ) where + f = True + +module B where + import A hiding( g ) -- A does not export g + g = f +</programlisting> +The <literal>import A hiding( g )</literal> in module <literal>B</literal> +is an error (<ulink url="http://www.haskell.org/onlinereport/haskell2010/haskellch5.html#x11-1020005.3.1">Haskell Report, 5.3.1</ulink>) +because <literal>A</literal> does not export <literal>g</literal>. +However, GHC allows it, in the interests of supporting backward compatibility; for example, a newer version of +<literal>A</literal> might export <literal>g</literal>, and you want <literal>B</literal> to work +in either case. +</para> +<para> +The warning <literal>-fwarn-dodgy-imports</literal>, which is off by default but included with <literal>-W</literal>, +warns if you hide something that the imported module does not export. +</para> +</sect3> + +<sect3> <title>Package-qualified imports</title> <para>With the <option>-XPackageImports</option> flag, GHC allows @@ -2415,9 +2474,9 @@ import "network" Network.Socket packages when APIs change. It can lead to fragile dependencies in the common case: modules occasionally move from one package to another, rendering any package-qualified imports broken.</para> -</sect2> +</sect3> -<sect2 id="safe-imports-ext"> +<sect3 id="safe-imports-ext"> <title>Safe imports</title> <para>With the <option>-XSafe</option>, <option>-XTrustworthy</option> @@ -2435,9 +2494,9 @@ import safe qualified Network.Socket as NS safely imported.
For a description of when an import is considered safe see <xref linkend="safe-haskell"/></para> -</sect2> +</sect3> -<sect2 id="explicit-namespaces"> +<sect3 id="explicit-namespaces"> <title>Explicit namespaces in import/export</title> <para> In an import or export list, such as @@ -2465,6 +2524,14 @@ disambiguate this case, thus: The extension <option>-XExplicitNamespaces</option> is implied by <option>-XTypeOperators</option> and (for some reason) by <option>-XTypeFamilies</option>. </para> +<para> +In addition, with <option>-XPatternSynonyms</option> you can prefix the name of +a data constructor in an import or export list with the keyword <literal>pattern</literal>, +to allow the import or export of a data constructor without its parent type constructor +(see <xref linkend="patsyn-impexp"/>). +</para> +</sect3> + </sect2> <sect2 id="syntax-stolen"> @@ -3882,7 +3949,11 @@ defined in <literal>Data.Foldable</literal>. <listitem><para> With <option>-XDeriveTraversable</option>, you can derive instances of the class <literal>Traversable</literal>, -defined in <literal>Data.Traversable</literal>. +defined in <literal>Data.Traversable</literal>. Since the <literal>Traversable</literal> +instance dictates the instances of <literal>Functor</literal> and +<literal>Foldable</literal>, you'll probably want to derive them too, so +<option>-XDeriveTraversable</option> implies +<option>-XDeriveFunctor</option> and <option>-XDeriveFoldable</option>. </para></listitem> </itemizedlist> You can also use a standalone deriving declaration instead @@ -4317,7 +4388,9 @@ We use default signatures to simplify generic programming in GHC <sect3 id="nullary-type-classes"> <title>Nullary type classes</title> -Nullary (no parameter) type classes are enabled with <option>-XNullaryTypeClasses</option>. +Nullary (no parameter) type classes are enabled with +<option>-XMultiParamTypeClasses</option>; historically, they were enabled with the +(now deprecated) <option>-XNullaryTypeClasses</option>.
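As a runnable sketch (the class and its use are invented for illustration), a nullary class can document a global assumption that a single instance then asserts:

```haskell
{-# LANGUAGE MultiParamTypeClasses #-}

-- A hypothetical nullary class recording an unproven assumption.
class RiemannHypothesis where
  assumeRH :: a -> a

-- Providing the (single possible) instance asserts the assumption.
instance RiemannHypothesis where
  assumeRH = id

-- The constraint documents that this result relies on the assumption.
isPrimeRH :: RiemannHypothesis => Integer -> Bool
isPrimeRH n = assumeRH (n > 1 && all (\d -> n `mod` d /= 0) [2 .. n - 1])

main :: IO ()
main = print (isPrimeRH 7)
```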
Since there are no available parameters, there can be at most one instance of a nullary class. A nullary type class might be used to document some assumption in a type signature (such as reliance on the Riemann hypothesis) or add some @@ -4938,7 +5011,8 @@ with <option>-fcontext-stack=</option><emphasis>N</emphasis>. In general, as discussed in <xref linkend="instance-resolution"/>, <emphasis>GHC requires that it be unambiguous which instance declaration -should be used to resolve a type-class constraint</emphasis>. This behaviour +should be used to resolve a type-class constraint</emphasis>. +This behaviour can be modified by two flags: <option>-XOverlappingInstances</option> <indexterm><primary>-XOverlappingInstances </primary></indexterm> @@ -4947,6 +5021,8 @@ and <option>-XIncoherentInstances</option> </primary></indexterm>, as this section discusses. Both these flags are dynamic flags, and can be set on a per-module basis, using a <literal>LANGUAGE</literal> pragma if desired (<xref linkend="language-pragma"/>).</para> + + <para> The <option>-XOverlappingInstances</option> flag instructs GHC to loosen the instance resolution described in <xref linkend="instance-resolution"/>, by @@ -4954,18 +5030,83 @@ allowing more than one instance to match, <emphasis>provided there is a most specific one</emphasis>. The <option>-XIncoherentInstances</option> flag further loosens the resolution, by allowing more than one instance to match, irrespective of whether there is a most specific one. +The <option>-XIncoherentInstances</option> flag implies the +<option>-XOverlappingInstances</option> flag, but not vice versa. </para> <para> -For example, consider +A more precise specification is as follows.
+The willingness to be overlapped or incoherent is a property of +the <emphasis>instance declaration</emphasis> itself, controlled by the +presence or otherwise of the <option>-XOverlappingInstances</option> +and <option>-XIncoherentInstances</option> flags when that instance declaration is +being compiled. Now suppose that, in some client module, we are searching for an instance of the +<emphasis>target constraint</emphasis> <literal>(C ty1 .. tyn)</literal>. +The search works like this. +<itemizedlist> +<listitem><para> +Find all instances I that <emphasis>match</emphasis> the target constraint; +that is, the target constraint is a substitution instance of I. These +instance declarations are the <emphasis>candidates</emphasis>. +</para></listitem> + +<listitem><para> +Find all <emphasis>non-candidate</emphasis> instances +that <emphasis>unify</emphasis> with the target constraint. +Such non-candidate instances might match when the target constraint is further +instantiated. If all of them were compiled with +<option>-XIncoherentInstances</option>, proceed; if not, the search fails. +</para></listitem> + +<listitem><para> +Eliminate any candidate IX for which both of the following hold: + +<itemizedlist> +<listitem><para>There is another candidate IY that is strictly more specific; +that is, IY is a substitution instance of IX but not vice versa. +</para></listitem> +<listitem><para>Either IX or IY was compiled with +<option>-XOverlappingInstances</option>. +</para></listitem> +</itemizedlist> + +</para></listitem> + +<listitem><para> +If only one candidate remains, pick it. +Otherwise if all remaining candidates were compiled with +<option>-XIncoherentInstances</option>, pick an arbitrary candidate. +</para></listitem> + +</itemizedlist> +These rules make it possible for a library author to design a library that relies on +overlapping instances without the library client having to know.
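The search described above can be observed in a small sketch (the class and instances are invented here; it uses the per-instance <literal>OVERLAPPING</literal> pragma, which later GHC releases prefer to the module-wide flag):

```haskell
{-# LANGUAGE FlexibleInstances #-}

class Describe a where
  describe :: a -> String

instance Describe Int where
  describe _ = "an Int"

-- A general instance for lists...
instance Describe a => Describe [a] where
  describe _ = "a list"

-- ...and a strictly more specific candidate that overlaps it.
instance {-# OVERLAPPING #-} Describe [Int] where
  describe _ = "a list of Ints"

main :: IO ()
main = do
  putStrLn (describe (1 :: Int))      -- only one candidate matches
  putStrLn (describe [1, 2 :: Int])   -- two candidates; the more specific wins
```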
+</para> +<para> +Errors are reported <emphasis>lazily</emphasis> (when attempting to solve a constraint), rather than <emphasis>eagerly</emphasis> +(when the instances themselves are defined). So for example +<programlisting> + instance C Int b where .. + instance C a Bool where .. +</programlisting> +These potentially overlap, but GHC will not complain about the instance declarations +themselves, regardless of flag settings. If we later try to solve the constraint +<literal>(C Int Char)</literal> then only the first instance matches, and all is well. +Similarly with <literal>(C Bool Bool)</literal>. But if we try to solve <literal>(C Int Bool)</literal>, +both instances match and an error is reported. +</para> + +<para> +As a more substantial example of the rules in action, consider <programlisting> instance context1 => C Int b where ... -- (A) instance context2 => C a Bool where ... -- (B) instance context3 => C a [b] where ... -- (C) instance context4 => C Int [Int] where ... -- (D) </programlisting> -compiled with <option>-XOverlappingInstances</option> enabled. The constraint -<literal>C Int [Int]</literal> matches instances (A), (C) and (D), but the last +compiled with <option>-XOverlappingInstances</option> enabled. Now suppose that the type inference +engine needs to solve the constraint +<literal>C Int [Int]</literal>. This constraint matches instances (A), (C) and (D), but the last is more specific, and hence is chosen. </para> <para>If (D) did not exist then (A) and (C) would still be matched, but neither is @@ -4981,7 +5122,7 @@ the head of the former is a substitution instance of the latter. For example substituting <literal>a:=Int</literal>. </para> <para> -However, GHC is conservative about committing to an overlapping instance. For example: +GHC is conservative about committing to an overlapping instance. For example: <programlisting> f :: [b] -> [b] f x = ...
@@ -5078,56 +5219,6 @@ the program prints would be to reject module <literal>Help</literal> on the grounds that a later instance declaration might overlap the local one.) </para> -<para> -The willingness to be overlapped or incoherent is a property of -the <emphasis>instance declaration</emphasis> itself, controlled by the -presence or otherwise of the <option>-XOverlappingInstances</option> -and <option>-XIncoherentInstances</option> flags when that module is -being defined. Suppose we are searching for an instance of the -<emphasis>target constraint</emphasis> <literal>(C ty1 .. tyn)</literal>. -The search works like this. -<itemizedlist> -<listitem><para> -Find all instances I that <emphasis>match</emphasis> the target constraint; -that is, the target constraint is a substitution instance of I. These -instance declarations are the <emphasis>candidates</emphasis>. -</para></listitem> - -<listitem><para> -Find all <emphasis>non-candidate</emphasis> instances -that <emphasis>unify</emphasis> with the target constraint. -Such non-candidates instances might match when the target constraint is further -instantiated. If all of them were compiled with -<option>-XIncoherentInstances</option>, proceed; if not, the search fails. -</para></listitem> - -<listitem><para> -Eliminate any candidate IX for which both of the following hold: - -<itemizedlist> -<listitem><para>There is another candidate IY that is strictly more specific; -that is, IY is a substitution instance of IX but not vice versa. -</para></listitem> -<listitem><para>Either IX or IY was compiled with -<option>-XOverlappingInstances</option>. -</para></listitem> -</itemizedlist> - -</para></listitem> - -<listitem><para> -If only one candidate remains, pick it. -Otherwise if all remaining candidates were compiled with -<option>-XInccoherentInstances</option>, pick an arbitrary candidate. 
-</para></listitem> - -</itemizedlist> -These rules make it possible for a library author to design a library that relies on -overlapping instances without the library client having to know. -</para> -<para>The <option>-XIncoherentInstances</option> flag implies the -<option>-XOverlappingInstances</option> flag, but not vice versa. -</para> </sect3> <sect3 id="instance-sigs"> @@ -5201,21 +5292,30 @@ it explicitly (for example, to give an instance declaration for it), you can imp from module <literal>GHC.Exts</literal>. </para> <para> -Haskell's defaulting mechanism is extended to cover string literals, when <option>-XOverloadedStrings</option> is specified. +Haskell's defaulting mechanism (<ulink url="http://www.haskell.org/onlinereport/decls.html#sect4.3.4">Haskell Report, Section 4.3.4</ulink>) +is extended to cover string literals, when <option>-XOverloadedStrings</option> is specified. Specifically: <itemizedlist> <listitem><para> -Each type in a default declaration must be an +Each type in a <literal>default</literal> declaration must be an instance of <literal>Num</literal> <emphasis>or</emphasis> of <literal>IsString</literal>. </para></listitem> <listitem><para> -The standard defaulting rule (<ulink url="http://www.haskell.org/onlinereport/decls.html#sect4.3.4">Haskell Report, Section 4.3.4</ulink>) +If no <literal>default</literal> declaration is given, then it is just as if the module +contained the declaration <literal>default( Integer, Double, String)</literal>. +</para></listitem> + +<listitem><para> +The standard defaulting rule is extended thus: defaulting applies when all the unresolved constraints involve standard classes <emphasis>or</emphasis> <literal>IsString</literal>; and at least one is a numeric class <emphasis>or</emphasis> <literal>IsString</literal>. 
</para></listitem> </itemizedlist> +So, for example, the expression <literal>length "foo"</literal> will give rise +to an ambiguous use of <literal>IsString a0</literal> which, because of the above +rules, will default to <literal>String</literal>. </para> <para> A small example: @@ -5942,28 +6042,39 @@ instance (GMapKey a, GMapKey b) => GMapKey (Either a b) where data GMap (Either a b) v = GMapEither (GMap a v) (GMap b v) ... -instance (Eq (Elem [e])) => Collects ([e]) where +instance Eq (Elem [e]) => Collects [e] where type Elem [e] = e ... </programlisting> - The most important point about associated family instances is that the - type indexes corresponding to class parameters must be identical to - the type given in the instance head; here this is the first argument - of <literal>GMap</literal>, namely <literal>Either a b</literal>, - which coincides with the only class parameter. - </para> - <para> - Instances for an associated family can only appear as part of - instance declarations of the class in which the family was declared - - just as with the equations of the methods of a class. Also in - correspondence to how methods are handled, declarations of associated - types can be omitted in class instances. If an associated family - instance is omitted, the corresponding instance type is not inhabited; +Note the following points: +<itemizedlist> +<listitem><para> + The type indexes corresponding to class parameters must have precisely the same shape + as the type given in the instance head. To have the same "shape" means that + the two types are identical modulo renaming of type variables.
For example: +<programlisting> +instance Eq (Elem [e]) => Collects [e] where + -- Choose one of the following alternatives: + type Elem [e] = e -- OK + type Elem [x] = x -- OK + type Elem x = x -- BAD; shape of 'x' is different to '[e]' + type Elem [Maybe x] = x -- BAD: shape of '[Maybe x]' is different to '[e]' +</programlisting> +</para></listitem> +<listitem><para> + An instance for an associated family can only appear as part of + an instance declaration of the class in which the family was declared, + just as with the equations of the methods of a class. +</para></listitem> +<listitem><para> + The instance for an associated type can be omitted in class instances. In that case, + unless there is a default instance (see <xref linkend="assoc-decl-defs"/>), + the corresponding instance type is not inhabited; i.e., only diverging expressions, such as <literal>undefined</literal>, can assume the type. - </para> - <para> - Although it is unusual, there can be <emphasis>multiple</emphasis> +</para></listitem> +<listitem><para> + Although it is unusual, there can (currently) be <emphasis>multiple</emphasis> instances for an associated family in a single instance declaration. For example, this is legitimate: <programlisting> @@ -5977,8 +6088,10 @@ instance GMapKey Flob where Since you cannot give any <emphasis>subsequent</emphasis> instances for <literal>(GMap Flob ...)</literal>, this facility is most useful when the free indexed parameter is of a kind with a finite number of alternatives - (unlike <literal>*</literal>). - </para> + (unlike <literal>*</literal>). WARNING: this facility may be withdrawn in the future. +</para></listitem> +</itemizedlist> +</para> </sect3> <sect3 id="assoc-decl-defs"> <title>Associated type synonym defaults</title> @@ -5996,22 +6109,50 @@ class IsBoolMap v where instance IsBoolMap [(Int, Bool)] where lookupKey = lookup </programlisting> -The <literal>instance</literal> keyword is optional.
- </para> +In an <literal>instance</literal> declaration for the class, if no explicit +<literal>type instance</literal> declaration is given for the associated type, the default declaration +is used instead, just as with default class methods. +</para> <para> -There can also be multiple defaults for a single type, as long as they do not -overlap: +Note the following points: +<itemizedlist> +<listitem><para> + The <literal>instance</literal> keyword is optional. +</para></listitem> +<listitem><para> + There can be at most one default declaration for an associated type synonym. +</para></listitem> +<listitem><para> + A default declaration is not permitted for an associated + <emphasis>data</emphasis> type. +</para></listitem> +<listitem><para> + The default declaration must mention only type <emphasis>variables</emphasis> on the left hand side, + and the right hand side must mention only type variables bound on the left hand side. + However, unlike the associated type family declaration itself, + the type variables of the default instance are independent of those of the parent class. +</para></listitem> +</itemizedlist> +Here are some examples: <programlisting> -class C a where - type F a b - type F a Int = Bool - type F a Bool = Int + class C a where + type F1 a :: * + type instance F1 a = [a] -- OK + type instance F1 a = a->a -- BAD; only one default instance is allowed + + type F2 b a -- OK; note the family has more type + -- variables than the class + type instance F2 c d = c->d -- OK; you don't have to use 'a' in the type instance + + type F3 a + type F3 [b] = b -- BAD; only type variables allowed on the LHS + + type F4 a + type F4 b = a -- BAD; 'a' is not in scope in the RHS </programlisting> +</para> -A default declaration is not permitted for an associated -<emphasis>data</emphasis> type.
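A runnable sketch of the default mechanism (the class and instances below are invented, modelled on the <literal>IsBoolMap</literal> example above):

```haskell
{-# LANGUAGE TypeFamilies, FlexibleInstances #-}

class IsMap m where
  type MapKey m
  type MapKey m = Int     -- default, used when an instance omits the declaration
  lookupIn :: MapKey m -> m -> Maybe Bool

-- No 'type' declaration here, so MapKey [(Int, Bool)] defaults to Int.
instance IsMap [(Int, Bool)] where
  lookupIn = lookup

-- This instance overrides the default.
instance IsMap [(String, Bool)] where
  type MapKey [(String, Bool)] = String
  lookupIn = lookup

main :: IO ()
main = do
  print (lookupIn (1 :: Int) ([(1, True), (2, False)] :: [(Int, Bool)]))
  print (lookupIn "b" [("a", True), ("b", False)])
```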
- </para> - </sect3> +</sect3> <sect3 id="scoping-class-params"> <title>Scoping of class parameters</title> @@ -8039,7 +8180,7 @@ scope over the methods defined in the <literal>where</literal> part. For exampl of the Haskell Report) can be completely switched off by <option>-XNoMonomorphismRestriction</option>. Since GHC 7.8.1, the monomorphism -restriction is switched off by default in GHCi. +restriction is switched off by default in GHCi's interactive options (see <xref linkend="ghci-interactive-options"/>). </para> </sect3> @@ -8112,12 +8253,30 @@ pattern binding must have the same context. For example, this is fine: <para> An ML-style language usually generalises the type of any let-bound or where-bound variable, so that it is as polymorphic as possible. -With the flag <option>-XMonoLocalBinds</option> GHC implements a slightly more conservative policy: -<emphasis>it generalises only "closed" bindings</emphasis>. -A binding is considered "closed" if either +With the flag <option>-XMonoLocalBinds</option> GHC implements a slightly more conservative policy, +using the following rules: <itemizedlist> -<listitem><para>It is one of the top-level bindings of a module, or </para></listitem> -<listitem><para>Its free variables are all themselves closed</para></listitem> + <listitem><para> + A variable is <emphasis>closed</emphasis> if and only if + <itemizedlist> + <listitem><para> the variable is let-bound</para></listitem> + <listitem><para> one of the following holds: + <itemizedlist> + <listitem><para>the variable has an explicit type signature that has no free type variables, or</para></listitem> + <listitem><para>its binding group is fully generalised (see next bullet) </para></listitem> + </itemizedlist> + </para></listitem> + </itemizedlist> + </para></listitem> + + <listitem><para> + A binding group is <emphasis>fully generalised</emphasis> if and only if + <itemizedlist> + <listitem><para>each of its free variables is either imported or closed, 
and</para></listitem> + <listitem><para>the binding is not affected by the monomorphism restriction + (<ulink url="http://www.haskell.org/onlinereport/decls.html#sect4.5.5">Haskell Report, Section 4.5.5</ulink>)</para></listitem> + </itemizedlist> + </para></listitem> </itemizedlist> For example, consider <programlisting> @@ -8126,15 +8285,18 @@ g x = let h y = f y * 2 k z = z+x in h x + k x </programlisting> -Here <literal>f</literal> and <literal>g</literal> are closed because they are bound at top level. -Also <literal>h</literal> is closed because its only free variable <literal>f</literal> is closed. -But <literal>k</literal> is not closed because it mentions <literal>x</literal> which is locally bound. -Another way to think of it is this: all closed bindings <literal>could</literal> be defined at top level. -(In the example, we could move <literal>h</literal> to top level.) -</para><para> -All of this applies only to bindings that lack an explicit type signature, so that GHC has to -infer its type. If you supply a type signature, then that fixes type of the binding, end of story. -</para><para> +Here <literal>f</literal> is generalised because it has no free variables; and its binding group +is unaffected by the monomorphism restriction; and hence <literal>f</literal> is closed. +The same reasoning applies to <literal>g</literal>, except that it has one closed free variable, namely <literal>f</literal>. +Similarly <literal>h</literal> is closed, <emphasis>even though it is not bound at top level</emphasis>, +because its only free variable <literal>f</literal> is closed. +But <literal>k</literal> is not closed, because it mentions <literal>x</literal> which is not closed (because it is not let-bound). +</para> +<para> +Notice that a top-level binding that is affected by the monomorphism restriction is not closed, and hence may +in turn prevent generalisation of bindings that mention it. 
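Filling the example above out into a runnable form (the definition of <literal>f</literal> is invented), plus a closed local binding that is still generalised under <option>-XMonoLocalBinds</option>:

```haskell
{-# LANGUAGE MonoLocalBinds #-}

f :: Int -> Int
f = (+ 1)

g :: Int -> Int
g x = let h y = f y * 2   -- closed: its only free variable, f, is closed
          k z = z + x     -- not closed: x is lambda-bound, not let-bound
      in h x + k x

-- dbl is closed (no free variables, and the monomorphism restriction
-- does not apply to a function binding), so it is generalised and can
-- be used at both Int and Double even with -XMonoLocalBinds on.
poly :: (Int, Double)
poly = let dbl y = y + y
       in (dbl 3, dbl 3.0)

main :: IO ()
main = print (g 3, poly)
```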
+</para> +<para> The rationale for this more conservative strategy is given in <ulink url="http://research.microsoft.com/~simonpj/papers/constraints/index.htm">the papers</ulink> "Let should not be generalised" and "Modular type inference with local assumptions", and a related <ulink url="http://ghc.haskell.org/trac/ghc/blog/LetGeneralisationInGhc7">blog post</ulink>. @@ -10885,8 +11047,8 @@ not be substituted, and the rule would not fire. </sect2> -<sect2 id="conlike"> -<title>How rules interact with INLINE/NOINLINE and CONLIKE pragmas</title> +<sect2 id="rules-inline"> +<title>How rules interact with INLINE/NOINLINE pragmas</title> <para> Ordinary inlining happens at the same time as rule rewriting, which may lead to unexpected @@ -10912,7 +11074,14 @@ would have been a better chance that <literal>f</literal>'s RULE might fire. The way to get predictable behaviour is to use a NOINLINE pragma, or an INLINE[<replaceable>phase</replaceable>] pragma, on <literal>f</literal>, to ensure that it is not inlined until its RULEs have had a chance to fire. +The warning flag <option>-fwarn-inline-rule-shadowing</option> (see <xref linkend="options-sanity"/>) +warns about this situation. </para> +</sect2> + +<sect2 id="conlike"> +<title>How rules interact with CONLIKE pragmas</title> + <para> GHC is very cautious about duplicating work. For example, consider <programlisting> @@ -11257,69 +11426,6 @@ program even if fusion doesn't happen. More rules in <filename>GHC/List.lhs</fi </sect2> -<sect2 id="core-pragma"> - <title>CORE pragma</title> - - <indexterm><primary>CORE pragma</primary></indexterm> - <indexterm><primary>pragma, CORE</primary></indexterm> - <indexterm><primary>core, annotation</primary></indexterm> - -<para> - The external core format supports <quote>Note</quote> annotations; - the <literal>CORE</literal> pragma gives a way to specify what these - should be in your Haskell source code. 
Syntactically, core - annotations are attached to expressions and take a Haskell string - literal as an argument. The following function definition shows an - example: - -<programlisting> -f x = ({-# CORE "foo" #-} show) ({-# CORE "bar" #-} x) -</programlisting> - - Semantically, this is equivalent to: - -<programlisting> -g x = show x -</programlisting> -</para> - -<para> - However, when external core is generated (via - <option>-fext-core</option>), there will be Notes attached to the - expressions <function>show</function> and <varname>x</varname>. - The core function declaration for <function>f</function> is: -</para> - -<programlisting> - f :: %forall a . GHCziShow.ZCTShow a -> - a -> GHCziBase.ZMZN GHCziBase.Char = - \ @ a (zddShow::GHCziShow.ZCTShow a) (eta::a) -> - (%note "foo" - %case zddShow %of (tpl::GHCziShow.ZCTShow a) - {GHCziShow.ZCDShow - (tpl1::GHCziBase.Int -> - a -> - GHCziBase.ZMZN GHCziBase.Char -> GHCziBase.ZMZN GHCziBase.Cha -r) - (tpl2::a -> GHCziBase.ZMZN GHCziBase.Char) - (tpl3::GHCziBase.ZMZN a -> - GHCziBase.ZMZN GHCziBase.Char -> GHCziBase.ZMZN GHCziBase.Cha -r) -> - tpl2}) - (%note "bar" - eta); -</programlisting> - -<para> - Here, we can see that the function <function>show</function> (which - has been expanded out to a case expression over the Show dictionary) - has a <literal>%note</literal> attached to it, as does the - expression <varname>eta</varname> (which used to be called - <varname>x</varname>). 
-</para> - -</sect2> - </sect1> <sect1 id="special-ids"> @@ -11613,7 +11719,7 @@ described in <ulink url="http://www.seas.upenn.edu/~sweirich/papers/popl163af-weirich.pdf">Generative type abstraction and type-level computation</ulink>, published at POPL 2011.</para> -<sect2> +<sect2 id="nominal-representational-and-phantom"> <title>Nominal, Representational, and Phantom</title> <para>The goal of the roles system is to track when two types have the same @@ -11670,7 +11776,7 @@ are unrelated.</para> </sect2> -<sect2> +<sect2 id="role-inference"> <title>Role inference</title> <para> @@ -11724,7 +11830,7 @@ but role nominal for <literal>b</literal>.</para> </sect2> -<sect2> +<sect2 id="role-annotations"> <title>Role annotations <indexterm><primary>-XRoleAnnotations</primary></indexterm> </title> diff --git a/docs/users_guide/gone_wrong.xml b/docs/users_guide/gone_wrong.xml index 114b06cfd6..bb5fcb0d4e 100644 --- a/docs/users_guide/gone_wrong.xml +++ b/docs/users_guide/gone_wrong.xml @@ -146,7 +146,7 @@ <emphasis>must</emphasis> be re-compiled.</para> <para>A useful option to alert you when interfaces change is - <option>-hi-diffs</option><indexterm><primary>-hi-diffs + <option>-ddump-hi-diffs</option><indexterm><primary>-ddump-hi-diffs option</primary></indexterm>. It will run <command>diff</command> on the changed interface file, before and after, when applicable.</para> @@ -167,7 +167,7 @@ <screen> % rm *.o # scrub your object files -% make my_prog # re-make your program; use -hi-diffs to highlight changes; +% make my_prog # re-make your program; use -ddump-hi-diffs to highlight changes; # as mentioned above, use -dcore-lint to be more paranoid % ./my_prog ... # retry... 
</screen> diff --git a/docs/users_guide/phases.xml b/docs/users_guide/phases.xml index db32f38870..8a5589acda 100644 --- a/docs/users_guide/phases.xml +++ b/docs/users_guide/phases.xml @@ -576,8 +576,22 @@ $ cat foo.hspp</screen> </term> <listitem> <para>Omit code generation (and all later phases) - altogether. Might be of some use if you just want to see - dumps of the intermediate compilation phases.</para> + altogether. This is useful if you're only interested in + type checking code.</para> + </listitem> + </varlistentry> + + <varlistentry> + <term> + <option>-fwrite-interface</option> + <indexterm><primary><option>-fwrite-interface</option></primary></indexterm> + </term> + <listitem> + <para>Always write interface files. GHC will normally write + interface files automatically, but this flag is useful with + <option>-fno-code</option>, which normally suppresses generation + of interface files. This is useful if you want to type check + over multiple runs of GHC without compiling dependencies.</para> </listitem> </varlistentry> diff --git a/docs/users_guide/ug-book.xml.in b/docs/users_guide/ug-book.xml.in index dc5d4f7c35..b87563ac3b 100644 --- a/docs/users_guide/ug-book.xml.in +++ b/docs/users_guide/ug-book.xml.in @@ -17,7 +17,6 @@ &lang-features; &ffi-chap; &extending-ghc; -&external-core; &wrong; &utils; &win32-dll; diff --git a/docs/users_guide/ug-ent.xml.in b/docs/users_guide/ug-ent.xml.in index ce87089f24..6753ff7e5b 100644 --- a/docs/users_guide/ug-ent.xml.in +++ b/docs/users_guide/ug-ent.xml.in @@ -3,7 +3,7 @@ <!ENTITY flags SYSTEM "flags.xml"> <!ENTITY license SYSTEM "license.xml"> <!ENTITY intro SYSTEM "intro.xml" > -<!ENTITY relnotes1 SYSTEM "7.8.1-notes.xml" > +<!ENTITY relnotes1 SYSTEM "7.10.1-notes.xml" > <!ENTITY using SYSTEM "using.xml" > <!ENTITY code-gens SYSTEM "codegens.xml" > <!ENTITY runtime SYSTEM "runtime_control.xml" > @@ -12,7 +12,6 @@ <!ENTITY sooner SYSTEM "sooner.xml" > <!ENTITY lang-features SYSTEM "lang.xml" > <!ENTITY glasgowexts 
SYSTEM "glasgow_exts.xml" > -<!ENTITY external-core SYSTEM "external_core.xml" > <!ENTITY packages SYSTEM "packages.xml" > <!ENTITY parallel SYSTEM "parallel.xml" > <!ENTITY safehaskell SYSTEM "safe_haskell.xml" > diff --git a/docs/users_guide/using.xml b/docs/users_guide/using.xml index 8d8211eb5a..921d5a3345 100644 --- a/docs/users_guide/using.xml +++ b/docs/users_guide/using.xml @@ -899,20 +899,37 @@ ghci> :set -fprint-explicit-foralls ghci> :t f f :: forall a. a -> a </screen> - Using <option>-fprint-explicit-kinds</option> makes GHC print kind-foralls and kind applications +However, regardless of the flag setting, the quantifiers are printed under these circumstances: +<itemizedlist> +<listitem><para>For nested <literal>foralls</literal>, e.g. +<screen> +ghci> :t GHC.ST.runST +GHC.ST.runST :: (forall s. GHC.ST.ST s a) -> a +</screen> +</para></listitem> +<listitem><para>If any of the quantified type variables has a kind +that mentions a kind variable, e.g. +<screen> +ghci> :i Data.Coerce.coerce +coerce :: + forall (k :: BOX) (a :: k) (b :: k). Coercible a b => a -> b + -- Defined in GHC.Prim +</screen> +</para></listitem> +</itemizedlist> + </para> + <para> + Using <option>-fprint-explicit-kinds</option> makes GHC print kind arguments in types, which are normally suppressed. This can be important when you are using kind polymorphism. For example: <screen> ghci> :set -XPolyKinds ghci> data T a = MkT ghci> :t MkT -MkT :: T b +MkT :: forall (k :: BOX) (a :: k). T a -ghci> :set -fprint-explicit-foralls +ghci> :set -fprint-explicit-kinds ghci> :t MkT -MkT :: forall (b::k). T b -ghci> :set -fprint-explicit-kinds -ghci> :t MkT -MkT :: forall (k::BOX) (b:k). T b +MkT :: forall (k :: BOX) (a :: k). T k a </screen> </para> </listitem> @@ -1719,15 +1736,50 @@ f "2" = 2 <indexterm><primary>unused binds, warning</primary></indexterm> <indexterm><primary>binds, unused</primary></indexterm> <para>Report any function definitions (and local bindings) - which are unused.
For top-level functions, the warning is - only given if the binding is not exported.</para> - <para>A definition is regarded as "used" if (a) it is exported, or (b) it is - mentioned in the right hand side of another definition that is used, or (c) the - function it defines begins with an underscore. The last case provides a - way to suppress unused-binding warnings selectively. </para> - <para> Notice that a variable - is reported as unused even if it appears in the right-hand side of another - unused binding. </para> + which are unused. More precisely: + + <itemizedlist> + <listitem><para>Warn if a binding brings into scope a variable that is not used, + except if the variable's name starts with an underscore. The "starts-with-underscore" + condition provides a way to selectively disable the warning. + </para> + <para> + A variable is regarded as "used" if + <itemizedlist> + <listitem><para>It is exported, or</para></listitem> + <listitem><para>It appears in the right hand side of a binding that binds at + least one variable that is used</para></listitem> + </itemizedlist> + For example + <programlisting> +module A (f) where +f = let (p,q) = rhs1 in t p -- Warning about unused q +t = rhs3 -- No warning: f is used, and hence so is t +g = h x -- Warning: g unused +h = rhs2 -- Warning: h is only used in the right-hand side of another unused binding +_w = True -- No warning: _w starts with an underscore + </programlisting> + </para></listitem> + + <listitem><para> + Warn if a pattern binding binds no variables at all, unless it is a lone, possibly-banged, wild-card pattern.
+ For example: + <programlisting> +Just _ = rhs3 -- Warning: unused pattern binding +(_, _) = rhs4 -- Warning: unused pattern binding +_ = rhs3 -- No warning: lone wild-card pattern +!_ = rhs4 -- No warning: banged wild-card pattern; behaves like seq + </programlisting> + The motivation for allowing lone wild-card patterns is that they + are not very different from <literal>_v = rhs3</literal>, + which elicits no warning; and they can be useful to add a type + constraint, e.g. <literal>_ = x::Int</literal>. A lone + banged wild-card pattern is useful as an alternative + (to <literal>seq</literal>) way to force evaluation. + </para> + </listitem> + </itemizedlist> + </para> </listitem> </varlistentry> @@ -1814,6 +1866,16 @@ f "2" = 2 </listitem> </varlistentry> + <varlistentry> + <term><option>-fwarn-inline-rule-shadowing</option>:</term> + <listitem> + <indexterm><primary><option>-fwarn-inline-rule-shadowing</option></primary></indexterm> + <para>Warn if a rewrite RULE might fail to fire because the function might be + inlined before the rule has a chance to fire. See <xref linkend="rules-inline"/>. + </para> + </listitem> + </varlistentry> + </variablelist> <para>If you're feeling really paranoid, the @@ -2967,44 +3029,6 @@ data D = D !C </sect1> &runtime; - -<sect1 id="ext-core"> - <title>Generating and compiling External Core Files</title> - - <indexterm><primary>intermediate code generation</primary></indexterm> - - <para>GHC can dump its optimized intermediate code (said to be in “Core” format) - to a file as a side-effect of compilation. Non-GHC back-end tools can read and process Core files; these files have the suffix - <filename>.hcr</filename>.
The Core format is described in <ulink url="../../core.pdf"> - <citetitle>An External Representation for the GHC Core Language</citetitle></ulink>, - and sample tools - for manipulating Core files (in Haskell) are available in the - <ulink url="http://hackage.haskell.org/package/extcore">extcore package on Hackage</ulink>. Note that the format of <literal>.hcr</literal> - files is <emphasis>different</emphasis> from the Core output format that GHC generates - for debugging purposes (<xref linkend="options-debugging"/>), though the two formats appear somewhat similar.</para> - - <para>The Core format natively supports notes which you can add to - your source code using the <literal>CORE</literal> pragma (see <xref - linkend="pragmas"/>).</para> - - <variablelist> - - <varlistentry> - <term> - <option>-fext-core</option> - <indexterm><primary><option>-fext-core</option></primary></indexterm> - </term> - <listitem> - <para>Generate <literal>.hcr</literal> files.</para> - </listitem> - </varlistentry> - - </variablelist> - -<para>Currently (as of version 6.8.2), GHC does not have the ability to read in External Core files as source. If you would like GHC to have this ability, please <ulink url="http://ghc.haskell.org/trac/ghc/wiki/MailingListsAndIRC">make your wishes known to the GHC Team</ulink>.</para> - -</sect1> - &debug; &flags;
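The new <option>-fwarn-unused-binds</option> text above says a lone banged wild-card pattern "behaves like <literal>seq</literal>". As a hedged aside, the contrast between a plain and a banged wild-card binding can be sketched in ordinary Haskell; the names <literal>lazyOk</literal> and <literal>strictBoom</literal> are invented for illustration and do not come from the documentation:

```haskell
{-# LANGUAGE BangPatterns #-}
module Main where

import Control.Exception (ErrorCall, evaluate, try)

-- A plain lone wild-card binding is lazy: its right-hand side is
-- never demanded, so the embedded error is harmless.
lazyOk :: Int
lazyOk = let _ = error "never forced" in 42

-- A banged wild-card binding behaves like seq: forcing the body of
-- the let also forces the right-hand side, so the error escapes.
strictBoom :: Int
strictBoom = let !_ = error "forced" in 42

main :: IO ()
main = do
  print lazyOk  -- the plain wildcard never demands its right-hand side
  r <- try (evaluate strictBoom) :: IO (Either ErrorCall Int)
  putStrLn (case r of
    Left _  -> "banged wild-card forced the error, like seq"
    Right _ -> "unexpected: right-hand side was not forced")
```

This is only a sketch of the behaviour the warning text describes; the point is that `!_ = rhs` is a deliberate request for evaluation, which is why it is exempt from the unused-pattern-binding warning.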