author    Alejandro Serrano <trupill@gmail.com>  2015-08-03 11:45:22 +0200
committer Alejandro Serrano <trupill@gmail.com>  2015-08-03 11:45:22 +0200
commit    6f59156da0d397c48b3444334731bcd2dde4a799 (patch)
tree      dedbcbd98104bb9916fa014a29dcc7bb7df32026
parent    b3154e7d34986fbba403ed00c4ea6b694c220107 (diff)

Revamp part of the impredicativity document

 docs/types/impredicativity.ltx | 233
 1 file changed, 158 insertions(+), 75 deletions(-)
diff --git a/docs/types/impredicativity.ltx b/docs/types/impredicativity.ltx
index 579dd4f851..63e5a68c17 100644
--- a/docs/types/impredicativity.ltx
+++ b/docs/types/impredicativity.ltx
@@ -74,7 +74,7 @@ Family-free types & $\xi$ & $\coloncolonequals$ & $\alpha \alt a \alt \xi_1 \to
\label{types}
\end{figure}
-\section*{Changes to constraint solving}
+\section{Changes to constraint solving}
\begin{figure}
$$
@@ -112,25 +112,32 @@ $$
A subset of the canonicalization rules is given in Figures \ref{fig:sim} and \ref{fig:leq}. The only missing ones are those related to flattening of constraints.
-\subsubsection*{Notes on the {\sc [$\leq$lcon]} and {\sc [$\leq$l$\forall$]} rules}
+\subsection{Canonicalization, variables and type families}
+\paragraph{Unification variables.} We disallow applying canonicalization when a unification variable appears on the right-hand side. If we did, and that variable was later substituted by some other type, we would need to remember the instantiation done by this rule and apply it to the substituted value. Instead, we prefer to defer the instantiation of the constraint until the variable is substituted by another type. The same reasoning leads us to disallow the rule when the right-hand side is a type family.
+
+For the left-hand side, we always look at a constraint $\alpha \leq \sigma$ as $\alpha \sim \sigma$. This conveys the following philosophy of impredicativity in our system:
\begin{itemize}
-\item We disallow applying these rules in the case of unification variables in the right-hand side. If we did so, and later that variable was substituted by some other type, we would need to remember the instantiation done by this rule and apply it to the substituted value. Instead, we prefer to defer the instantiation of the constraint until the variable is changed to another type. The same reasoning applies to disallow application of the rule when the right-hand side is a type family.
+\item The system is impredicative in the sense that a polymorphic function may be instantiated at a polytype. Or more operationally, a unification variable may be unified with a polytype.
+\item But it does not follow that after being filled in with a polytype, a unification variable can then be instantiated.
+\end{itemize}
+Instead of producing $\alpha \leq \sigma$ constraints which are later changed to $\alpha \sim \sigma$ and could be affected by the flow of unification, canonicalization, together with generation and propagation, works hard to ensure that no constraint $\alpha \leq \sigma$, with $\alpha$ a unification variable, is ever produced.
-Prior, we disallowed application of the rules to any variable in the right-hand side, unification or Skolem. However, when one tries to type check the following code:
+\paragraph{Skolem variables.} Previously, we disallowed application of the rules to any variable on the right-hand side, unification or Skolem. However, when one tries to type check this code:
\begin{verbatim}
g :: Monoid b => b
g = mempty
\end{verbatim}
the only constraint generated is $(\forall\, a.\, \textit{Monoid} \; a \Rightarrow a) \leq b$. To make progress, we need to instantiate. In this case it is safe to do so, since $b$ is never going to be unified.
-\item We also disallow applying the rules to quantified types. In that case, we want to apply {\sc [$\leq$r$\forall$]} first, which may in turn lead to an application of the other rules inside the implication.
+\paragraph{Type families.} There is one caveat in our discussion of unification variables: it would still be possible for $\alpha \leq \sigma$ to be produced from a constraint $\mathsf{F} \; \alpha \leq \sigma$ involving a type family which resolves to a variable, $\mathsf{F} \; \alpha \sim \alpha$. Our solution is to treat type families as unification variables in this case. Thus, we do not generate $\mathsf{F} \; \overline{\mu} \leq \sigma$ either; we use $\sim$ directly.
-\item Previously, these rules along with generation and propagation worked very hard to ensure that no constraint $\alpha \leq \sigma$, with $\alpha$ a unification variable, was ever produced.
+There is another important design choice regarding type families: whether they are allowed to resolve to $\sigma$-types or just to monomorphic types. At first sight, it seems that there is no reason to restrict type families. We could think of a family such that $\mathsf{F} \; a \sim a \to a$ which, applied to $\forall b. \, b$, gives rise to the constraint $\mathsf{F} \; (\forall b. \, b) \sim (\forall b. \, b) \to (\forall b. \, b)$. This means that we should not instantiate when we find a type family on the right-hand side, either.
-Nonetheless, there was still the chance that $\alpha \leq \sigma$ is produced from a constraint involving a type family $\mathsf{F} \; \alpha \leq \sigma$ whose family resolves to a variable $\mathsf{F} \; \alpha \sim \alpha$. Since we do not want to loose confluence, we ensure that this resolves to $\alpha \sim \sigma$ regardless of the order of application of rule {\sc [$\leq$lcon]} and type family rewriting.
+In conclusion, we need to treat type families as unification variables in all circumstances involving $\leq$ constraints. We have introduced a new syntactic category, $\upsilon$ types, to record this distinction in both the grammar and the rules.
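For concreteness, the following sketch (the family {\tt F} and its instance are illustrative, not taken from the source) shows the restriction GHC already enforces: a type instance may resolve to a function type, but never to a polytype.

```haskell
{-# LANGUAGE TypeFamilies #-}

-- A family that resolves to a function type, as in the text:
type family F a
type instance F Bool = Bool -> Bool

-- GHC already rejects polytypes on the right-hand side of an instance:
--   type instance F Char = forall a. a -> a   -- illegal polymorphic type

-- F Bool reduces to Bool -> Bool, so 'not' inhabits it:
notF :: F Bool
notF = not
```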
-Furthermore, rule {\sc [$\leq$lcon]} is key in not getting stuck in some programs. Consider the following code, taken from the {\tt GHC.List} module:
+\paragraph{Examples where $\upsilon$ types matter.}
+The following examples show why it is important to generate $\sim$ constraints instead of $\leq$ ones when instantiating a $\forall$ on the left-hand side uncovers a unification variable or a type family. Consider the following code, taken from the {\tt GHC.List} module:
\begin{verbatim}
head :: [a] -> a
head (x:xs) = x
@@ -139,63 +146,62 @@ head [] = badHead
badHead :: b
badHead = error "..."
\end{verbatim}
-When type checking the second branch, we generate a constraint of the form $\forall b. b \leq a$. When we apply rule {\sc [$\leq$l$\forall$]}, we get a constraint $\beta \leq a$. If we could not apply {\sc [$\forall$lcon]} here, we would be stuck. In the current system, we resolve the constraint to $\beta \sim a$.
-
-The rule {\sc [$\leq$lcon]} applied to type variables in the left-hand side is also key in being able to type check the {\tt g = mempty} example above. Without it, we would rewrite $(\forall\, a.\, \textit{Monoid} \; a \Rightarrow a) \leq b$ to $\textit{Monoid} \; \alpha \wedge \alpha \leq b$. But this disallows progress, we want $\alpha \sim b$ instead.
-\end{itemize}
-
-\subsubsection*{Design choice: rules for $\to$}
+When type checking the second branch, we generate a constraint of the form $\forall b. b \leq a$. Suppose we applied a canonicalization rule which yielded $\beta \leq a$. We would be stuck, because neither $\beta$ nor $a$ will undergo any further unification. In the current system, we use the rule {\sc [$\leq$l$\forall\upsilon$r$\psi$]} and resolve the constraint to $\beta \sim a$.
-\noindent Additionally, we may have these special rules for $\to$, based on the covariance and contravariance of each of the argument positions:
-$$
-\begin{array}{lrcl}
-\textsc{[$\leq\to$alt1]} & canon\left[(\sigma_1 \to \sigma_2) \leq (\sigma_3 \to \sigma_4)\right] & = & \sigma_1 \leq \sigma_3 \wedge \sigma_2 \leq \sigma_4 \\
-\textsc{[$\leq\to$alt2]} & canon\left[(\sigma_1 \to \sigma_2) \leq (\sigma_3 \to \sigma_4)\right] & = & \sigma_1 \sim \sigma_3 \wedge \sigma_2 \leq \sigma_4
-\end{array}
-$$
-But it seems that we lose the ability to infer the type for {\tt runST \$ e}.
-
-\subsubsection*{Design choice: rules for type families}
+\
-\noindent There is important design choice regarding type families: whether they are allowed to resolve to $\sigma$-types or just to monomorphic types.
+\noindent The same operation is also key in being able to type check the following code:
+\begin{verbatim}
+g :: Monoid b => b
+g = mempty
+\end{verbatim}
+Without it, we would rewrite $(\forall\, a.\, \textit{Monoid} \; a \Rightarrow a) \leq b$ to $\textit{Monoid} \; \alpha \wedge \alpha \leq b$. But this blocks progress; we want $\alpha \sim b$ instead.
-At first sight, it seems that there is no reason to restrict type families. We could think of a family such that $\mathsf{F} \; a \sim a \to a$, which applied to $\forall b. \, b$, gives rise to the constraint $\mathsf{F} , (\forall b. b) \sim (\forall b. \, b) \to (\forall b. \, b)$. However, this means that we should not apply rule {\sc [$\leq$l$\forall$]} when faced with a type family application, since we do not yet know whether it may resolve to a $\sigma$-type. We need to wait, as we do with unification variables.
+\
-Not applying {\sc [$\leq$l$\forall$]} rule means that we cannot discharge some constraints that we would like to, though. Take the following code:
+\noindent This rule is important in checking some programs with type families. Take the following:
\begin{verbatim}
type family F a
f :: a -> F a
f x = undefined
\end{verbatim}
-The only constraint generated is $\forall b. \, b \leq \mathsf{F} \; a$. Given that we cannot canonicalize further, the code will be refused by the compiler with an {\tt Undischarged forall b.b <$\sim$ F a} error.
+The only constraint generated is $\forall b. \, b \leq \mathsf{F} \; a$. Usually, $\upsilon$ types on the right-hand side do not canonicalize further. Thus, the code would be refused by the compiler with an {\tt Undischarged forall b.b <$\sim$ F a} error. However, we have the rule {\sc [$\leq$l$\forall\upsilon$r$\upsilon$]}, which rewrites that constraint to $\beta \sim \mathsf{F} \; a$.
-Our solution is to apply {\sc [$\leq$l$\forall$]} also in the presence of type families, but restrict the shape of what a type family can resolve to so that a $\sigma$-type is never produced from type family rewriting. GHC already disallows writing declarations such as {\tt type instance G Bool = forall a.a -> a}, where quantified types appear in the right-hand side. The only missing part is restricting arguments to type families to be monomorphic types, something that we do in the grammar via a new syntactic category $\mu$.
+\subsection{Design choice: rules for $\to$}
-Note: there is still an unclear edge. We might have $\sigma \leq \mathsf{F} \; \alpha$ with $\mathsf{F} \; \alpha \sim \alpha$. In this case we might apply {\sc [$\leq$l$\forall$]} or not depending on whether we resolve type families before or after the canonicalization phase.
+\noindent Additionally, we may have these special rules for $\to$, based on the covariance and contravariance of each of the argument positions:
+$$
+\begin{array}{lrcl}
+\textsc{[$\leq\to$alt1]} & canon\left[(\sigma_1 \to \sigma_2) \leq (\sigma_3 \to \sigma_4)\right] & = & \sigma_1 \leq \sigma_3 \wedge \sigma_2 \leq \sigma_4 \\
+\textsc{[$\leq\to$alt2]} & canon\left[(\sigma_1 \to \sigma_2) \leq (\sigma_3 \to \sigma_4)\right] & = & \sigma_1 \sim \sigma_3 \wedge \sigma_2 \leq \sigma_4
+\end{array}
+$$
+But it seems that we lose the ability to infer the type for {\tt runST \$ e}.
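For reference, this is the kind of program whose inference we do not want to lose; the body of the {\tt ST} computation below is invented for illustration, but the shape {\tt runST \$ do ...} is exactly the motivating case.

```haskell
import Control.Monad.ST
import Data.STRef

-- ($) applied to the higher-rank function
-- runST :: (forall s. ST s a) -> a
sumST :: Int
sumST = runST $ do
  r <- newSTRef 0        -- mutable cell local to this ST computation
  modifySTRef r (+ 42)
  readSTRef r
```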
-\subsection*{Evidence generation for $\leq$}
+\subsection{Evidence generation for $\leq$}
In the constraint solving process we not only need to find a solution for the constraints, but also to generate evidence of the solving. Remember that the evidence for a constraint $\sigma_1 \leq \sigma_2$ is a function $\sigma_1 \to \sigma_2$.
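As a small illustration (the names here are ours, not from the source), evidence for an instantiation constraint really is just a coercion function in Haskell:

```haskell
{-# LANGUAGE RankNTypes #-}

-- Evidence for sigma1 <= sigma2 is a function sigma1 -> sigma2.
-- For (forall a. a -> a) <= (Int -> Int), the witness is instantiation:
w :: (forall a. a -> a) -> (Int -> Int)
w x = x   -- the type application at Int happens implicitly

applied :: Int
applied = w id 5
```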
-\paragraph{Rule {\sc [$\leq$refl]}.} We need to build $W :: \sigma \to \sigma$. This is easy:
-$$W = \lambda x. \; x$$
-
-\paragraph{Rule {\sc [$\leq$lcon]}.} We need to build $W_1 :: \sigma_1 \to \sigma_2$. For this, we can use the evidence $W_2 :: \sigma_1 \sim \sigma_2$ generated by later solving steps. In this case the solution is to make:
+\paragraph{Rule {\sc [$\leq$l$\tau$r$\tau$]}.} We need to build $W_1 :: \tau_1 \to \tau_2$. For this, we can use the evidence $W_2 :: \tau_1 \sim \tau_2$ generated by later solving steps. In this case the solution is to make:
$$W_1 = \lambda (x :: \sigma_1). \; x \rhd W_2$$
-where $\rhd$ is the cast operator which applies a coercion $\sigma_a \sim \sigma_b$ to a value of type $\sigma_a$ to produce the same value, but typed as $\sigma_b$.
+where $\rhd$ is the cast operator which applies a coercion $\tau_a \sim \tau_b$ to a value of type $\tau_a$ to produce the same value, but typed as $\tau_b$.
-\paragraph{Rule {\sc [$\leq$l$\forall$]}.} We need to build $W_1 :: (\forall \overline{a}. Q_1 \Rightarrow \sigma_1) \to \psi_2$ given $W_2 :: [\overline{a \mapsto \alpha}]\sigma_1 \to \psi_2$ and $W_3 :: [\overline{a \mapsto \alpha}]Q_1$. The first step is to get $[\overline{a \mapsto \alpha}]\sigma_1$ from $\forall \overline{a}. Q_1 \Rightarrow \sigma_1$, to do that we need to make a type application and afterwards apply the witness for $Q_1 \Rightarrow \sigma_1$:
-$$\lambda (x :: \forall \overline{a}. Q_1 \Rightarrow \sigma_1). \; x \; \overline{\alpha} \; W_3 :: (\forall \overline{a}. Q_1 \Rightarrow \sigma_1) \to [\overline{a \mapsto \alpha}]\sigma_1$$
+\paragraph{Rule {\sc [$\leq$l$\forall\psi$r$\psi$]}.} We need to build $W_1 :: (\forall \overline{a}. Q_1 \Rightarrow \psi^*_1) \to \psi_2$ given $W_2 :: [\overline{a \mapsto \alpha}]\psi^*_1 \to \psi_2$ and $W_3 :: [\overline{a \mapsto \alpha}]Q_1$. The first step is to get $[\overline{a \mapsto \alpha}]\psi^*_1$ from $\forall \overline{a}. Q_1 \Rightarrow \psi^*_1$; to do that, we make a type application and afterwards apply the witness $W_3$ for $Q_1$:
+$$\lambda (x :: \forall \overline{a}. Q_1 \Rightarrow \psi^*_1). \; x \; \overline{\alpha} \; W_3 :: (\forall \overline{a}. Q_1 \Rightarrow \psi^*_1) \to [\overline{a \mapsto \alpha}]\psi^*_1$$
The last step is then applying $W_2$ to convert it to our desired type:
-$$W_1 = \lambda (x :: \forall \overline{a}. Q_1 \Rightarrow \sigma_1). \; W_2 \; (x \; \overline{\alpha} \; W_3)$$
+$$W_1 = \lambda (x :: \forall \overline{a}. Q_1 \Rightarrow \psi^*_1). \; W_2 \; (x \; \overline{\alpha} \; W_3)$$
+
+\paragraph{Rules {\sc [$\leq$l$\forall\upsilon$r$\psi$]} and {\sc [$\leq$l$\forall\upsilon$r$\upsilon$]}.} These cases are very similar to the previous one. The only difference is that instead of evidence for another $\leq$ constraint, $W_2$ is now a coercion. In the most general case we get:
+$$W_1 = \lambda (x :: \forall \overline{a}. Q_1 \Rightarrow \upsilon_1). (x \; \overline{\alpha} \; W_3) \rhd W_2$$
+where in the case {\sc [$\leq$l$\forall\upsilon$r$\upsilon$]} there is no application of $W_3$.
\paragraph{Rule {\sc [$\leq$r$\forall$]}.} This is the most complicated rule for which to generate evidence. As usual, we want to generate evidence $W_1 :: \sigma_1 \to (\forall \overline{a}. Q_2 \Rightarrow \sigma_2)$. In order to do so, we can use the evidence generated by solving $\forall \overline{a}. \, (Q_2 \supset \sigma_1 \leq \sigma_2)$. In GHC, this implication generates two pieces of information: evidence $W_2 :: \sigma_1 \to \sigma_2$, and a series of bindings which might use the data in $Q_2$ and which make $W_2$ correct. We shall denote those bindings by $\square$.
In this case, we need to generate something whose type is $\forall \overline{a}. Q_2 \Rightarrow \dots$. Thus, we need first a series of type abstractions and evidence abstractions. Then, we can just apply $W_2$ remembering to bring the bindings into scope.
$$W_1 = \lambda (x :: \sigma_1). \; \overline{\Lambda a}. \; \lambda (d :: Q_2). \; \textsf{let} \; \square \; \textsf{in} \; W_2 \; x$$
-\subsection*{Instantiation of arguments of type families and classes}
+\subsection{Instantiation of arguments of type families and classes}
Suppose we want to typecheck the following piece of code:
\begin{verbatim}
@@ -232,7 +238,7 @@ Using the rule of always instantiating, the result of $\gamma \sim \mathsf{F} \;
For example, in the previous case we want to apply the axiom $\forall e. \mathtt{Eq} \; e \Rightarrow \mathtt{Eq} \; [e]$, and thus we need to instantiate $a$. We return as residual constraints $\mathtt{Eq} \; \xi \wedge \mathtt{Eq} \; \alpha \sim \mathtt{Eq} \; \mathrm{[}\xi\mathrm{]}$, and the solver takes care of the rest, that is, $\forall a. \mathrm{[}a\mathrm{]} \leq \mathrm{[}\xi\mathrm{]}$.
\end{enumerate}
-\subsection*{Generalization and defaulting}
+\subsection{Generalization and defaulting}
One nasty side-effect of this approach to impredicativity is that the solver may produce non-Haskell 2010 types. For example, when inferring the type of {\tt singleton id}, where $singleton :: \forall a. \, a \to [a]$ and $id :: \forall a. \, a \to a$, the result would be $\forall a. (\forall b. b \to b) \leq a \Rightarrow [a]$. In short, we want to get rid of the $\leq$ constraints once a binding has been found to type check. This process is part of a larger one which in GHC is known as \emph{approximation}.
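A sketch of the {\tt singleton id} situation (these definitions are ours, for illustration): the approximated, Haskell-2010 type is the one obtained without an annotation, while the impredicative list type requires asking for it explicitly.

```haskell
-- singleton :: forall a. a -> [a], as in the text
singleton :: a -> [a]
singleton x = [x]

-- Approximation yields the predicative type forall b. [b -> b],
-- not [forall a. a -> a] (the latter needs an explicit annotation):
inferred :: [b -> b]
inferred = singleton id
```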
@@ -267,17 +273,27 @@ $$
We prefer this option for the approximation phase in inference, since it leads to types which are more similar to those already inferred by GHC. Note that this approximation only applies to unannotated top-level bindings: the user can always ask to give $[\forall a. \, a \to a]$ as a type for {\tt singleton id} via an annotation.
\end{itemize}
-\section*{Constraint generation}
+\section{Constraint generation}
-\subsection*{Without propagation}
+\subsection{Without propagation}
In the $\Gamma \vdash e : \sigma \leadsto C$ judgement, $\Gamma$ and $e$ are inputs, whereas $\sigma$ and $C$ are outputs. The highlighted parts are changes with respect to the constraint generation judgement in the original {\sc OutsideIn(X)} system.
+As discussed in the canonicalization rules, we need to ensure that no constraint of the form $\alpha \leq \sigma$ or $\mathsf{F} \; \overline{\mu} \leq \sigma$ is produced. To that end, we introduce the following operation:
+$$
+\sigma_1 \preccurlyeq \sigma_2 =
+\begin{cases}
+\sigma_1 \sim \sigma_2 & \textrm{if } \sigma_1 \textrm{ is a } \upsilon \textrm{ type} \\
+\sigma_1 \leq \sigma_2 & \textrm{otherwise}
+\end{cases}
+$$
+With this operation and the canonicalization rules, we can guarantee that the $\leq$ constraints inside the system have only the form $\psi^*_1 \leq \sigma$.
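The case analysis above can be sketched as a toy model (all data types and names here are illustrative, not GHC's internals):

```haskell
-- A toy syntax: unification variables and family applications
-- are exactly the upsilon types.
data Ty = Skolem String       -- rigid variable
        | Uni String          -- unification variable (an upsilon type)
        | Fam String [Ty]     -- type family application (also upsilon)
        | Forall String Ty    -- quantified type
        | Fun Ty Ty
        deriving (Eq, Show)

data Ct = Ty :~: Ty | Ty :<=: Ty deriving (Eq, Show)

isUpsilon :: Ty -> Bool
isUpsilon (Uni _)   = True
isUpsilon (Fam _ _) = True
isUpsilon _         = False

-- The operation from the text: emit an equality when the left-hand
-- side is an upsilon type, an instantiation constraint otherwise.
prec :: Ty -> Ty -> Ct
prec s1 s2
  | isUpsilon s1 = s1 :~: s2
  | otherwise    = s1 :<=: s2
```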
+
\begin{prooftree}
\AxiomC{$\alpha$ fresh}
\AxiomC{$x : \sigma \in \Gamma$}
\RightLabel{\sc VarCon}
-\BinaryInfC{$\Gamma \vdash x : \alpha \leadsto \highlight{\sigma \leq \alpha}$}
+\BinaryInfC{$\Gamma \vdash x : \alpha \leadsto \highlight{\sigma \preccurlyeq \alpha}$}
\end{prooftree}
\begin{prooftree}
@@ -302,12 +318,12 @@ In the $\Gamma \vdash e : \sigma \leadsto C$ judgement, $\Gamma$ and $e$ are inp
\TrinaryInfC{$\Gamma \vdash e_1 \, e_2 : \alpha \leadsto C_1 \wedge C_2 \wedge \tau_1 \sim \tau_2 \to \alpha$}
\end{prooftree}
-\highlighttree{
-\begin{prooftree}
-\AxiomC{$\Gamma \vdash e : \tau_2 \leadsto C$}
-\RightLabel{\sc Annot}
-\UnaryInfC{$\Gamma \vdash (e :: \sigma_1) : \sigma_1 \leadsto C \wedge \tau_2 \leq \sigma_1$}
-\end{prooftree}}
+%\highlighttree{
+%\begin{prooftree}
+%\AxiomC{$\Gamma \vdash e : \tau_2 \leadsto C$}
+%\RightLabel{\sc Annot}
+%\UnaryInfC{$\Gamma \vdash (e :: \sigma_1) : \sigma_1 \leadsto C \wedge \tau_2 \leq \sigma_1$}
+%\end{prooftree}}
\begin{prooftree}
\AxiomC{$\Gamma \vdash e_1 : \tau_1 \leadsto C_1$}
@@ -343,7 +359,7 @@ $}
\UnaryInfC{$\Gamma \vdash \textsf{case } e \textsf{ of } \{ \overline{K_i \, \overline{x_i} \to u_i} \} : \beta \leadsto C \wedge \textsf{T} \, \overline{\gamma} \sim \tau \wedge \bigwedge C'_i$}
\end{prooftree}
-\subsection*{With propagation}
+\subsection{With propagation}
\noindent We use propagation to cover two main scenarios:
\begin{itemize}
@@ -362,7 +378,7 @@ f = \x. -> (x 1, x True)
\begin{prooftree}
\AxiomC{$x : \sigma_1 \in \Gamma$}
\RightLabel{\sc VarCon}
-\UnaryInfC{$\Gamma \vdash_\Downarrow x : \sigma_2 \leadsto \sigma_1 \leq \sigma_2$}
+\UnaryInfC{$\Gamma \vdash_\Downarrow x : \sigma_2 \leadsto \sigma_1 \preccurlyeq \sigma_2$}
\end{prooftree}
\begin{prooftree}
@@ -405,7 +421,7 @@ f = \x. -> (x 1, x True)
$\theta = [\overline{a \to \alpha}] \qquad
\Gamma \vdash_\Downarrow e_i : \theta(\sigma_i) \leadsto C_i$}
\RightLabel{\sc AppFun}
-\UnaryInfC{$\Gamma \vdash_\Downarrow f \, e_1 \, \dots \, e_n : \sigma \leadsto \bigwedge_i C_i \, \wedge \, \theta(Q) \, \wedge \, \theta(\sigma_r) \leq \sigma$}
+\UnaryInfC{$\Gamma \vdash_\Downarrow f \, e_1 \, \dots \, e_n : \sigma \leadsto \bigwedge_i C_i \, \wedge \, \theta(Q) \, \wedge \, \theta(\sigma_r) \preccurlyeq \sigma$}
\end{prooftree}
\begin{prooftree}
@@ -416,11 +432,11 @@ $\theta = [\overline{a \to \alpha}] \qquad
\TrinaryInfC{$\Gamma \vdash_\Downarrow e_1 \, e_2 : \sigma \leadsto C_1 \, \wedge \, C_2$}
\end{prooftree}
-\begin{prooftree}
-\AxiomC{$\Gamma \vdash_\Downarrow e : \sigma_1 \leadsto C$}
-\RightLabel{\sc Annot}
-\UnaryInfC{$\Gamma \vdash_\Downarrow (e :: \sigma_1) : \sigma_2 \leadsto C \, \wedge \, \sigma_1 \leq \sigma_2$}
-\end{prooftree}
+%\begin{prooftree}
+%\AxiomC{$\Gamma \vdash_\Downarrow e : \sigma_1 \leadsto C$}
+%\RightLabel{\sc Annot}
+%\UnaryInfC{$\Gamma \vdash_\Downarrow (e :: \sigma_1) : \sigma_2 \leadsto C \, \wedge \, \sigma_1 \leq \sigma_2$}
+%\end{prooftree}
\begin{prooftree}
\AxiomC{$\alpha$ fresh}
@@ -464,20 +480,20 @@ $}
% The reason for including a block of $f_i$s is to cover cases such as {\tt runST $ do ...}, or more clearly {\tt ($) runST (do ...)}, where some combinators are used between functions. Should the rules only include the case $f \, e_1 \, \dots \, f_m$, the fairly common {\tt runST $ do ...} could not be typed without an annotation.
-\subsection*{Why (not) $:_\lambda$?}
+\subsection{Why (not) $:_\lambda$?}
-In this section, we describe the thought process that led to the rule {\sc [$\leq$lcon]} to be applied to unification variables in the left-hand side, and not only to types headed by a constructor, as we first thought.
+In this section, we describe (part of) the thought process that led us to always regard constraints with a unification variable on the left-hand side as $\sim$ constraints. At first, the conversion from $\leq$ to $\sim$ only occurred when the left-hand side involved a type headed by a constructor.
\
-In principle, the only rule that would need to change is that of variables in the term level, which is the point in which instantiation may happen:
+At the very beginning, the only rule that seemed to need a change was that for variables at the term level, which is the point at which instantiation may happen:
\begin{prooftree}
\AxiomC{$\alpha$ fresh}
\AxiomC{$x : \sigma \in \Gamma$}
\RightLabel{\sc VarCon}
\BinaryInfC{$\Gamma \vdash x : \alpha \leadsto \highlight{\sigma \leq \alpha}$}
\end{prooftree}
-Unfortunately, if {\sc [$\leq$lcon]} cannot be applied to unification variables in the left-hand side, this is not enough. Suppose we have the following piece of code:
+Unfortunately, if constraints with a unification variable on the left-hand side can never be turned into equalities, this is not enough. Suppose we have the following piece of code:
\begin{verbatim}
(\f -> (f 1, f True)) (if ... then id else not)
\end{verbatim}
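A simplified variant of this example (we use {\tt id} in place of the elided conditional; the pattern signature is one standard repair) shows both the failure and the fix:

```haskell
{-# LANGUAGE RankNTypes #-}
{-# LANGUAGE ScopedTypeVariables #-}

-- Rejected without an annotation: the lambda-bound f is monomorphic,
-- yet it is used at both Int and Bool:
--   broken = (\f -> (f (1 :: Int), f True)) id
-- Accepted once f carries a higher-rank pattern signature:
ok :: (Int, Bool)
ok = (\(f :: forall a. a -> a) -> (f 1, f True)) id
```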
@@ -492,7 +508,7 @@ At this point we are stuck, since we have no rule that could be applied. One mig
\
-The first solution is to make this situation impossible by generating $\alpha \sim \beta$ and $\alpha \sim \gamma$ instead of their $\leq$ counterparts. We do this by splitting the {\sc [VarCon]} rule in such a way that $\sim$ is generated when the variable comes from an unannotated abstraction or unannotated $\mathsf{let}$. The environment is responsible for keeping track of this fact for each binding, by a small tag, which we denote by $:_\lambda$ in the type rules.
+Our goal then became to make this situation impossible by generating $\alpha \sim \beta$ and $\alpha \sim \gamma$ upfront instead of their $\leq$ counterparts. We did this by splitting the {\sc [VarCon]} rule in such a way that $\sim$ is generated when the variable comes from an unannotated abstraction or unannotated $\mathsf{let}$ (places where we know a fresh unification variable is generated). The environment is responsible for keeping track of this fact for each binding, by a small tag, which we denote by $:_\lambda$ in the type rules.
\begin{prooftree}
\AxiomC{$\highlight{x :_\lambda \tau} \in \Gamma$}
\RightLabel{\sc VarCon${}_\lambda$}
@@ -544,14 +560,11 @@ If we apply {\sc [AppFun]} directly, we instantiate the type of $error :: \foral
\
-Nevertheless, you could always end up in the situation described in the notes of the {\sc [$\leq$lcon]} rule, where a type family rewrites to a type variable. It is easier to just make $\alpha \leq \sigma$ rewrite to $\alpha \sim \sigma$ in any case (but when $\sigma$ is headed by $\forall$).
-This solution goes along with the following philosophy of impredicativity in our system:
-\begin{itemize}
-\item The system is impredicative in the sense that a polymorphic function may be instantiated at a polytype. Or more operationally, a unification variable may be unified with a polytype.
-\item But it does not follow that after being filled in with a polytype, a unification variable can then be instantiated.
-\end{itemize}
+In the end, we realized that we should never have $\alpha \leq \sigma$ constraints in the system as a general rule. Thus, we removed all those special cases and $:_\lambda$ annotations, and instead moved to an operation $\preccurlyeq$ which takes care of not generating $\leq$ if the left-hand side is a variable or type family.
+
+\section{Nice results}
-\subsection*{{\tt f \$ x} is equivalent to {\tt f x}}
+\subsection{{\tt f \$ x} is equivalent to {\tt f x}}
One nice property of the system is that the rule {\sc AppFun} applied to the case {\tt f \$ x} for {\tt f} not in the environment is equivalent to the rule {\sc App} applied to {\tt f x}. In the first case we have:
\begin{prooftree}
@@ -580,15 +593,85 @@ Which is exactly the {\sc App} rule applied to $f \; x$!
$\theta = [\overline{a \to \alpha}] \qquad
\Gamma \vdash_\Downarrow e : \theta(\sigma_s) \leadsto C$}
\RightLabel{\sc App\$Fun}
-\UnaryInfC{$\Gamma \vdash_\Downarrow f \, \$ \, e : \sigma \leadsto C \, \wedge \, \theta(Q) \, \wedge \, \theta(\sigma_r) \leq \sigma$}
+\UnaryInfC{$\Gamma \vdash_\Downarrow f \, \$ \, e : \sigma \leadsto C \, \wedge \, \theta(Q) \, \wedge \, \theta(\sigma_r) \preccurlyeq \sigma$}
+\end{prooftree}
+
+\subsection{$\eta$-reduction and $\eta$-expansion}
+
+As usual in impredicative type systems, it is not always possible to simplify a $\lambda$-abstraction by $\eta$-reduction. The reason is that the $\lambda$-bound variable might have an annotation, which is lost when $\eta$-reducing. Our system, though, still has some nice properties when dealing with $\lambda$-abstractions.
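A small Haskell illustration (the names are ours) of checking a function and its $\eta$-expansion against the same higher-rank signature, the situation analyzed below:

```haskell
{-# LANGUAGE RankNTypes #-}

-- Checking apId against a higher-rank signature succeeds:
apId :: (forall a. a -> a) -> (Int, Bool)
apId f = (f 1, f True)

-- ... and so does checking its eta-expansion against the same type:
apId' :: (forall a. a -> a) -> (Int, Bool)
apId' = \f -> apId f
```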
+
+\paragraph{Checking $f$ is equivalent to checking $\lambda \, x. f \, x$.} More formally, suppose $\Gamma \vdash_{\Downarrow} f :: \forall \overline{a}. Q \Rightarrow \sigma_1 \to \sigma_2$. Then $\Gamma \vdash_{\Downarrow} \lambda \, x. f \, x :: \forall \overline{a}. \, Q \Rightarrow \sigma_1 \to \sigma_2$. There are two cases, depending on whether $f \in \Gamma$:
+\begin{itemize}
+\item Suppose $f :: (\forall \overline{b}. \, Q^* \Rightarrow \sigma^*_1 \to \sigma^*_2) \in \Gamma$. When checking $\Gamma \vdash_{\Downarrow} f :: \forall \overline{a}. Q \Rightarrow \sigma_1 \to \sigma_2$ we get just one constraint by the {\sc [VarCon]} rule:
+$$(\forall \overline{b}. \, Q^* \Rightarrow \sigma^*_1 \to \sigma^*_2)
+\leq (\forall \overline{a}. Q \Rightarrow \sigma_1 \to \sigma_2)$$
+After several steps of canonicalization, we get to the implication:
+$$\forall \overline{a}. \left( Q \supset \theta(Q^*) \wedge \theta(\sigma^*_1) \sim \sigma_1 \wedge \theta(\sigma^*_2) \sim \sigma_2 \right)$$
+We know that implication is true since the type checking was successful.
+
+For the $\eta$-expanded version, we apply the generation and propagation rules:
+\vspace{-0.5cm}
+\begin{prooftree}
+\AxiomC{$\Gamma, x : \sigma_1 \vdash_\Downarrow x : \sigma_1 \leadsto \sigma_1 \leq \theta'(\sigma^*_1)$}
+\RightLabel{\sc AppFun}
+\UnaryInfC{$\Gamma, x : \sigma_1 \vdash_\Downarrow f \, x : \sigma_2 \leadsto \sigma_1 \leq \theta'(\sigma^*_1) \wedge \theta'(Q^*) \wedge \theta'(\sigma^*_2) \leq \sigma_2$}
+\RightLabel{\sc AbsArrow}
+\UnaryInfC{$\Gamma \vdash_\Downarrow \lambda x. f \, x : \sigma_1 \to \sigma_2 \leadsto \sigma_1 \leq \theta'(\sigma^*_1) \wedge \theta'(Q^*) \wedge \theta'(\sigma^*_2) \leq \sigma_2$}
+\RightLabel{\sc Prop$\forall$}
+\UnaryInfC{$\Gamma \vdash_\Downarrow \lambda x. f \, x : \forall \overline{a}. Q \Rightarrow \sigma_1 \to \sigma_2 \leadsto \forall \overline{a}. \left( Q \supset \sigma_1 \leq \theta'(\sigma^*_1) \wedge \theta'(Q^*) \wedge \theta'(\sigma^*_2) \leq \sigma_2 \right)$}
+\end{prooftree}
+The instantiation in the canonicalization rules and in the {\sc AppFun} propagation rule is done in the same way. Thus, we know that $\theta = \theta'$, modulo renaming of variables, and the final constraint in the $\eta$-expanded case is equivalent to:
+$$\left( Q \supset \sigma_1 \leq \theta(\sigma^*_1) \wedge \theta(Q^*) \wedge \theta(\sigma^*_2) \leq \sigma_2 \right)$$
+This last constraint is clearly implied by the one for type checking $f$, by taking as solution for $\leq$ constraints the corresponding equality.
+
+\item Suppose now that $f \not\in \Gamma$ (maybe because it is an expression larger than a single variable). Since we know that $\Gamma \vdash_{\Downarrow} f :: \forall \overline{a}. Q \Rightarrow \sigma_1 \to \sigma_2$ has been proven, the last rule applied must have been {\sc Prop$\forall$}.
+\begin{prooftree}
+\AxiomC{$f :: \sigma_1 \to \sigma_2 \leadsto C$}
+\RightLabel{\sc Prop$\forall$}
+\UnaryInfC{$f :: \forall \overline{a}. \, Q \Rightarrow \sigma_1 \to \sigma_2 \leadsto \forall \overline{a}. \left( Q \supset C \right)$}
+\end{prooftree}
+Now, for the $\eta$-expanded version, we get the following typing derivation:
+\begin{prooftree}
+\AxiomC{$f :: \sigma_1 \to \sigma_2 \leadsto C$}
+\UnaryInfC{$\Gamma, x : \sigma_1 \vdash_\Downarrow f : \alpha \to \sigma_2 \leadsto C \wedge \alpha \sim \sigma_1$}
+\AxiomC{$ $}
+\RightLabel{\sc VarCon}
+\UnaryInfC{$\Gamma, x : \sigma_1 \vdash_\Downarrow x : \alpha \leadsto \sigma_1 \leq \alpha$}
+\RightLabel{\sc App}
+\BinaryInfC{$\Gamma, x : \sigma_1 \vdash_\Downarrow f \, x : \sigma_2 \leadsto C \wedge \alpha \sim \sigma_1 \wedge \sigma_1 \leq \alpha$}
+\RightLabel{\sc AbsArrow}
+\UnaryInfC{$\Gamma \vdash_\Downarrow \lambda x. f \, x : \sigma_1 \to \sigma_2 \leadsto C \wedge \alpha \sim \sigma_1 \wedge \sigma_1 \leq \alpha$}
+\RightLabel{\sc Prop$\forall$}
+\UnaryInfC{$\Gamma \vdash_\Downarrow \lambda x. f \, x : \forall \overline{a}. \, Q \Rightarrow \sigma_1 \to \sigma_2 \leadsto \forall \overline{a}. \left( Q \supset C \wedge \alpha \sim \sigma_1 \wedge \sigma_1 \leq \alpha \right)$}
+\end{prooftree}
+Clearly, we can make $C \wedge \alpha \sim \sigma_1 \wedge \sigma_1 \leq \alpha$ equivalent to $C$ by unifying the fresh variable $\alpha$ with $\sigma_1$. Since it is fresh, we know that it does not appear in any other part of the typing derivation. Thus, we get equivalent constraints from the $\eta$-reduced and $\eta$-expanded versions.
+\end{itemize}
+
+\paragraph{Inferring $f$ is (almost) equivalent to inferring $\lambda x. f \, x$.} When inferring, we apply the propagation rules starting with a fresh unification variable $\gamma$. In this case, the $\eta$-expanded version might not return the exact same type as the $\eta$-reduced version, but a less polymorphic version, depending on the generalization strategy taken by the compiler.
+
+Let us focus on the case when $f : \forall \overline{a}. \, Q \Rightarrow \sigma_1 \to \sigma_2 \in \Gamma$. The $\eta$-expanded version has the following derivation tree:
+\begin{prooftree}
+\AxiomC{$ $}
+\RightLabel{\sc VarCon}
+\UnaryInfC{$\Gamma, x : \alpha \vdash_\Downarrow x : \theta(\sigma_1) \leadsto \alpha \sim \theta(\sigma_1)$}
+\RightLabel{\sc AppFun}
+\UnaryInfC{$\Gamma, x : \alpha \vdash_\Downarrow f \, x : \beta \leadsto \theta(Q) \wedge \alpha \sim \theta(\sigma_1) \wedge \theta(\sigma_2) \leq \beta$}
+\RightLabel{\sc AbsVar}
+\UnaryInfC{$\Gamma \vdash_\Downarrow \lambda x. f \, x : \gamma \leadsto \gamma \sim \alpha \to \beta \wedge \theta(Q) \wedge \alpha \sim \theta(\sigma_1) \wedge \theta(\sigma_2) \leq \beta$}
\end{prooftree}
+If $\sigma_2$ has no $\forall$ inside it, or the approximation strategy involves changing $\leq$ to $\sim$ everywhere, we get as the final set of constraints:
+$$\gamma \sim (\theta(\sigma_1) \to \theta(\sigma_2)) \, \wedge \, \theta(Q)$$
+Once we generalize, we shall get back the original type we started with. However, if we approximate by instantiation and $\sigma_2$ is headed by $\forall$, we might get back a less polymorphic type. For example, if we started with:
+$$f :: (\forall a. a \to a) \to (\forall b. b \to b)$$
+Then we would have:
+$$\lambda x. f \, x :: \forall b. \left( (\forall a. a \to a) \to b \to b \right)$$
%\newpage
-\section*{Examples}
-\subsection*{\tt runST e}
+\section{Examples}
+\subsection{\tt runST e}
$$(\$)^{\alpha \to \beta \to \gamma} \; runST^\alpha \; \left( e \; :: \; \forall s. ST \; s \; Int \right)^\beta$$
$$\begin{array}{rcl}
-\forall a, b. (a \to b) \to a \to b & \leq & \alpha \to \beta \to \gamma \\
+\forall a \, b. (a \to b) \to a \to b & \leq & \alpha \to \beta \to \gamma \\
\forall a. \left( \forall s. ST \; s \; a \right) \to a & \leq & \alpha \\
\forall s. ST \; s \; Int & \leq & \beta \\
\end{array}$$
@@ -623,7 +706,7 @@ $$\begin{array}{rcl}
\gamma & \sim & Int
\end{array}$$
-\subsection*{$\eta$-expansion}
+\subsection{$\eta$-expansion}
$$\begin{array}{l}
f :: \left( \forall a. a \to a \right) \to Int \\