PFPL Chapter 17 Abstract Types
Please refer this link from page 151 to 158.
Data Abstraction and Existential Types
Data abstraction has great importance for structuring programs. The main idea is to introduce an interface that serves as a contract between the client and implementor of an abstract type. The interface specifies
- what client relies on to continue its work.
- what implementor must provide to satisfy the contract.
The interface serves to isolate the client from implementor so that they can be developed isolated. This property is called representation independence for abstract type.
Existential types are introduced in extending language F in formalizing data abstraction.
- Interfaces are existential types that provide a collection of operations on unspecified, or abstract types.
- Implementations are packages, introduction forms of existential types.
- Clients are uses of corresponding elimination form.
Existential Types
The syntax of FE extends F with
$$
\begin{align}
\text{Typ }\tau::=&&\exists(t.\tau)&&\exists(t.\tau)&&\text{interface}\\
\text{Exp }e::=&&\text{pack}\lbrace t.\tau\rbrace\lbrace\rho\rbrace(e)&&\text{pack }\rho\text{ with }e\text{ as }\exists(t.\tau)&&\text{implementation}\\
&&\text{open}\lbrace t.\tau\rbrace\lbrace\rho\rbrace(e_1;t,x.e_2)&&\text{open }e_1\text{ as }t\text{ with }x:\tau\text{ in }e_2&&\text{client}
\end{align}
$$
Introduction form $\exists(t.\tau)$ is a package of form $\text{pack }\rho\text{ with }e\text{ as }\exists(t.\tau)$, where $\rho\text{ type}$ and $e:\lbrack\rho/t\rbrack\tau$.
- $\rho$ is the representation type of the package
- $e$ is the implementation of the package
Elimination form $\text{open }e_1\text{ as }t\text{ with }x:\tau\text{ in }e_2$, which opens package $e_1$ within client $e_2$ by binding its representation type to $t$ and its implementation to $x$.
Statics
$$
\displaylines{\dfrac{\Delta,t\text{ type}\vdash\tau\text{ type}}{\Delta\vdash\exists(t.\tau)\text{ type}}\\
\dfrac{\Delta\vdash\rho\text{ type}\phantom{“”}\Delta,t\text{ type}\vdash\tau\text{ type}\phantom{“”}\Delta\space\Gamma\vdash e:\lbrack\rho/t\rbrack\tau}{\Delta\space\Gamma\vdash\text{pack}\lbrace t.\tau\rbrace\lbrace\rho\rbrace(e):\exists(t.\tau)}\\
\dfrac{\Delta\space\Gamma\vdash e_1:\exists(t.\tau)\phantom{“”}\Delta,t\text{ type }\Gamma,x:\tau\vdash e_2:\tau_2\phantom{“”}\Delta\vdash\tau_2\text{ type}}{\Delta\space\Gamma\vdash\text{open}\lbrace t.\tau\rbrace\lbrace\tau_2\rbrace(e_1;t,x.e_2):\tau_2}}
$$
For the last rule, 2 things need to be noticed:
- $\tau_2$ as the type of client, must not involve abstract type $t$. The restriction prevents the clients from export a value of abstract type out of scope of definition.
- $e_2$ as the body of client, is type checked without knowledge of the representation type $t$, which means client is polymorphic in type variable $t$.
Regularity Lemma
Suppose $\Delta\space\Gamma\vdash e:\tau$. If $\Delta\vdash\tau_i\text{ type}$ for each $x_i:\tau_i$ in $\Gamma$, then $\Delta\vdash\tau\text{ type}$.
Dynamics
$$
\displaylines{\dfrac{\lbrack e\text{ val}\rbrack}{\text{pack}\lbrace t.\tau\rbrace\lbrace\rho\rbrace(e)\text{ val}}\\
\Big\lbrack\dfrac{e\longmapsto e’}{\text{pack}\lbrace t.\tau\rbrace\lbrace\rho\rbrace(e)\longmapsto\text{pack}\lbrace t.\tau\rbrace\lbrace\rho\rbrace(e’)}\Big\rbrack\\
\dfrac{e_1\longmapsto e_1’}{\text{open}\lbrace t.\tau\rbrace\lbrace\tau_2\rbrace(e_1;t,x.e_2)\longmapsto\text{open}\lbrace t.\tau\rbrace\lbrace\tau_2\rbrace(e_1’;t,x.e_2)}\\
\dfrac{\lbrack e\text{ val}\rbrack}{\text{open}\lbrace t.\tau\rbrace\lbrace\tau_2\rbrace(\text{pack}\lbrace t.\tau\rbrace\lbrace\rho\rbrace(e);t,x.e_2)\longmapsto\lbrack\rho,e/t,x\rbrack e_2}}
$$
NO ABSTRACT TYPES AT RUN TIME!
Since the representation type is propagated to client by substitution when package is opened, thus eliminates the abstraction boundary between client and implementor.
Safety
Preservation
If $e:\tau$ and $e\longmapsto e’$, then $e’:\tau$.
Canonical Forms
If $e:\exists(t.\tau)$ and $e\text{ val}$, then $e=\text{pack}\lbrace t.\tau\rbrace\lbrace\rho\rbrace(e’)$ for some type $\rho$ and some $e’:\lbrack\rho/t\rbrack\tau$.
Progress
If $e:\tau$, then either $e\text{ val}$ or $\exists e’,e\longmapsto e’$.
Data Abstraction
Example first into FE by showing abstract type of queues of natural numbers with
- Forming empty queue
- Inserting an element at the tail of the queue
- Removing the head of the queue (assumed non-empty)
The behavior of the queue can be expressed by the existential type $\exists(t.\tau)$ for interface of queue abstraction.
$$
\exists(t.\langle\text{emp}\hookrightarrow t,\text{ins}\hookrightarrow\text{nat}\times t\to t,\text{rem}\hookrightarrow t\to(\text{nat}\times t)\text{ opt}\rangle)
$$
All we know is that it supports operations
emp
ins
rem
without knowing the exact type sincet
is abstract.
An example for queue as $e_l$ can be $\text{pack natlist with }\langle\text{emp}\hookrightarrow\text{nil},\text{ins}\hookrightarrow e_i,\text{rem}\hookrightarrow e_r\rangle\text{ as }\exists(t.\tau)$ where
- $e_i:\text{nat}\times\text{natlist}\to\text{natlist}$
- $e_r:\text{natlist}\to\text{nat}\times\text{natlist}$
These operations know that queues are represented as values of type
natlist
and programmed accordingly.
Another example, which is
$$
\text{pack natlist}\times\text{natlist with }\langle\text{emp}\hookrightarrow\langle\text{nil},\text{nil}\rangle,\text{ins}\hookrightarrow e_i,\text{rem}\hookrightarrow e_r\rangle\text{ as }\exists(t.\tau)
$$
- $e_i:\text{nat}\times(\text{natlist}\times\text{natlist})\to\text{natlist}\times\text{natlist}$
- $e_r:(\text{natlist}\times\text{natlist})\to\text{nat}\times(\text{natlist}\times\text{natlist})$
Important point is that same client type checks regardless of which implementation of queues we choose, because representation type is hidden, or held abstract from the client during type checking.
THE CLIENT IS INDEPENDENT OF REPRESENTATION OF ABSTRACT TYPE.
Definability of Existential Types
FE is not a proper extension of F since existential types are definable in terms of universal types $\forall$.
We can see that client, $\text{open}\lbrace t.\tau\rbrace\lbrace\tau_2\rbrace(e_1;t,x.e_2)$ has $e_1:\exists(t.\tau)$ and $e_2:\tau_2$, has a polymorphic type $\forall(t.\tau\to\tau_2)$, since $t$ may occur in $\tau$ but not in $\tau_2$.
$$
\begin{align}
\exists(t.\tau)&\triangleq\forall(u.\forall(t.\tau\to u)\to u)\\
\text{pack }\rho\text{ with }e\text{ as }\exists(t.\tau)&\triangleq\Lambda(u)\space\lambda(x:\forall(t.\tau\to u))\space x\lbrack\rho\rbrack(e)\\
\text{open }e_1\text{ as }t\text{ with }x:\tau\text{ in }e_2&\triangleq e_1\lbrack\tau_2\rbrack(\Lambda(t)\space\lambda(x:\tau)\space e_2)
\end{align}
$$
$u$ is the overall result type, which stands for the $\lbrack\rho/t\rbrack\tau$ here.
Coinductive Stream
$$
\begin{align}
\text{stream}&\triangleq\exists(t.(t\to(\text{nat}\times t))\times t)\\
\text{strgen }x\text{ is }e\text{ in }\langle\text{hd}\hookrightarrow e_0,\text{tl}\hookrightarrow e_1\rangle&\triangleq\text{pack }\tau\text{ with }\langle \lambda(x:\tau)\space\langle e_0,e_1\rangle,e\rangle\text{ as stream}\\
\text{hd}(e)&\triangleq\text{open }e\text{ as }t\text{ with }\langle f:t\to(\text{nat}\times t),x:t\rangle\text{ in }f(x)\cdot\text{l}\\
\text{tl}(e)&\triangleq\text{open }e\text{ as }t\text{ with }\langle f:t\to(\text{nat}\times t),x:t\rangle\text{ in }\dots\text{ where}\\
\dots&\triangleq\text{strgen }x\text{ is }f(x)\cdot\text{r}\text{ in }\langle\text{hd}\hookrightarrow f(x)\cdot\text{l},\text{tl}\hookrightarrow f(x)\cdot\text{r}\rangle
\end{align}
$$
General Coinductive Type
We generalize the previous coinductive stream
$$
\begin{align}
\nu(t.\tau)&\triangleq\exists(t.(t\to\tau)\times t)\\
\text{gen}\lbrace.\rbrace(x.e’;e)&\triangleq\text{pack }\sigma\text{ with }\langle\lambda(x:t)e’,e\rangle\text{ as }\nu(t.\tau)\\
\text{unfold}\lbrace\rbrace(e)&\triangleq\text{open }e\text{ as }t\text{ with }\langle g:t\to\tau,x:t\rangle\text{ in }\dots\text{ where}\\
\dots&\triangleq\text{map}\lbrace t.\tau\rbrace(y.\text{gen}\lbrace.\rbrace(k.n(k);y))(n(x))
\end{align}
$$
Representation Independence
An important consequence of parametricity is ensuring insensitive clients ($e_2$) to representations of abstract types.
More precisely, there is a criterion, bi-similarity, for relating 2 implementations of an abstract type such that the behavior of a client is unaffected by swapping one implementation by another that is bi-similar to it.
In this way, if we want to prove the correctness of a candidate implementation, we just require it to be bi-similar to an correct reference implementation.
Since candidate and reference implementation are bi-similar, no client may distinguish them from one another.
We can see that client $c$ of $\exists(t.\tau)$ has polymorphic type $\forall(t.\tau\to\tau_2)$, where $t$ doesn’t occur free in $\tau_2$.
According to Chapter 16, if $R$ is a bi-simulation relation between any 2 implementation of abstract type, then client behaves identically on them.
In fact, $t$ does not occur free in $\tau_2$ ensures that behavior of client is independent of choice between implementation, provided that the relation is preserved by operation implement it.
Consider the existential type $\exists(t.\tau)$ where $\tau$ specifies the abstract types of queues.
$$
\langle\text{emp}\hookrightarrow t,\text{ins}\hookrightarrow\text{nat}\times t\to t,\text{rem}\hookrightarrow t\to(\text{nat}\times t)\text{ opt}\rangle
$$
If $\rho$ and $\rho’$ are any two closed types, and if $R$ is a relation between expressions of $\rho$ and $\rho’$, then if implementations $e:\lbrack\rho/x\rbrack\tau$ and $e’:\lbrack\rho’/x\rbrack\tau$ respect $R$, then $c\lbrack\rho\rbrack(e)$ behaves the same as $c\lbrack\rho’\rbrack(e’)$.
Let $e\triangleq\langle\text{emp}\hookrightarrow e_m,\text{ins}\hookrightarrow e_i,\text{rem}\hookrightarrow e_r\rangle$ and $e’\triangleq\langle\text{emp}\hookrightarrow e_m’,\text{ins}\hookrightarrow e_i’,\text{rem}\hookrightarrow e_r’\rangle$.
To respect $R$ for these implementations ($e$ and $e’$), three conditions required:
- Empty queues are related: $R(e_m,e_m’)$.
- Inserting same element on related queues yield related queues: If $d:\tau$ and $R(q,q’)$, then $R(e_i\langle d,q\rangle,e_i’\langle d,q’\rangle)$.
- If two queues are related, then only two situations on using
rem
- $e_r(q)\cong\text{null}\cong e_r’(q’)$
- $e_r(q)\cong\text{just}(\langle d,r\rangle)$ and $e_r’(q’)\cong\text{just}(\langle d’,r’\rangle)$ with $d\cong d’$ and $R(r,r’)$.
The terminology stems from requirement that the operations of the abstract type preserve the relation: if it holds before an operation is performed, then it must also hold afterwards.
Thus each implementation simulates the other up to the relationship specified by $R$.
Natural Number Queue For Bi-similar
Previously we talked about the queue
existential type, then we use natlist
as $\rho$ to get following code
$$
\begin{align}
t&\triangleq\text{natlist}\\
\text{emp}&\triangleq\text{nil}\\
\text{ins}&\triangleq\lambda(x:\text{nat})\space\lambda(q:t)\space\text{cons}(x;q)\\
\text{rem}&\triangleq\lambda(q:t)\space\text{case rev}(q)\lbrace\text{nil}\hookrightarrow\text{null}\shortmid\text{cons}(f;qr)\hookrightarrow\text{just}\langle f,\text{rev}(qr)\rangle\rbrace
\end{align}
$$
Also we can see that we can use natlist x natlist
as $\rho$, the code change to
$$
\begin{align}
t&\triangleq\text{natlist}\times\text{natlist}\\
\text{emp}&\triangleq\langle\text{nil},\text{nil}\rangle\\
\text{ins}&\triangleq\lambda(x:\text{nat})\space\lambda(\langle bs,fs\rangle:t)\space\langle\text{cons}(x;bs),fs\rangle\\
\text{rem}&\triangleq\lambda(\langle bs,fs\rangle:t)\space\text{case }fs\lbrace\text{nil}\hookrightarrow e\shortmid\text{cons}(f;fs’)\hookrightarrow\text{just}\langle f,\langle bs,fs’\rangle\rangle\rbrace, \text{where}\\
e&\triangleq\text{case rev}(bs)\lbrace\text{nil}\hookrightarrow\text{null}\shortmid\text{cons}(b;bs’)\hookrightarrow\text{just}\langle b,\langle\text{nil},bs’\rangle\rangle\rbrace
\end{align}
$$
Parametricity for Poly Types to RI
Representation independence for abstract types is a corollary of the parametricity theorem for polymorphic types.
Recall definition of $\exists(t.\tau)$ in FE as $\forall(u.\forall(t.\tau\to u)\to u)$ in F, the type of abstract type of queue
we discussed has polymorphic type
$$
\forall(u.\forall(t.\tau_{queue}\to u)\to u)
$$
where $\tau_{queue}$ is of type
$$
\langle\text{emp}\hookrightarrow t,\text{ins}\hookrightarrow\text{nat}\times t\to t,\text{rem}\hookrightarrow t\to(\text{nat}\times t)\text{ opt}\rangle
$$
For any choice of $\rho$ of result type, the client of the abstraction is polymorphic type $\forall(t.\tau_{queue}\to\rho)$, where $\rho$ is fixed. Consider a binary relation $R$ and check if it coincides with the conditions.
Assume the identity relation on $\rho$ yields the desired result that the two implementations of queues are observably indistinguishable in FE.