PFPL Chapter 17 Abstract Types

Posted on 2019-04-11 In PFPL Note Disqus:

Please refer this link from page 151 to 158.

Data Abstraction and Existential Types

Data abstraction has great importance for structuring programs. The main idea is to introduce an interface that serves as a contract between the client and implementor of an abstract type. The interface specifies

what client relies on to continue its work.
what implementor must provide to satisfy the contract.

The interface serves to isolate the client from implementor so that they can be developed isolated. This property is called representation independence for abstract type.

Existential types are introduced in extending language F in formalizing data abstraction.

Interfaces are existential types that provide a collection of operations on unspecified, or abstract types.
Implementations are packages, introduction forms of existential types.
Clients are uses of corresponding elimination form.

Existential Types

The syntax of FE extends F with
$$
\begin{align}
\text{Typ }\tau::=&&\exists(t.\tau)&&\exists(t.\tau)&&\text{interface}\\
\text{Exp }e::=&&\text{pack}\lbrace t.\tau\rbrace\lbrace\rho\rbrace(e)&&\text{pack }\rho\text{ with }e\text{ as }\exists(t.\tau)&&\text{implementation}\\
&&\text{open}\lbrace t.\tau\rbrace\lbrace\rho\rbrace(e_1;t,x.e_2)&&\text{open }e_1\text{ as }t\text{ with }x:\tau\text{ in }e_2&&\text{client}
\end{align}
$$
Introduction form $\exists(t.\tau)$ is a package of form $\text{pack }\rho\text{ with }e\text{ as }\exists(t.\tau)$, where $\rho\text{ type}$ and $e:\lbrack\rho/t\rbrack\tau$.

$\rho$ is the representation type of the package
$e$ is the implementation of the package

Elimination form $\text{open }e_1\text{ as }t\text{ with }x:\tau\text{ in }e_2$, which opens package $e_1$ within client $e_2$ by binding its representation type to $t$ and its implementation to $x$.

Statics

$$
\displaylines{\dfrac{\Delta,t\text{ type}\vdash\tau\text{ type}}{\Delta\vdash\exists(t.\tau)\text{ type}}\\
\dfrac{\Delta\vdash\rho\text{ type}\phantom{“”}\Delta,t\text{ type}\vdash\tau\text{ type}\phantom{“”}\Delta\space\Gamma\vdash e:\lbrack\rho/t\rbrack\tau}{\Delta\space\Gamma\vdash\text{pack}\lbrace t.\tau\rbrace\lbrace\rho\rbrace(e):\exists(t.\tau)}\\
\dfrac{\Delta\space\Gamma\vdash e_1:\exists(t.\tau)\phantom{“”}\Delta,t\text{ type }\Gamma,x:\tau\vdash e_2:\tau_2\phantom{“”}\Delta\vdash\tau_2\text{ type}}{\Delta\space\Gamma\vdash\text{open}\lbrace t.\tau\rbrace\lbrace\tau_2\rbrace(e_1;t,x.e_2):\tau_2}}
$$

For the last rule, 2 things need to be noticed:

$\tau_2$ as the type of client, must not involve abstract type $t$. The restriction prevents the clients from export a value of abstract type out of scope of definition.

$e_2$ as the body of client, is type checked without knowledge of the representation type $t$, which means client is polymorphic in type variable $t$.

Regularity Lemma

Suppose $\Delta\space\Gamma\vdash e:\tau$. If $\Delta\vdash\tau_i\text{ type}$ for each $x_i:\tau_i$ in $\Gamma$, then $\Delta\vdash\tau\text{ type}$.

Dynamics

$$
\displaylines{\dfrac{\lbrack e\text{ val}\rbrack}{\text{pack}\lbrace t.\tau\rbrace\lbrace\rho\rbrace(e)\text{ val}}\\
\Big\lbrack\dfrac{e\longmapsto e’}{\text{pack}\lbrace t.\tau\rbrace\lbrace\rho\rbrace(e)\longmapsto\text{pack}\lbrace t.\tau\rbrace\lbrace\rho\rbrace(e’)}\Big\rbrack\\
\dfrac{e_1\longmapsto e_1’}{\text{open}\lbrace t.\tau\rbrace\lbrace\tau_2\rbrace(e_1;t,x.e_2)\longmapsto\text{open}\lbrace t.\tau\rbrace\lbrace\tau_2\rbrace(e_1’;t,x.e_2)}\\
\dfrac{\lbrack e\text{ val}\rbrack}{\text{open}\lbrace t.\tau\rbrace\lbrace\tau_2\rbrace(\text{pack}\lbrace t.\tau\rbrace\lbrace\rho\rbrace(e);t,x.e_2)\longmapsto\lbrack\rho,e/t,x\rbrack e_2}}
$$

NO ABSTRACT TYPES AT RUN TIME!

Since the representation type is propagated to client by substitution when package is opened, thus eliminates the abstraction boundary between client and implementor.

Safety

Preservation

If $e:\tau$ and $e\longmapsto e’$, then $e’:\tau$.

Canonical Forms

If $e:\exists(t.\tau)$ and $e\text{ val}$, then $e=\text{pack}\lbrace t.\tau\rbrace\lbrace\rho\rbrace(e’)$ for some type $\rho$ and some $e’:\lbrack\rho/t\rbrack\tau$.

Progress

If $e:\tau$, then either $e\text{ val}$ or $\exists e’,e\longmapsto e’$.

Data Abstraction

Example first into FE by showing abstract type of queues of natural numbers with

Forming empty queue
Inserting an element at the tail of the queue
Removing the head of the queue (assumed non-empty)

The behavior of the queue can be expressed by the existential type $\exists(t.\tau)$ for interface of queue abstraction.
$$
\exists(t.\langle\text{emp}\hookrightarrow t,\text{ins}\hookrightarrow\text{nat}\times t\to t,\text{rem}\hookrightarrow t\to(\text{nat}\times t)\text{ opt}\rangle)
$$

All we know is that it supports operations emp ins rem without knowing the exact type since t is abstract.

An example for queue as $e_l$ can be $\text{pack natlist with }\langle\text{emp}\hookrightarrow\text{nil},\text{ins}\hookrightarrow e_i,\text{rem}\hookrightarrow e_r\rangle\text{ as }\exists(t.\tau)$ where

$e_i:\text{nat}\times\text{natlist}\to\text{natlist}$
$e_r:\text{natlist}\to\text{nat}\times\text{natlist}$

These operations know that queues are represented as values of type natlist and programmed accordingly.

Another example, which is
$$
\text{pack natlist}\times\text{natlist with }\langle\text{emp}\hookrightarrow\langle\text{nil},\text{nil}\rangle,\text{ins}\hookrightarrow e_i,\text{rem}\hookrightarrow e_r\rangle\text{ as }\exists(t.\tau)
$$

$e_i:\text{nat}\times(\text{natlist}\times\text{natlist})\to\text{natlist}\times\text{natlist}$
$e_r:(\text{natlist}\times\text{natlist})\to\text{nat}\times(\text{natlist}\times\text{natlist})$

Important point is that same client type checks regardless of which implementation of queues we choose, because representation type is hidden, or held abstract from the client during type checking.

THE CLIENT IS INDEPENDENT OF REPRESENTATION OF ABSTRACT TYPE.

Definability of Existential Types

FE is not a proper extension of F since existential types are definable in terms of universal types $\forall$.

We can see that client, $\text{open}\lbrace t.\tau\rbrace\lbrace\tau_2\rbrace(e_1;t,x.e_2)$ has $e_1:\exists(t.\tau)$ and $e_2:\tau_2$, has a polymorphic type $\forall(t.\tau\to\tau_2)$, since $t$ may occur in $\tau$ but not in $\tau_2$.
$$
\begin{align}
\exists(t.\tau)&\triangleq\forall(u.\forall(t.\tau\to u)\to u)\\
\text{pack }\rho\text{ with }e\text{ as }\exists(t.\tau)&\triangleq\Lambda(u)\space\lambda(x:\forall(t.\tau\to u))\space x\lbrack\rho\rbrack(e)\\
\text{open }e_1\text{ as }t\text{ with }x:\tau\text{ in }e_2&\triangleq e_1\lbrack\tau_2\rbrack(\Lambda(t)\space\lambda(x:\tau)\space e_2)
\end{align}
$$

$u$ is the overall result type, which stands for the $\lbrack\rho/t\rbrack\tau$ here.

Coinductive Stream

$$
\begin{align}
\text{stream}&\triangleq\exists(t.(t\to(\text{nat}\times t))\times t)\\
\text{strgen }x\text{ is }e\text{ in }\langle\text{hd}\hookrightarrow e_0,\text{tl}\hookrightarrow e_1\rangle&\triangleq\text{pack }\tau\text{ with }\langle \lambda(x:\tau)\space\langle e_0,e_1\rangle,e\rangle\text{ as stream}\\
\text{hd}(e)&\triangleq\text{open }e\text{ as }t\text{ with }\langle f:t\to(\text{nat}\times t),x:t\rangle\text{ in }f(x)\cdot\text{l}\\
\text{tl}(e)&\triangleq\text{open }e\text{ as }t\text{ with }\langle f:t\to(\text{nat}\times t),x:t\rangle\text{ in }\dots\text{ where}\\
\dots&\triangleq\text{strgen }x\text{ is }f(x)\cdot\text{r}\text{ in }\langle\text{hd}\hookrightarrow f(x)\cdot\text{l},\text{tl}\hookrightarrow f(x)\cdot\text{r}\rangle
\end{align}
$$

General Coinductive Type

We generalize the previous coinductive stream
$$
\begin{align}
\nu(t.\tau)&\triangleq\exists(t.(t\to\tau)\times t)\\
\text{gen}\lbrace.\rbrace(x.e’;e)&\triangleq\text{pack }\sigma\text{ with }\langle\lambda(x:t)e’,e\rangle\text{ as }\nu(t.\tau)\\
\text{unfold}\lbrace\rbrace(e)&\triangleq\text{open }e\text{ as }t\text{ with }\langle g:t\to\tau,x:t\rangle\text{ in }\dots\text{ where}\\
\dots&\triangleq\text{map}\lbrace t.\tau\rbrace(y.\text{gen}\lbrace.\rbrace(k.n(k);y))(n(x))
\end{align}
$$

Representation Independence

An important consequence of parametricity is ensuring insensitive clients ($e_2$) to representations of abstract types.

More precisely, there is a criterion, bi-similarity, for relating 2 implementations of an abstract type such that the behavior of a client is unaffected by swapping one implementation by another that is bi-similar to it.

In this way, if we want to prove the correctness of a candidate implementation, we just require it to be bi-similar to an correct reference implementation.

Since candidate and reference implementation are bi-similar, no client may distinguish them from one another.

We can see that client $c$ of $\exists(t.\tau)$ has polymorphic type $\forall(t.\tau\to\tau_2)$, where $t$ doesn’t occur free in $\tau_2$.

According to Chapter 16, if $R$ is a bi-simulation relation between any 2 implementation of abstract type, then client behaves identically on them.

In fact, $t$ does not occur free in $\tau_2$ ensures that behavior of client is independent of choice between implementation, provided that the relation is preserved by operation implement it.

Consider the existential type $\exists(t.\tau)$ where $\tau$ specifies the abstract types of queues.
$$
\langle\text{emp}\hookrightarrow t,\text{ins}\hookrightarrow\text{nat}\times t\to t,\text{rem}\hookrightarrow t\to(\text{nat}\times t)\text{ opt}\rangle
$$
If $\rho$ and $\rho’$ are any two closed types, and if $R$ is a relation between expressions of $\rho$ and $\rho’$, then if implementations $e:\lbrack\rho/x\rbrack\tau$ and $e’:\lbrack\rho’/x\rbrack\tau$ respect $R$, then $c\lbrack\rho\rbrack(e)$ behaves the same as $c\lbrack\rho’\rbrack(e’)$.

Let $e\triangleq\langle\text{emp}\hookrightarrow e_m,\text{ins}\hookrightarrow e_i,\text{rem}\hookrightarrow e_r\rangle$ and $e’\triangleq\langle\text{emp}\hookrightarrow e_m’,\text{ins}\hookrightarrow e_i’,\text{rem}\hookrightarrow e_r’\rangle$.

To respect $R$ for these implementations ($e$ and $e’$), three conditions required:

Empty queues are related: $R(e_m,e_m’)$.
Inserting same element on related queues yield related queues: If $d:\tau$ and $R(q,q’)$, then $R(e_i\langle d,q\rangle,e_i’\langle d,q’\rangle)$.
If two queues are related, then only two situations on using rem
- $e_r(q)\cong\text{null}\cong e_r’(q’)$
- $e_r(q)\cong\text{just}(\langle d,r\rangle)$ and $e_r’(q’)\cong\text{just}(\langle d’,r’\rangle)$ with $d\cong d’$ and $R(r,r’)$.

The terminology stems from requirement that the operations of the abstract type preserve the relation: if it holds before an operation is performed, then it must also hold afterwards.

Thus each implementation simulates the other up to the relationship specified by $R$.

Natural Number Queue For Bi-similar

Previously we talked about the queue existential type, then we use natlist as $\rho$ to get following code
$$
\begin{align}
t&\triangleq\text{natlist}\\
\text{emp}&\triangleq\text{nil}\\
\text{ins}&\triangleq\lambda(x:\text{nat})\space\lambda(q:t)\space\text{cons}(x;q)\\
\text{rem}&\triangleq\lambda(q:t)\space\text{case rev}(q)\lbrace\text{nil}\hookrightarrow\text{null}\shortmid\text{cons}(f;qr)\hookrightarrow\text{just}\langle f,\text{rev}(qr)\rangle\rbrace
\end{align}
$$
Also we can see that we can use natlist x natlist as $\rho$, the code change to
$$
\begin{align}
t&\triangleq\text{natlist}\times\text{natlist}\\
\text{emp}&\triangleq\langle\text{nil},\text{nil}\rangle\\
\text{ins}&\triangleq\lambda(x:\text{nat})\space\lambda(\langle bs,fs\rangle:t)\space\langle\text{cons}(x;bs),fs\rangle\\
\text{rem}&\triangleq\lambda(\langle bs,fs\rangle:t)\space\text{case }fs\lbrace\text{nil}\hookrightarrow e\shortmid\text{cons}(f;fs’)\hookrightarrow\text{just}\langle f,\langle bs,fs’\rangle\rangle\rbrace, \text{where}\\
e&\triangleq\text{case rev}(bs)\lbrace\text{nil}\hookrightarrow\text{null}\shortmid\text{cons}(b;bs’)\hookrightarrow\text{just}\langle b,\langle\text{nil},bs’\rangle\rangle\rbrace
\end{align}
$$

Parametricity for Poly Types to RI

Representation independence for abstract types is a corollary of the parametricity theorem for polymorphic types.

Recall definition of $\exists(t.\tau)$ in FE as $\forall(u.\forall(t.\tau\to u)\to u)$ in F, the type of abstract type of queue we discussed has polymorphic type
$$
\forall(u.\forall(t.\tau_{queue}\to u)\to u)
$$
where $\tau_{queue}$ is of type
$$
\langle\text{emp}\hookrightarrow t,\text{ins}\hookrightarrow\text{nat}\times t\to t,\text{rem}\hookrightarrow t\to(\text{nat}\times t)\text{ opt}\rangle
$$
For any choice of $\rho$ of result type, the client of the abstraction is polymorphic type $\forall(t.\tau_{queue}\to\rho)$, where $\rho$ is fixed. Consider a binary relation $R$ and check if it coincides with the conditions.

Assume the identity relation on $\rho$ yields the desired result that the two implementations of queues are observably indistinguishable in FE.