PFPL Chapter 19 System PCF of Recursive Functions
Please refer this link from page 167 to 176.
Intros on General Recursion and Fixpoint
System T is introduced as a basis for discussing total computations, those for which the type systems guarantees termination.
Language M generalizes T to admit inductive and coinductive types, while preserving totality.
PCF will be introduced here as a basis for discussing partial computations, those may not terminate while evaluated, even if they are well typed.
Seems like a disadvantage, it admits greater expressive power than is possible in T.
The source of partiality in PCF is the concept of general recursion, permitting solution of equations between expressions. We can see the advantages and disadvantages clearly that:
- The price of admitting solutions to all such equations is that the computation may not terminate: the solution of some equations may be divergent or undefined.
- The advantage is that termination proof need not be embedded into the code, resulting in short code.
A solution to such a system of equations is a fixed point of an associated functional (higher-order function).
Fraction as Fixpoint Example First Intro
Consider equations which define factorial function. They form a system of simultaneous equations in unknown $f$ ranges over functions on $\mathbb{N}$.
$$
\begin{aligned}
f(0)&\triangleq 1\\
f(n+1)&\triangleq(n+1)\times f(n)
\end{aligned}
$$
The function we seek is a solution to these solutions as $f:\mathbb{N}\to\mathbb{N}$ satisfying all conditions. A solution to such a system is a fixed point of an higher-order function.
We seek $f$ given by
$$
n\mapsto\begin{cases}
1&\text{if }n=0\\
n\times f(n’)&\text{if }n=n’+1
\end{cases}
$$
But when we try to define the functional, or higher-order function $F$ by equation $F(f)=f’$, where $f’$ share the same definition of $f$. Surprise.
The function $f$ we seek is a fixed point of $F$. Or so to say $f$ is defined to be $fix(F)$.
Partial and Fixpoint
The key for fixed point in functions in PCF are partial, meaning that they may diverge on some (or even all) inputs.
Consequently, a fixed point of a partial function $F$ is the limit of a series of approximations of desired solutions obtained by iterating $F$.
- Let a partial function $\phi$ on natural number, is an approximation to a total function $f$.
- Let $\bot:\mathbb{N}\rightharpoonup\mathbb{N}$ be the totally undefined partial function $\forall\space n\in \mathbb{N}$. This is the worst appriximation.
- Given any approximation $\phi$ of $f$, it may always improve to $\phi’=F(\phi)$.
- If we start with as $\bot$ and pass to $\lim_{i\geq0}F^{(i)}(\bot)$, we may obtain the least approximation for $f$ that is defined $\forall\space m\in\mathbb{N}$, hence total function $f$ itself. (If the limit exists)
The solution is given by general recursion, there is no guarantee that the solution is a total function.
Statics
$$
\begin{aligned}
\text{Typ}&&\tau&&::=&&\text{nat}&&\text{nat}&&\text{naturals}\\
&&&&&&\rightharpoonup(\tau_1;\tau_2)&&\tau_1\rightharpoonup\tau_2&&\text{partial function}\\
\text{Exp}&&e&&::=&&x&&x&&\text{variable}\\
&&&&&&z&&z&&\text{zero}\\
&&&&&&s(e)&&s(e)&&\text{successor}\\
&&&&&&\text{ifz}\lbrace e_0;x.e_1\rbrace(e)&&\text{ifz }e\lbrace z\hookrightarrow e_0\shortmid s(x)\hookrightarrow e_1\rbrace&&\text{zero test}\\
&&&&&&\lambda\lbrace\tau\rbrace(x.e)&&\lambda(x:\tau)\space e&&\text{abstraction}\\
&&&&&&\text{ap}(e_1;e_2)&&e_1(e_2)&&\text{application}\\
&&&&&&\text{fix}\lbrace\tau\rbrace(x.e)&&\text{fix }x:\tau\text{ is }e&&\text{recursion}
\end{aligned}
$$
The statics of PCF is inductively defined by following rules
$$
\displaylines{\dfrac{}{\Gamma,x:\tau\vdash x:\tau}\\
\dfrac{}{\Gamma\vdash z:\text{nat}}\\
\dfrac{\Gamma\vdash e:\text{nat}}{\Gamma\vdash s(e):\text{nat}}\\
\dfrac{\Gamma\vdash e:\text{nat}\phantom{“”}\Gamma\vdash e_0:\tau\phantom{“”}\Gamma,x:\text{nat}\vdash e_1:\tau}{\Gamma\vdash\text{ifz}\lbrace e_0;x.e_1\rbrace(e):\tau}\\
\dfrac{\Gamma,x:\tau_1\vdash e:\tau_2}{\Gamma\vdash\lambda\lbrace\tau_1\rbrace(x.e):\rightharpoonup(\tau_1;\tau_2)}\\
\dfrac{\Gamma\vdash e_1:\to(\tau_2;\tau)\phantom{“”}\Gamma\vdash e_2:\tau_2}{\Gamma\vdash\text{ap}(e_1;e_2):\tau}\\
\dfrac{\Gamma,x:\tau\vdash e:\tau}{\Gamma\vdash\text{fix}\lbrace\tau\rbrace(x.e):\tau}}
$$
Lemma 1. If $\Gamma,x:\tau\vdash e’:\tau’$, $\Gamma\vdash e:\tau$, then $\Gamma\vdash\lbrack e/x\rbrack e’:\tau’$.
Dynamics
Judgment $e\text{ val}$ is defined by following rules
$$
\displaylines{\dfrac{}{z\text{ val}}\\
\dfrac{\lbrack e\text{ val}\rbrack}{s(e)\text{ val}}\\
\dfrac{}{\lambda\lbrace\tau\rbrace(x.e)\text{ val}}}
$$
The transition judgment $e\longmapsto e’$ is defined by following rules
$$
\displaylines{\Big\lbrack\dfrac{e\longmapsto e’}{s(e)\longmapsto s(e’)}\Big\rbrack\\
\dfrac{e\longmapsto e’}{\text{ifz}\lbrace e_0;x.e_1\rbrace(e)\longmapsto\text{ifz}\lbrace e_0;x.e_1\rbrace(e’)}\\
\dfrac{}{\text{ifz}\lbrace e_0;x.e_1\rbrace(z)\longmapsto e_0}\\
\dfrac{s(e)\text{ val}}{\text{ifz}\lbrace e_0;x.e_1\rbrace(s(e))\longmapsto\lbrack e/x\rbrack e_1}\\
\dfrac{e_1\longmapsto e_1’}{\text{ap}(e_1;e_2)\longmapsto\text{ap}(e_1’;e_2)}\\
\Big\lbrack\dfrac{e_1\text{ val}\phantom{“”}e_2\longmapsto e_2’}{\text{ap}(e_1;e_2)\longmapsto\text{ap}(e_1;e_2)}\Big\rbrack\\
\dfrac{\lbrack e_2\text{ val}\rbrack}{\text{ap}(\lambda\lbrace\tau\rbrace(x.e);e_2)\longmapsto\lbrack e_2/x\rbrack e}\\
\dfrac{}{\text{fix}\lbrace\tau\rbrace(x.e)\longmapsto\lbrack\text{fix}\lbrace\tau\rbrace(x.e)/x\rbrack e}}
$$
Safety Theorem
- If $e:\tau$ and $e\longmapsto e’$, then $e’:\tau$.
- If $e:\tau$, then either $e\text{ val}$ or $\exists e’,e\longmapsto e’$.
Call-by-Name Variant
Written $\Gamma\vdash e_1\equiv e_2:\tau$ is the strongest congruence containing axioms
$$
\displaylines{\dfrac{}{\Gamma\vdash\text{ifz}\lbrace e_0;x.e_1\rbrace(z)\equiv e_0:\tau}\\
\dfrac{}{\Gamma\vdash\text{ifz}\lbrace e_0;x.e_1\rbrace(s(e))\equiv\lbrack e/x\rbrack e_1:\tau}\\
\dfrac{}{\Gamma\vdash\text{fix}\lbrace\tau\rbrace(x.e)\equiv\lbrack\text{fix}\lbrace\tau\rbrace(x.e)/x\rbrack e:\tau}\\
\dfrac{}{\Gamma\vdash\text{ap}(\lambda\lbrace\tau_1\rbrace(x.e_2);e_1)\equiv\lbrack e_1/x\rbrack e_2:\tau}}
$$
These rules suffices to calculate the value of any closed expression of type
nat
:If $e:\text{nat}$ then $e\equiv \overline{n}:\text{nat}\iff e\longmapsto^\ast\overline{n}$.
Exercise for General Recursion
If we want to define mutual recursive functions in terms of general recursion instead of primitive functions, the definition is
$$
\begin{aligned}
e(0)&=1\\
o(0)&=0\\
e(n+1)&=o(n)\\
o(n+1)&=e(n)
\end{aligned}
$$
We need use $\text{ifx}$ and $\text{fix}$ to define function to express the mutually recursive function-duo. First we need to define the type $\tau_{eo}\triangleq\langle\text{even}\hookrightarrow(\text{nat}\to\text{nat})\shortmid\text{odd}\hookrightarrow(\text{nat}\to\text{nat})\rangle$.
According to the $\tau_{eo}$ we can have $e_{eo}\cdot\text{even}$ and $e_{eo}\cdot\text{odd}$ projection respectively.
Thus the expression $e_{eo}$ can have the general form: $\text{fix this}:\tau_{eo}\text{ is }\langle\text{even}\hookrightarrow e_{even}\shortmid\text{odd}\hookrightarrow e_{odd}\rangle$.
- $e_{even}\triangleq\lambda(x:\text{nat})\space\text{ifz }x\lbrace z\to s(z)\shortmid s(y)\to \text{this}\cdot\text{odd(y)} \rbrace$
- $e_{odd}\triangleq\lambda(x:\text{nat})\space\text{ifz }x\lbrace z\to z\shortmid s(y)\to\text{this}\cdot\text{even}(y)\rbrace$
Definability
Write $\text{fun }x\space(y:\tau_1):\tau_2\text{ is }e$ for a recursive function within whose body $e:\tau_2$ bound $y:\tau_1$ for argument and $x:\tau_1\to\tau_2$ as function itself. The dynamics follows
$$
\dfrac{}{(\text{fun }x\space(y:\tau_1):\tau_2\text{ is }e)(e_1)\longmapsto\lbrack\text{fun }x\space(y:\tau_1):\tau_2\text{ is }e,e_1/x,y\rbrack e}
$$
Recursive function in PCF are defined by $\text{fix }x:\tau_1\rightharpoonup\tau_2\text{ is }\lambda(y:\tau_1)\space e$.
For example, we can define the following recursive function to have have the $e’$ for $e’(e)=\text{rec }e\space\lbrace z\hookrightarrow e_0\shortmid s(x)\text{ with }y\hookrightarrow e_1\rbrace$.
$$
\text{fun }f\space(u:\text{nat}):\tau\text{ is }\text{ifz }u\space\lbrace z\hookrightarrow e_0\shortmid s(x)\hookrightarrow\lbrack f(x)/y\rbrack e_1\rbrace
$$
Primitive Recursion Function
Consider the case when some recursive function is definable among its domain, this kind of function is primitive recursive function.
Consider the addition $\text{add}$ operation between 2 natural numbers, then we can have a recursive definition with a base case and a recursive case following
$$
\begin{aligned}
\text{add}(x,0)&=x\\
\text{add}(x,\text{s}(y))&=\text{s}(\text{add}(x,y))
\end{aligned}
$$
In a more formal way, the primitive recursion takes $f:\mathbb{N}^n\rightharpoonup\mathbb{N}$ and $g:\mathbb{N}^{n+2}\rightharpoonup\mathbb{N}$ to define the recursion $h = \rho^n(f,g):\mathbb{N}^{n+1}\rightharpoonup\mathbb{N}$. In those case $f$ stands for base case and $g$ is for recursive case
$$
\begin{aligned}
h(x_1,\dots,x_n,0)&=f(x_1,\dots,x_n)\\
h(x_1,\dots,x_n,\text{s}(y))&=g(x_1,\dots,x_n,y,h(x_1,\dots,x_n,y))
\end{aligned}
$$
In this way the addition here can be viewed as $f:\mathbb{N}\rightharpoonup\mathbb{N}$ and $g:\mathbb{N}^3\rightharpoonup\mathbb{N}$. $g(x,y,\text{add}(x,y))\equiv\text{s}(\text{add}(x,y))$. Thus $\text{add}\triangleq\rho^1(\text{proj}^1_1,\text{succ}\circ\lbrack\text{proj}^3_3\rbrack)$.
Minimization
Minimization takes $f:\mathbb{N}^{n+1}\rightharpoonup\mathbb{N}$ and become $\mu^n f:\mathbb{N}^n\rightharpoonup\mathbb{N}$. We fix parameters $\alpha_1,\dots,\alpha_n$ and varies $x$ in the $f(\alpha_1,\dots,\alpha_n,x)$. The minimization is the smallest $x\equiv\mu^n f$ that
- $\forall x’< x$, $f(\alpha_1,\dots,\alpha_n,x’) \in\mathbb{N}$.
- $f(\alpha_1,\dots\alpha_n,x)=0$.
- If no such $x$ exists, then the $\mu^n f$ is not defined.
For example, consider the integer division that satisfy the least $q$ that $(q+1)b>a$ for $a\text{ div }b$. Introduce 2 functions here
- $\text{isZero}\triangleq\text{sub}\circ\lbrack\text{s}\circ\text{z}^1,\text{proj}^1_1\rbrack$
- $\text{lessThanEqual}\triangleq\text{isZero}\circ\text{sub}$
Then the $f:\mathbb{N}^3\rightharpoonup\mathbb{N}$ where $f$ has the definition as
$$
f(a,b,q)=\begin{cases}
1&(q+1)\cdot b\le a\\
0&(q+1)\cdot b> a
\end{cases}
$$
Thus the $f(a,b,q)\triangleq\text{lessThanEqual}\circ\lbrack\text{mult}\circ\lbrack\text{s}\circ\text{proj}^3_3,\text{proj}^3_3\rbrack,\text{proj}^3_1\rbrack$ and the $\mu^2 f:\mathbb{N}^2\rightharpoonup\mathbb{N}\triangleq \text{div}$.
In this way, we can see that minimization $\mu^2 f(1919,0)$ is undefined.
Partial Recursive Function
Partial recursive functions are defined to be primitive recursive functions extended with minimization operation. Partial function $\phi$ on natural numbers is definable in PCF $\iff$ it is partial recursive.
First we prove that minimization on natural number is definable:
Using the general fixed point operator, we iteratively test $\phi(m,n)$ on successive values of $m$, starting from 0 until $\phi(m,n)=0$. If computation result diverges, then such $m$ does not exist.
Minimization is definable in PCF, so it is at least as powerful as a set of partial recursive functions.
Conversely, we may define an evaluator for expressions of PCF as a partial recursive function, using Godel-numbering. Therefore PCF does not exceed the power of partial recursive functions.
If we weaken the minimization that $\mu^n f$ is the first $x$ makes $f(\alpha_1,\dots,\alpha_n,x)\equiv0$, and it is undefined if no such $x$ exists, if it can be still applied to previous partial recursive functions.
The difficulty is that $\mu^n f$ might be undefined for certain value $m$ prior to the $x$. The search process described might be violated.
Church’s Law
Church’s Law states that partial recursive functions coincide with the set of effectively computable functions on natural numbers. Therefore PCF is as powerful as any other language wrt the set of definable functions on natural numbers.
The universal function $\phi_{univ}$ for PCF is partial function on natural number $\phi_{univ}(\ulcorner e\urcorner)(m)=m\iff e(\overline{m})\equiv \overline{n}:\text{nat}$.
In contrast to T, the universal function is partial. Since the simulating process might fail to terminate, then the universal function is not defined for all inputs.
By Church’s Law, the universal function is definable in PCF.
- We need to define a notion of Godel-numbering that respects $\alpha$-equivalence.
- Provide an operation to build Godel number of expressions from their Godel number components and vise versa.
- Define the universal function with these primitives using a general recursion to construct the universal function.
Such function is called metacircular interpreter since it defines the interpretation of the constructs of a language in terms of those constructs themselves.
Exercise on Partial Function Halts
We might wonder that if partial function $\phi_{halts}$ such that for $e:\text{nat}\rightharpoonup\text{nat}$ then $\phi_{halts}(\ulcorner e\urcorner)$ evaluates to zero iff $e(\ulcorner e\urcorner)$ converges, and evaluates to one otherwise.
We can construct an expression $e_{diag}\triangleq\lambda(n:\text{nat})\space\text{ifz }e_{halts}(n)\lbrace z\hookrightarrow e_{diverge}\shortmid s(\underline{})\hookrightarrow z\rbrace$. Then we can see that $\phi_{halts}(\ulcorner e_{diag}\urcorner)$ can only have zero or one.
- If result is zero, then $e_{diag}(\ulcorner e_{diag}\urcorner)$ converges. By $e_{halts}(\ulcorner e_{diag}\urcorner)\equiv s(\underline{})$, then it goes contradiction.
- If result is one, then also goes contradiction as previous situation.
It seems that $\phi_{halts}$ is not definable in PCF.
Consider the previously weakened minimization. If it follows such definition and is definable, then it can solve the halting problem here.
Consider the function $\phi_{step}(\ulcorner e\urcorner,m)$ which converges iff $e(\ulcorner e\urcorner)$ converges in fewer than $m$ steps, and diverges otherwise, which follows the form of previously discussed partial recursive functions.
Finite and Infinite Data Structures
Finite data types (products or sums), including their use in pattern matching and generic programming carry over verbatim to PCF. However, the distinction between eager and lazy dynamics for these constructs become very important. Different dynamics might lead to different meaning of programs.
In eager dynamics, prod type elements are tuples of values of component types, while in lazy dynamics, they might be unevaluated, possibly divergent, computations of component types.
In infinite types like
nat
for natural numbers, in eager dynamics, is the least type containing zero and closed under successor.On the other hand,
nat
under lazy dynamics might contain value $\omega\triangleq\text{fix }x:\text{nat is }s(x)$, which has itself as predecessor, which is clearly notnat
since it is an “infinite stack of successor”.
Parallel-Or Function on Lazy PCF
In a lazy variant of PSF, a parallel-or function which enjoys the properties
$$
\begin{aligned}
e(e_1)(e_2)&\longmapsto^\ast z\text{ if }e_1\longmapsto^\ast z\\
e(e_1)(e_2)&\longmapsto^\ast z\text{ if }e_2\longmapsto^\ast z\\
e(e_1)(e_2)&\longmapsto^\ast s(z)\text{ otherwise}
\end{aligned}
$$
Parallel-or function is not definable in PCF, yet computable. Main problem is that dynamics is sequential that accepts arguments one at a time, thus cannot satisfy $\geq 1$ diverge argument.
A solution is to enrich PCF with a parallelism that interleaves the evaluation of 2 expressions, returning result of the first to diverge.
Totality and Partiality
T ensures every type checked program terminates, every function is total. But upper bound of time to terminate might be large. Though we consider it virtue, it fades away when using T to write programs.
Consider writing gcd with $(\text{nat}\times\text{nat})\rightharpoonup\text{nat}$, which suggests it may not terminate for some input, but we can prove by induction on the sum of pair arguments that it is a total function.
One way to see the problem is that in T the only form of looping is reducing a natural number by one on each recursive call. One may write more general patterns of terminating recursion with only primitive recursion, but the price is higher complexity and not necessary runtime.
We may have a practical language ruling out divergence. According to Blum Size Theorem (BST), fix any total language $\mathcal{L}$ that permits writing functions on natural numbers, pick any blow-up factor, say $2^{2^n}$. BST states that there’s a total function on natural numbers that is programmable in $\mathcal{L}$, but the shortest program in $\mathcal{L}$ is larger by a blow-up factor than its shortest program in PCF.
The underlying idea is that in a total language, proof of termination of a program is baked into the code itself, while in partial language the termination proof is an external verification.
If you leave room for ingenuity, programs can be short.