PFPL Chapter 1 Abstract Syntax Note

Posted on 2018-02-20 In PFPL Note Disqus:

Please refer this link from page 3 to 11.

Informal Concept of Syntax

Surface, Concrete Syntax

How phrase are entered and displayed on computer.

Usually thought of as given by strings of characters from some alphabet.

Structured, Abstract Syntax

The structure of phrases, specifically how they are composed from other phrases.

Finite Tree Structure of Syntax

We consider pieces of syntax to be finite trees augmented with a means of expressing binding and scope of identifiers within syntax tree.

We define “a piece of syntax“ in 2 stages:

The definition of abstract syntax tree.
Augment abstract syntax tree with means of specifying:
- binding (declaration)
- scope (range of significance)
of an identifier.

Abstract Syntax Tree (AST)

An ordered tree whose:

leaves are variables
interior nodes are operators whose arguments are its children.

Sorts

AST’s are classified into a variety of sorts corresponding to different forms of syntax.

Variable

A variable stands for an unspecified, or generic, piece of syntax of a specified sort.

Operator

Operator has an arity specifying:

The sort of the operator.
The number and sorts of its arguments.

e.g. $<s_1,\dots, s_n>s$ as an arity for the operator.

Structural Induction

A method when we want to prove some property $P(a)$ holds for all $a$ AST of a given sort.

Consider all the methods in which AST $a$ of this sort can be generated.
Assume $P(t)$ holds for all of $a$’s constituent AST’s.
We prove that the $P(a)$ holds in each case under the previous assumption.

AST Precise Definitions

$S$ is a finite set of sorts.
An arity has the form $(s_1,\dots,s_n)s$, with $\forall s, s_i \in S$ and $n \geq 0$.
$O =\lbrace O_\alpha \rbrace$ is an arity-indexed family of disjoint sets of operators $O_\alpha$ of arity $\alpha$.
$X = \lbrace X_s\rbrace_{s\in S}$ is a sort-indexed family of disjoint finite sets $X_s$ of variables $x$ of sort $s$.
- If $x \in X_s$, we say $x$ is of sort $s$.
- If $x \notin X_s$ and $x$ is of sort $s$, then we say $x$ is fresh for $X$.
- If $x$ is fresh for $X$ and $s$ is a sort, then $X, x$ is a family of sets of variables by adding $x$ to $X_s$.
$A[X]$ is a family of abstract syntax trees with the form of $A[X] = \lbrace A[X] \rbrace_{s\in S}$ with each sort $s$ of $A[X]_s$ is the smallest family satisfying
- If $x \in X_s$ then $x \in A[X]_s$ $\Rightarrow$ A variable of sort $s$ is an AST of sort $s$.
- If $o$ is of arity $(s_1, \dots, s_n)s$ and if $a_i \in A[X]_{s_i}$, then $o(a_1, \dots, a_n) \in A[X]_s$ $\Rightarrow$ Operators combines AST’s.
Structural Induction: to show $P(a)$ holds $\forall a \in A[X]$
- If $x \in X_s$, then $P_s(x)$.
- If $o$ is of arity $(s_1, \dots, s_n)s$ and $P_{s_i}(a_i)$, then $P_s(o(a_1, \dots, a_n))$.
Substitution gives variables meanings.

If $a \in A[X, x]_{s’}$ and $b \in A[X]_s$, then $\lbrack b / x \rbrack a \in A \lbrack X \rbrack _{s’}$.
- $\lbrack b / x \rbrack x = b$ and $\lbrack b / x \rbrack y = y$ if $x \neq y$.
- $\lbrack b / x \rbrack o(a_1, \dots, a_n) = o(\lbrack b / x\rbrack a_1, \dots, \lbrack b / x\rbrack a_n)$.
Theorem: $a \in A\lbrack X, x\rbrack$, then $\forall b \in A\lbrack X \rbrack$, there exists unique $c \in A[X]$ such that $\lbrack b / x \rbrack a = c$.

Abstract Binding Tree (ABT)

Binding and Scope

Binding is a method to introduce new variables and symbols to enrich ast’s to abt’s.

Binding has specified range of significance, called scope.

The scope of a binding is an abt within which the bound identifier can be used.

Example: $let$ $x$ $be$ $a_1$ $in$ $a_2$.

The variable $x$ is introduced within expression $a_2$ to stand for $a_1$.

$x$ is bounded by the $let$ expression within $a_2$.

Any use of $x$ within $a_1$ refers to a different variable that happens to have the same time.

Names of Bound Variables

The name of bound variables are not so important to the extent that they determine the same binding.

Abstractor

ABT’s generalize AST’s by allowing an operator to bind any finite number of variables (including zero) in each argument.

Argument to an operator is called an abstractor with the form $x_1, \dots, x_k.a$.

The sequence of variables $x_1, \dots, x_k$ are bound within the abt $a$.

Example: $let$ $x$ $be$ $a_1$ $in$ $a_2$ can be written in the form of $let(a_1; x.a_2)$.

Some Shorthand Notations

$\vec{x}$ often stands for $x_1,\dots,x_n$ and $\vec{x}.a$ for $x_1, \dots, x_k.a$.

Generalized Arity and Valence

Operators are assigned generalized arities to account for binding with the form $(v_1, \dots, v_n)s$.

It specifies operator of sort $s$ with $n$ arguments of valence $v_1, \dots, v_n$.

In general, a valence has form $s_1, \dots, s_k .s$ specifies the sort of an argument as well the number and sorts of the variables bound within it.

A sequence $\vec{x}$ of variables is of sort $\vec{s}$.

Example: $let$ is of arity (Exp, Exp.Exp) Exp.

It has 2 arguments:

The first one is an Exp.

The second one is an abstractor that binds one Exp.

ABT Precise Definitions

Fix a set $S$ of sorts.
$O =\lbrace O_\alpha \rbrace$ is an arity-indexed family of disjoint sets of operators $O_\alpha$ of generalized arity $\alpha$.
For a given family of disjoint sets of variables $X$, abt’s $B[X]$’s definition is similar, except for the $X$ is not fixed since it may change as we enter scopes of abstractors.
If $x \in X_s$, $x \in B\lbrack X\rbrack_s$.
$\forall o$ with arity $(\vec{s_1}.s_1, \dots, \vec{s_n}.s_n)s$, if $\forall 1 \leq i \leq n$ $a_i \in B[X, \vec{x_i}]_{s_i}$, then $o(\vec{x_1}.a_1, \dots, \vec{x_n}.a_n) \in B[X]_s$.

Some Flaws and Counter Examples

Check this: $let(a_1; x.let(a_2; x.a_3))$

It is ill-formed since the first binding already add $x$ into $X$.

Then the second binding cannot add $x$ into $X, x$ since it is not fresh for $X, x$.

The flaw comes to renaming of bound variables.

Solution: Fresh Renaming

Ensure that each of the arguments is well-formed regardless of the choice of the bound variable names.

Fresh renaming (with related to $X$) of a finite sequence of variables $\vec{x}$ is a bijection $\rho: \vec{x} \leftrightarrow \vec{x}’$ between $\vec{x}$ and $\vec{x}’$ where $\vec{x}’$ is fresh for $X$.

Notation: $\hat{\rho}(a)$ for the result for replacing each occurrence of $x_i$ in $a$ by $\rho(x_i)$, its fresh counterpart.

Some Minor Changes

$\forall o$ with arity $(\vec{s_1}.s_1, \dots, \vec{s_n}.s_n)s$, if $\forall 1 \leq i \leq n$ and each fresh renaming $\rho_i: \vec{x_i}\leftrightarrow \vec{x_i}’$, we have $\hat{\rho_i}(a_i) \in B[X,\vec{x_i}’]$, then $o(\vec{x_1}.a_1, \dots, \vec{x_n}.a_n) \in B[X]_s$.

Structural Induction Modulo Fresh Renaming

If we want to show that $P\lbrack X\rbrack (a)$ holds for every $a \in B[X]$, it is necessary to show:

If $x \in X$, then $P[X]_s(x)$.
$\forall o$ of arity $(\vec{s_1}.s_1,\dots,\vec{s_n}.s_n)s$, if $\forall 1 \le i \le n$, $P[X, \vec{x_i}’]_{s_i}(\hat{\rho_i}(a_i))$ holds for every $\hat{\rho_i}:\vec{x_i} \leftrightarrow \vec{x_i}’$ with $\vec{x_i}’\notin X$, then $P[X]_s(o(\vec{x_1}.a_1, \dots, \vec{x_n}.a_n))$.

Example: $x \in a$ where $a \in B [X, x]$, which means $x$ occurs free in $a$.

Informally, this means $x$ is bound outside of $a$.

So we should apply the structural induction modulo fresh renaming:

$x \in x$

$x \in o(\vec{x_1}.a_1, \dots, \vec{x_n}.a_n)$ if $\forall 1 \le i \le n$ such that for every $\rho:\vec{x_i}\leftrightarrow\vec{z_i}$ we have $x \in\hat{\rho}(a_i)$.

$\alpha$-Equivalence and Substitution

The relation $a =_\alpha b$ of $\alpha$-equivalence means $a$ and $b$ are identical up to the choice of bound variable names.

$x = _\alpha x$
$o(\vec{x_1}.a_1,\dots,\vec{x_n}.a_n) = _\alpha o(\vec{x_1}’.a_1’,\dots,\vec{x_n}’.a_n’)$ if $\forall 1 \le i \le n, \hat{\rho_i}(a_i) =_\alpha \hat{\rho_i’}(a_i’)$, $\forall \rho_i:\vec{x_i}\leftrightarrow\vec{z_i}$ and $\forall \rho_i’:\vec{x_i}’\leftrightarrow\vec{z_i}$.

Substitution is defined carefully with the following:

$\lbrack b / x \rbrack x = b$ and $\lbrack b / x \rbrack y = y$ if $x \ne y$.
$\lbrack b / x \rbrack o (\vec{x_1}.a_1,\dots, \vec{x_n}.a_n) = o(\vec{x_1}.a_1’, \dots, \vec{x_n}.a_n’)$ where $\forall 1 \le i \le n, \vec{x_i} \notin b$ and:
- $a_i’ = \lbrack b / x \rbrack a_i$ if $x \notin \vec{x_i}$
- $a_i’ = a_i$ if $x \in \vec{x_i}$

Example to explain:

$\lbrack b / x \rbrack let(a_1;x.a_2) = let(\lbrack b / x \rbrack a_1; x.a_2)$

If $x$ is bound by an abstractor within $a$, then $x$ doesn’t occur free in $a$, then it cannot be changed by substitution.

In this example, $x$ is not occur free in $x.a_2$.

If $y \in b$ and $x \ne y$, then $\lbrack b / x \rbrack let(a_1; y.a_2)$ is undefined.

More specifically, $\lbrack b / x \rbrack let(num[0], y.plus(x;y))$ is undefined.

Since you are trying to let an unbounded variance be substituted and confuses two different variables called $y$.

This is called capture of a free variable during substitution.

Some little note about capture avoidance:

Although it is an essential characteristic of substitution, it is just merely a technical nuisance.

If the names of bound variables have no significance, then it can be avoided by first renaming the bound variables in $a$ to avoid the free variable names in $b$.

Example: $\lbrack b / x \rbrack let(num[0], y.plus(x;y))$ with $y \in b$ and $x \ne y$.

We can set $a’ \triangleq let(num[0], y’.plus(x;y’))$ then $\lbrack b / x \rbrack a’$ is defined.

Identification Convention in ABT

ABT’s are always identified up to $\alpha$-equivalence.

Explain: Substitution can be extended to $\alpha$-equivalence classes of abt’s:

$\rightarrow$ Choosing equivalence classes of $b$ and $a$ in the way of substitution is defined.

$\rightarrow$ Then we get the $\alpha$-equivalence results.

So substitution becomes a well-defined total function.

Operators and Symbols

It is necessary to consider language whose abstract syntax cannot be specified by a fixed set of operators, but rather requires the available operators be sensitive to the context in which they occur.

We introduce a set of symbols that index families of operators.

An indexed operator $o$ is a family of operators indexed by symbols $u$, so $o\lbrack u\rbrack$ is an operator when $u$ is available symbol.

If $U$ is a finite set of symbols, then $B[U; X]$ is the sorted-index family of abt’s generated by operators and variables, admitting all index operator instances by symbols $u \in U$.

The only significance of a symbol is whether it is the same as or differs from another symbol, it doesn’t stand for anything.

Introducing New Symbols

The set of symbols is extended by introducing a new or fresh symbol within a scope using the abstractor $u.a$, binds the symbol $u$ within abt $a$.

The name of the bound symbol can be varied at will, provided that no conflicts arise.

Symbol and Variable Difference

The only operation on symbols is renaming, there’s no notion of substitution for a symbol.

Some Notation For Readability

We often group and stage the arguments to an operator:

using round brackets and braces to show grouping
regarding stages to progress from left to right
All arguments in a group are considered to occur at the same stage.
the successive groups are considered to occur in sequential stages.

Staging and grouping are helpful but not fundamental significant:

$o\lbrace a_1;a_2 \rbrace (a_3;x.a_4)$ is the same as $o(a_1; a_2; a_3;x.a_4)$, as would be any other order-preserving grouping or staging of its arguments.

Exercises For Review

1.1

Prove by structural induction on ast that $X \subseteq Y$, then $A\lbrack X \rbrack \subseteq A\lbrack Y \rbrack$.

$\forall v \in X$, $v \in Y$, thus $\forall v \in A\lbrack X\rbrack$, $v \in A\lbrack Y \rbrack$.
$\forall v_i \in X$, $v_i \in Y$, thus $\forall o(v_1, \dots, v_n) \in A \lbrack X\rbrack$, $\forall o(v_1, \dots, v_n) \in A \lbrack Y\rbrack$.

1.2

Prove by structural induction on abt that $X \subseteq Y$, then $B\lbrack X \rbrack \subseteq B\lbrack Y \rbrack$.

First part is the same as ast’s. Jump to second part.

If $\vec{v}.a \in B\lbrack X\rbrack$, then we need to show that $\vec{v}.a\in B\lbrack Y\rbrack$.

Then $\rho:\vec{v} \leftrightarrow \vec{v}’$, we get $\vec{v} \in X$ but $\vec{v}’ \notin X$.

Thus $\hat{\rho}(a) \in B[X, \vec{v}’]$ and it leads to $\hat{\rho}(a) \in B[Y, \vec{v}’]$.

1.3

Provided that:

$a = _\alpha a’$
$b =_\alpha b’$

Show that:

$\lbrack b / x \rbrack a$ and $\lbrack b’/x\rbrack a’$ are defined, then $\lbrack b / x \rbrack a =_\alpha \lbrack b’/x\rbrack a’$.

WLOG, we say $\lbrack b / x \rbrack a = _\alpha \lbrack b / x \rbrack a’$ by identification convention of abt.

Then since $b =_\alpha b’$, we can substitute / rename $b$ into $b’$ on RHS. Thus we get the result.

1.4

Problem is toooooooooooo long, check the link.

We set a bound variable occurrence represented by $bv[i]$, $i \in \mathbb{Z}^+$ for $i^{th}$ enclosing abstractor.

An abstractor for abg has a notation $.g$ with $”.”$ for introduction of an (unnamed) bound variable.

Index abg‘s with $n$ enclosing abstractors by notation $G\lbrack X\rbrack_n$.

If $g \in G\lbrack X\rbrack_n$ then $.g\in G\lbrack X\rbrack_{n+1}$.

Also $\forall 1 \le i \le n$, $bv\lbrack i \rbrack \in G\lbrack X\rbrack _n$.