CS6620 Compiler Note 1 Lattice Basics
This part will take notes about Lattice Theory.
Appetizer: Sign analysis can be done by first construct a lattice with elements $\lbrace +,-,0,\top,\bot\rbrace$ with each parts’ meaning:
- $+, -, 0$ stand for integer value signs
- $\top$ stands for any integer values while $\bot$ means empty set of integer values.
$$
\begin{array}{ccccc}
& & \top& & \newline
& \swarrow & \downarrow & \searrow \newline
& + & 0 &- \newline
& \searrow & \downarrow & \swarrow \newline
& &\bot
\end{array}
$$
Lattice Basics
Partial order on a set $S$ is defined by a binary relation $\sqsubseteq$ satisfying
- reflexivity: $\forall x \in S: x \sqsubseteq x$
- transitivity: $\forall x,y,z\in S: x\sqsubseteq y \wedge y\sqsubseteq z \Rightarrow x \sqsubseteq z$
- anti-symmetry: $\forall x,y\in S: x \sqsubseteq y \wedge y \sqsubseteq x\Rightarrow x =y$
A lattice is pair $(S,\sqsubseteq)$. The lower an element in a lattice network, or so to say $x \sqsubseteq y$ means $x$ is at least as precise as $y$ in $S$.
Upper Bound and Lower Bound
$X \subseteq S$ with $y \in S$ is upper bound of $X$ donated $X \sqsubseteq y$ if $\forall x \in X: x \sqsubseteq y$. A least upper bound is written $\bigsqcup X$ defined by $X \sqsubseteq \bigsqcup X\wedge \forall y \in S:X\sqsubseteq y \Rightarrow \bigsqcup X \sqsubseteq y$.
$y$ is a lower bound of $X$ donated $y \sqsubseteq X$ if $\forall x \in X: y \sqsubseteq x$. A greatest lower bound is written $\sqcap X$ defined by $\sqcap X\sqsubseteq X\wedge \forall y \in S : y \sqsubseteq X\Rightarrow y \sqsubseteq \sqcap X$.
Sometimes we donate $x \bigsqcup y$ and $\bigsqcup_{a \in A} f(a)$ to say find the least upper bound unions.
$\exists x \bigsqcup y \Rightarrow (x \sqsubseteq y \Leftrightarrow x \bigsqcup y = y)$, $\exists x \sqcap y \Rightarrow (x \sqsubseteq y \Leftrightarrow x \sqcap y = x)$.
The height of a lattice is defined to be the length path from $\bot$ to $\top$.
Lattice Constructing
Every finite set $A = \lbrace a_1,\dots, a_n\rbrace$ defines a lattice $(2^A, \subseteq)$ where $\bot = \varnothing, \top = A, x \bigsqcup y =x \bigcup y, x \sqcap y = x \cap y$. Such lattice has a height $|A|$.
A flat lattice with $flat(A)$ is shown by
$$
\begin{array}{cccccc}
& & \top\newline
& \swarrow & & \searrow\newline
a_1 & a_2 & a_3 & \dots & a_n\newline
& \searrow & & \swarrow \newline
& & \bot
\end{array}
$$
Product of lattices, where $\prod_{l\in L_1,\dots,L_n} =\lbrace (x_1, x_2, \dots, x_n) | x_i \in L_i\rbrace$ where the $\sqsubseteq$ is defined point-wise like $(x_1, x_2,\dots,x_n)\sqsubseteq (x_1’, x_2’,\dots,x_n’)\Leftrightarrow \forall i=1,\dots,n:x_i\sqsubseteq x_i’$.
In this way the $\sqcap$ and $\sqcup$ operations are also done point-wise. The height is $\sum |L_i|$.
Map lattice is defined by $A \to L = \lbrace \lbrack a_1 \mapsto x_1, a_2 \mapsto x_2,\dots\rbrack | A = \lbrace a_1,a_2,\dots\rbrace\wedge x_1, x_2,\dots \in L\rbrace$. The $f \sqsubseteq g \Leftrightarrow \forall a_i \in A: f(a_i)\sqsubseteq g(a_i)$. Height is $|A| \cdot height(L)$.
For sign analysis constraint revisit, we can donate $\left[\!\left[ v\right]\!\right]$ as a map for the sign value for all variables at the program after node $v$. Donate $\text{JOIN}(v) = \bigsqcup_{w\in pred(v)}\left[\!\left[ w\right] \!\right]$. Thus we have
- $\left[\!\left[\text{var }x_1,\dots,x_n\right]\!\right] = \text{JOIN}(v)\left[\!\left[x_1\mapsto \top,\dots,x_n\mapsto\top\right]\!\right]$
- $\left[\!\left[ x = E\right]\!\right] = \text{JOIN}(v) \lbrack x \mapsto \text{eval}(\text{JOIN}(v), E) \rbrack$
- $\left[\!\left[v\right]\!\right] = \text{JOIN}(v)$ for all other nodes
Lift lattice means lift a lattice with one more $\bot$ sign. Height plus 1.
Monotonicity and Fixed-Points
Monotone function $f:L\to L$ is defined with $\forall x,y\in L: x \sqsubseteq y \Rightarrow f(x) \sqsubseteq f(y)$.
Multiple arguments function is monotone if it is monotone in each argument. Monotonicity is closed under composition, where composition has notation $f \circ g$ and $f \circ \dots \circ f = f^i$.
Kleene’s fixed-point theorem: In a lattice with finite height, every monotone function $f$ has a unique least fixed-point $fix(f) = \bigsqcup_{i \ge 0} f^i(\bot)$.
Existence is trivial.
For uniqueness, suppose $x$ is another fixed-point, then $f^i(\bot) \sqsubseteq f^i(x) = x$. By $fix(f) = f^k(\bot) \sqsubseteq f^k(x) = x$, thus $fix(f)$ is the least. By anti-symmetric, $x = fix(f)$ and uniqueness is proved.
An equation system over $L$ lattice with form $\begin{cases} x_1 &=&f_1(x_1,\dots,x_n) \newline x_2 &=&f_2(x_1,\dots,x_n) \newline \vdots \newline x_n &=&f_n(x_1,\dots,x_n)\end{cases}$ where $f_i: L^n \to L$. A solution to such equation system provides $L^n$ value that satisfies all equations.
Rewrite functions in one with $f:L^n\to L^n, f(x_1,\dots,x_n) = (f_1(x_1,\dots,x_n),\dots,f_n(x_1,\dots,x_n))$, which looks like $x = f(x)$ with $x \in L^n$.
The solving process of such equation system can be done like:
- $\begin{cases} x_1 &\sqsubseteq &f_1(x_1,\dots,x_n) \newline \vdots \newline x_n &\sqsubseteq &f_n(x_1,\dots,x_n)\end{cases}$ has predicate of fixed-point existence, thus solution satisfies $x_i = x_i \sqcap f_i(x_1,\dots,x_n)$. Rewrite into form $\begin{cases} x_1 &=& x_1 \sqcap f_1(x_1,\dots,x_n) \newline \vdots \newline x_n &= &x_n \sqcap f_n(x_1,\dots,x_n)\end{cases}$.
- $\begin{cases} x_1 &\sqsupseteq &f_1(x_1,\dots,x_n) \newline \vdots \newline x_n &\sqsupseteq &f_n(x_1,\dots,x_n)\end{cases}$ can be rewritten into $\begin{cases} x_1 &= &x_1\bigsqcup f_1(x_1,\dots,x_n) \newline \vdots \newline x_n &= & x_n \bigsqcup f_n(x_1,\dots,x_n)\end{cases}$.
The lattice points as answers can be shown below. The most trivial answer is on top and undefined/null is at bottom.
graph TD; subgraph Safe Answers A1["Trivial Useless Answer (Top)"] --- A2["Our Answer (Least Fixpoint)"] A2 .-> A3["The true answer"] A2 --- A4["."] end subgraph Unsafe Answers A4 --- A5["."] A5 --- A6["Bottom"] end
Monotone Framework
We want to extract all constraints (like sign analysis) from a CFG. For each node $v$ in CFG we can assign a constraint variable $\lbrack \! \lbrack v \rbrack\!\rbrack$ ranging over elements of lattice.
If all constraints for the given program happen to be equations or inequations with monotone RHS, then fixed-point algorithms come into place to find unique least solution.
The combination of lattice and a space of monotone functions is called monotone framework.
For naive/round-robin fixed-point algorithm, an improved version called work-list algorithm can be used here from chaotic iteration.
An map $\text{dep}:\text{Nodes}\to 2^\text{Nodes}$ is used for $v$ in CFG to tell us the subset of other node where $\lbrack \! \lbrack v \rbrack\!\rbrack$ occurs in a nontrivial way on RHS of dataflow equation.
Thus work-list algorithm is shown as
$$
\begin{aligned}
&x_1\leftarrow \bot; \dots; x_n\leftarrow \bot;\newline
&W \leftarrow \lbrace v_1,\dots,v_n\rbrace;\newline
&\text{while }(W \neq \varnothing)\newline
&\phantom{“”””}v_i\leftarrow W.\text{removeNext}()\newline
&\phantom{“”””}y\leftarrow f_i(x_1,\dots,x_n)\newline
&\phantom{“”””}\text{if }y \neq x_i\newline
&\phantom{“”””””””}x_i \leftarrow y\newline
&\phantom{“”””””””}\text{for }v_j \in dep(v_i)\newline
&\phantom{“”””””””””””}W.\text{add}(v_j)\newline
&\phantom{“”””””””}\text{end for}\newline
&\phantom{“”””}\text{end if}\newline
&\text{end while}\newline
&\text{return }(x_1,\dots,x_n)
\end{aligned}
$$
The complexity of the algorithm is explained in next chapter.