CS6620 Compiler Note 1 Lattice Basics

This part will take notes about Lattice Theory.

Appetizer: Sign analysis can be done by first construct a lattice with elements $\lbrace +,-,0,\top,\bot\rbrace$ with each parts’ meaning:

  • $+, -, 0$ stand for integer value signs
  • $\top$ stands for any integer values while $\bot$ means empty set of integer values.

$$
\begin{array}{ccccc}
& & \top& & \newline
& \swarrow & \downarrow & \searrow \newline
& + & 0 &- \newline
& \searrow & \downarrow & \swarrow \newline
& &\bot
\end{array}
$$

Lattice Basics

Partial order on a set $S$ is defined by a binary relation $\sqsubseteq$ satisfying

  • reflexivity: $\forall x \in S: x \sqsubseteq x$
  • transitivity: $\forall x,y,z\in S: x\sqsubseteq y \wedge y\sqsubseteq z \Rightarrow x \sqsubseteq z$
  • anti-symmetry: $\forall x,y\in S: x \sqsubseteq y \wedge y \sqsubseteq x\Rightarrow x =y$

A lattice is pair $(S,\sqsubseteq)$. The lower an element in a lattice network, or so to say $x \sqsubseteq y$ means $x$ is at least as precise as $y$ in $S$.

Upper Bound and Lower Bound

$X \subseteq S$ with $y \in S$ is upper bound of $X$ donated $X \sqsubseteq y$ if $\forall x \in X: x \sqsubseteq y$. A least upper bound is written $\bigsqcup X$ defined by $X \sqsubseteq \bigsqcup X\wedge \forall y \in S:X\sqsubseteq y \Rightarrow \bigsqcup X \sqsubseteq y$.

$y$ is a lower bound of $X$ donated $y \sqsubseteq X$ if $\forall x \in X: y \sqsubseteq x$. A greatest lower bound is written $\sqcap X$ defined by $\sqcap X\sqsubseteq X\wedge \forall y \in S : y \sqsubseteq X\Rightarrow y \sqsubseteq \sqcap X$.

Sometimes we donate $x \bigsqcup y$ and $\bigsqcup_{a \in A} f(a)$ to say find the least upper bound unions.

$\exists x \bigsqcup y \Rightarrow (x \sqsubseteq y \Leftrightarrow x \bigsqcup y = y)$, $\exists x \sqcap y \Rightarrow (x \sqsubseteq y \Leftrightarrow x \sqcap y = x)$.

The height of a lattice is defined to be the length path from $\bot$ to $\top$.

Lattice Constructing

Every finite set $A = \lbrace a_1,\dots, a_n\rbrace$ defines a lattice $(2^A, \subseteq)$ where $\bot = \varnothing, \top = A, x \bigsqcup y =x \bigcup y, x \sqcap y = x \cap y$. Such lattice has a height $|A|$.

A flat lattice with $flat(A)$ is shown by
$$
\begin{array}{cccccc}
& & \top\newline
& \swarrow & & \searrow\newline
a_1 & a_2 & a_3 & \dots & a_n\newline
& \searrow & & \swarrow \newline
& & \bot
\end{array}
$$

Product of lattices, where $\prod_{l\in L_1,\dots,L_n} =\lbrace (x_1, x_2, \dots, x_n) | x_i \in L_i\rbrace$ where the $\sqsubseteq$ is defined point-wise like $(x_1, x_2,\dots,x_n)\sqsubseteq (x_1’, x_2’,\dots,x_n’)\Leftrightarrow \forall i=1,\dots,n:x_i\sqsubseteq x_i’$.

In this way the $\sqcap$ and $\sqcup$ operations are also done point-wise. The height is $\sum |L_i|$.

Map lattice is defined by $A \to L = \lbrace \lbrack a_1 \mapsto x_1, a_2 \mapsto x_2,\dots\rbrack | A = \lbrace a_1,a_2,\dots\rbrace\wedge x_1, x_2,\dots \in L\rbrace$. The $f \sqsubseteq g \Leftrightarrow \forall a_i \in A: f(a_i)\sqsubseteq g(a_i)$. Height is $|A| \cdot height(L)$.

For sign analysis constraint revisit, we can donate $\left[\!\left[ v\right]\!\right]$ as a map for the sign value for all variables at the program after node $v$. Donate $\text{JOIN}(v) = \bigsqcup_{w\in pred(v)}\left[\!\left[ w\right] \!\right]$. Thus we have

  • $\left[\!\left[\text{var }x_1,\dots,x_n\right]\!\right] = \text{JOIN}(v)\left[\!\left[x_1\mapsto \top,\dots,x_n\mapsto\top\right]\!\right]$
  • $\left[\!\left[ x = E\right]\!\right] = \text{JOIN}(v) \lbrack x \mapsto \text{eval}(\text{JOIN}(v), E) \rbrack$
  • $\left[\!\left[v\right]\!\right] = \text{JOIN}(v)$ for all other nodes

Lift lattice means lift a lattice with one more $\bot$ sign. Height plus 1.

Monotonicity and Fixed-Points

Monotone function $f:L\to L$ is defined with $\forall x,y\in L: x \sqsubseteq y \Rightarrow f(x) \sqsubseteq f(y)$.

Multiple arguments function is monotone if it is monotone in each argument. Monotonicity is closed under composition, where composition has notation $f \circ g$ and $f \circ \dots \circ f = f^i$.

Kleene’s fixed-point theorem: In a lattice with finite height, every monotone function $f$ has a unique least fixed-point $fix(f) = \bigsqcup_{i \ge 0} f^i(\bot)$.

Existence is trivial.

For uniqueness, suppose $x$ is another fixed-point, then $f^i(\bot) \sqsubseteq f^i(x) = x$. By $fix(f) = f^k(\bot) \sqsubseteq f^k(x) = x$, thus $fix(f)$ is the least. By anti-symmetric, $x = fix(f)$ and uniqueness is proved.

An equation system over $L$ lattice with form $\begin{cases} x_1 &=&f_1(x_1,\dots,x_n) \newline x_2 &=&f_2(x_1,\dots,x_n) \newline \vdots \newline x_n &=&f_n(x_1,\dots,x_n)\end{cases}$ where $f_i: L^n \to L$. A solution to such equation system provides $L^n$ value that satisfies all equations.

Rewrite functions in one with $f:L^n\to L^n, f(x_1,\dots,x_n) = (f_1(x_1,\dots,x_n),\dots,f_n(x_1,\dots,x_n))$, which looks like $x = f(x)$ with $x \in L^n$.

The solving process of such equation system can be done like:

  • $\begin{cases} x_1 &\sqsubseteq &f_1(x_1,\dots,x_n) \newline \vdots \newline x_n &\sqsubseteq &f_n(x_1,\dots,x_n)\end{cases}$ has predicate of fixed-point existence, thus solution satisfies $x_i = x_i \sqcap f_i(x_1,\dots,x_n)$. Rewrite into form $\begin{cases} x_1 &=& x_1 \sqcap f_1(x_1,\dots,x_n) \newline \vdots \newline x_n &= &x_n \sqcap f_n(x_1,\dots,x_n)\end{cases}$.
  • $\begin{cases} x_1 &\sqsupseteq &f_1(x_1,\dots,x_n) \newline \vdots \newline x_n &\sqsupseteq &f_n(x_1,\dots,x_n)\end{cases}$ can be rewritten into $\begin{cases} x_1 &= &x_1\bigsqcup f_1(x_1,\dots,x_n) \newline \vdots \newline x_n &= & x_n \bigsqcup f_n(x_1,\dots,x_n)\end{cases}$.

The lattice points as answers can be shown below. The most trivial answer is on top and undefined/null is at bottom.

graph TD;
subgraph Safe Answers
A1["Trivial Useless Answer (Top)"] --- A2["Our Answer (Least Fixpoint)"]
A2 .-> A3["The true answer"]
A2 --- A4["."]
end
subgraph Unsafe Answers
A4 --- A5["."]
A5 --- A6["Bottom"]
end

Monotone Framework

We want to extract all constraints (like sign analysis) from a CFG. For each node $v$ in CFG we can assign a constraint variable $\lbrack \! \lbrack v \rbrack\!\rbrack$ ranging over elements of lattice.

If all constraints for the given program happen to be equations or inequations with monotone RHS, then fixed-point algorithms come into place to find unique least solution.

The combination of lattice and a space of monotone functions is called monotone framework.

For naive/round-robin fixed-point algorithm, an improved version called work-list algorithm can be used here from chaotic iteration.

An map $\text{dep}:\text{Nodes}\to 2^\text{Nodes}$ is used for $v$ in CFG to tell us the subset of other node where $\lbrack \! \lbrack v \rbrack\!\rbrack$ occurs in a nontrivial way on RHS of dataflow equation.

Thus work-list algorithm is shown as
$$
\begin{aligned}
&x_1\leftarrow \bot; \dots; x_n\leftarrow \bot;\newline
&W \leftarrow \lbrace v_1,\dots,v_n\rbrace;\newline
&\text{while }(W \neq \varnothing)\newline
&\phantom{“”””}v_i\leftarrow W.\text{removeNext}()\newline
&\phantom{“”””}y\leftarrow f_i(x_1,\dots,x_n)\newline
&\phantom{“”””}\text{if }y \neq x_i\newline
&\phantom{“”””””””}x_i \leftarrow y\newline
&\phantom{“”””””””}\text{for }v_j \in dep(v_i)\newline
&\phantom{“”””””””””””}W.\text{add}(v_j)\newline
&\phantom{“”””””””}\text{end for}\newline
&\phantom{“”””}\text{end if}\newline
&\text{end while}\newline
&\text{return }(x_1,\dots,x_n)
\end{aligned}
$$

The complexity of the algorithm is explained in next chapter.