CS6620 Compiler Note 2 Data Flow Analysis with Lattice

More flow sensitive analysis (data flow analysis) on the way

Forward Analysis Backward Analysis
May Analysis Reaching Definition Liveness
Must Analysis Available Expressions Very Busy Expressions

Liveness Analysis

Live for a variable at a program point means its current value may be read later in remaining execution without being written to in between. The analysis only answer “dead” if the variable is really dead.

The Lattice we set up is a parameterized lattice, which depends on specific program being analyzed. It should be established as $(2^{\lbrace vars \rbrace}, \subseteq)$.

Each CFG node $v$ a constraint variable $\lbrack \lbrack v\rbrack\rbrack$ is introduced donating the subset of program variables that live at program point before the node. In this way the total logic of this analysis is like
$$
\lbrack \lbrack v\rbrack \rbrack = \bigcup_{v_{next} \in succ(v)} \lbrack \lbrack v_{next}\rbrack \rbrack \cup \Delta v_{introduce}
$$

$\lbrack\lbrack v\rbrack\rbrack$ describes state before $v$.

The definition of $JOIN$ follows $JOIN(v) = \bigcup_{w\in succ(v)} \lbrack \lbrack w \rbrack\rbrack$. The constraint rules follows
$$
\begin{aligned}
x = E &\phantom{“”} \lbrack\lbrack v \rbrack\rbrack &=& \bigcup_{v_{next} \in succ(v)} \lbrack \lbrack v_{next}\rbrack \rbrack \backslash \lbrace X\rbrace \cup vars(E)\newline
\begin{array}{l}
\text{if }E\newline
\text{while }E&\newline
\text{output }E&
\end{array}
&\phantom{“”} \lbrack\lbrack v\rbrack \rbrack &=& \bigcup_{v_{next} \in succ(v)} \lbrack \lbrack v_{next}\rbrack \rbrack \cup vars(E)\newline
\text{var }X_1,\dots, X_n&\phantom{“”}\lbrack\lbrack v\rbrack\rbrack &=& \bigcup_{v_{next} \in succ(v)} \lbrack \lbrack v_{next}\rbrack \rbrack \backslash \lbrace X_1,\dots,X_n\rbrace\newline
&\phantom{“”}\lbrack\lbrack exit\rbrack\rbrack &=& \varnothing\newline
&\phantom{“”}\lbrack\lbrack v\rbrack \rbrack &=& \bigcup_{v_{next} \in succ(v)} \lbrack \lbrack v_{next}\rbrack \rbrack
\end{aligned}
$$
Each iteration of such constraint rules on CFG, there will be $k$ variables and $n$ CFG nodes. $k-$nodes set would take $O(k)$ in union/remove operation, and $O(n)$ for iteration node. Total $O(nk)$ per iteration.

The total iteration number until it stops would be $O(nk)$ considering work-list algorithm last chapter. The worst case would be each node goes up to top of lattice. Thus the total time complexity is $O(n^2k^2)$.

Available Expressions Analysis

A nontrivial expression is available at a program point if its current value has already been computed.

Since the approximation is generally including too few expressions, the analysis can only say available if it is definitely available.

The lattice $L = (2^{expr}, \supseteq)$ have $JOIN(n) = \bigcap_{w\in pred(v)}\lbrack \lbrack w\rbrack \rbrack$ (since $\sqsubseteq = \supseteq$). The lattice is upside down with $\varnothing$ on the top.

$\lbrack\lbrack v\rbrack\rbrack$ describes state after $v$.

Since availability of expression depends on information from the past, also the partial order defined here, JOIN has such definition.

$$
\begin{aligned}
X = E &\phantom{:::}\lbrack \lbrack v\rbrack\rbrack &=& (\bigcap_{w\in pred(v)}\lbrack \lbrack w\rbrack \rbrack \cup exps(E)) \downarrow X\newline
& & & exps(X)= \varnothing\newline
& & & exps(I) = \varnothing\newline
& & & exps(\text{input}) = \varnothing\newline
& & & exps(E_1\ op\ E_2) = \lbrace E_1\ op\ E_2\rbrace \cup exps(E_1) \cup exps(E_2)\newline
&\phantom{:::}\lbrack\lbrack entry\rbrack\rbrack & =& \varnothing\newline
\begin{array}{l} \text{if } E \newline \text{while }E\newline \text{output }E \end{array} & \phantom{:::}\lbrack\lbrack v\rbrack\rbrack &=& \bigcap_{w\in pred(v)}\lbrack \lbrack w\rbrack \rbrack \cup exps(E)\newline
&\phantom{:::}\lbrack\lbrack v\rbrack\rbrack & = & \bigcap_{w\in pred(v)}\lbrack \lbrack w\rbrack \rbrack
\end{aligned}
$$

Very Busy Expression Analysis

A nontrivial expression is very busy if it will definitely be evaluated again before its value changes.

An expression can be busy if it is evaluated in current node or will be evaluated unless some assignment changes its value.

Since the approximation is generally including too few expressions, the analysis can only say busy if it is definitely busy. The lattice upside down is arranged like available expression’s.

For every CFG node $v$ variable $\lbrack\lbrack v\rbrack\rbrack$ donates the set of expressions that at the program point before $v$ definitely are busy. The lattice definition is the same as available expression analysis.

$\lbrack\lbrack v\rbrack\rbrack$ describes state before $v$.

$$
\begin{aligned}
& & & JOIN(v) = \bigcap_{w\in succ(v)}\lbrack \lbrack w\rbrack \rbrack \newline
X = E &\phantom{:::} \lbrack\lbrack v\rbrack\rbrack &=& \bigcap_{w\in succ(v)}\lbrack \lbrack w\rbrack \rbrack \downarrow X \cup exps(E)\newline
\begin{array}{l} \text{if } E \newline \text{while }E\newline \text{output }E \end{array} & \phantom{:::}\lbrack\lbrack v\rbrack\rbrack &=& \bigcap_{w\in succ(v)}\lbrack \lbrack w\rbrack \rbrack \cup exps(E)\newline
& \phantom{:::}\lbrack\lbrack v\rbrack\rbrack &=& \bigcap_{w\in succ(v)}\lbrack \lbrack w\rbrack \rbrack\newline
& \lbrack \lbrack exit\rbrack\rbrack &=& \varnothing
\end{aligned}
$$

Reach Definitions Analysis

Reaching definitions for a given program point are those assignments that may define the current value of variables.

$\lbrack\lbrack v\rbrack\rbrack$ describes state after $v$.

The lattice is modeled by $(2^{assignments}, \subseteq)$ and $JOIN(v) = \bigcup_{w\in pred(v)} \lbrack\lbrack w \rbrack\rbrack$. The rules follows
$$
\begin{aligned}
X = E &\phantom{:::} \lbrack\lbrack v\rbrack\rbrack &=& \bigcup_{w\in pred(v)} \lbrack\lbrack w \rbrack\rbrack \downarrow X \cup \lbrace X = E \rbrace\newline
&\phantom{:::}\lbrack \lbrack v\rbrack \rbrack &=&\bigcup_{w\in pred(v)} \lbrack\lbrack w \rbrack\rbrack
\end{aligned}
$$

Forward, Backward. May, Must

Forward analysis computes about past behaviors. E.g., Available Analysis, Very Busy Expression Analysis

Backward analysis computes about future behaviors. E.g., Living Analysis, Reach Definition Analysis

May analysis means over-approximation, talking about possibly true info. E.g., Living Analysis, Reach Definitions Analysis

Must analysis means under-approximation, talking about definite true info. E.g., Available Analysis, Very Busy Expression Analysis