Zero Knowledge Proof on PCP and SNARK

Posted on 2020-09-10 In Course Note Disqus:

Just trying to take notes about R1CS/QAP/SSP/PCP and sort out things about ZKP and SNARK.

Partly adopted from QAP from Zero to Hero by Vitalik Buterin and ZK Proof Note.

graph LR
P(Problem) -->|Flatten| C(Circuit)
subgraph Problem Translate
C -->|Synthesize| R(R1CS)
R -->|FFT| Q(QAP)
end
subgraph Verification
Q -->|Create Proof| Pr(Proof)
Q -->|Setup| Pa(Params)
Pa -->|Create Proof| Pr
Pr -->|Verify Proof| V(Verify)
Pa -->|Verify Proof| V
end

SNARK and PCP

A SNARK (Succinct Non-Interactive Argument of Knowledge) system has 3 components:

Setup: $\text{Setup}(C) \to (S_p, S_v)$
Proof: $\text{Proof}(S_p, x, w) \to \pi$
Verify: $\text{Verify}(S_v, x, \pi) \to \top/\bot$

$x$ is the public input, $\pi$ is proof, $S_p$ and $S_v$ is random parameter send to prover and verifier.

graph LR
subgraph SNARK
S(Setup C) -->|Sv| V(Verify Sv, x, pi)
S -->|Sp| P(Prover Sp, x, w)
P -->|pi| V
end

A robust ZK proof system must satisfy

Completeness: $\text{Verify}(S_v, x, \text{Proof}(S_p,x,w)) = \top \Leftrightarrow C(x,w) = 0$.
Proof of Knowledge: Extractor
Succinctness: $|\pi| = O(\log|C|)$, $\text{time}(V) = O(|x| + \log|C|)$, $\text{time}(P) = O(|C|)$.
Zero Knowledge.

PCP and Kilian SNARK

PCP, a.k.a. Probabilistically Checkable Proof in 1990 indicates that all NP problems can be verified in polynomial time with probabilistic check.

Since all the NP problems can be converted into a circuit $C$, we can construct a probabilistic check system $(S, P, V)$ by

graph LR
P(Prove C, x, w) -->|pi| Pf["Proof pi: size O(C)"]
subgraph Read Only
Pf
end
V(Verify Sv, x) -->|"O(k) query"| Pf
Pf -->|"O(k) bits"| V

Such a random check system must satisfy

Completeness: If verifier is honest and $C(x,w) = 0$, then verifier will output $\top$.
Soundness: If prover is not honest and it does not have witness $w$ such that $C(x, w) = 0$, then $\Pr\lbrack V = \top\rbrack = \text{negl}$.

But the construction above is too expensive since it does not have succinctness in proof. We can rely on Merkle commitment and Merkle proof so that prover can’t be cheating.

graph LR
P(Prove C, x, w) -->|Commitment C-pi| Cm["Commitment C-pi"]
subgraph Read Only
Cm
end
V(Verify Sv, x) -->|"O(k) query"| P
V -->|"O(k) query"| Cm
P -->|"O(k) bits"| V
Cm -->|"Merkle Proof for bits"| V
subgraph Verify
V
end

The proof size $|\pi|$ is made to $O(\log |C|)$.

Interactive to Non-Interactive

In the constructions above, we assume that prover is always online so that verifier can ask for interactive proofs.

We can remove the interaction with random oracle:

Prover generates Merkle commitment $c_\pi$ and sends it to a public read only location.
Prover relies on a random oracle to generate $r_1 \gets \text{rand}(x \parallel c_\pi)$.
Prover computes $m_1 \gets \pi$ on $r_1$.
Prover generates $r_2 \gets \text{rand}(x \parallel c_\pi \parallel m_1)$.
Prover computes $m_2 \gets \pi$ on $r_2$.
Repeat to $O(k)$ query length.

Once the prover sends $(m_1 \parallel \ldots \parallel m_k)$ to verifier, the verifier can relies on public random oracle and $c_\pi$ to examine the hash.

R1CS and QAP

If we want to prove that we have a solution for something without sending all these solutions in clear to verifier, we will need these techniques.

We constrain the context into proving we know solution for a polynomial function $f(x_1,\dots, x_k)$.

R1CS Definition

The relation $\mathcal{R}_{\text{R1CS}}$ consists of a set of all pairs $((\mathbb{F}, k, m, n, A,B,C,v), w)$ where

$\mathbb{F}$ is a finite field
$k$ be the number of inputs
$n$ be the number of variables
$m$ be the number of constraints
$A,B,C\in\mathbb{F}^{m\times (n+1)}$
$v \in \mathbb{F}^{k}$
$w \in \mathbb{F}^{n-k}$

If a pair is in such relation set, it must satisfy $(A \cdot z) \circ (B\cdot z) = C\cdot z$ where $z = (1, v,w)$ and $\circ$ stands for Hadamard Production (entry-wise product).

R1CS and Polynomial, or Arithmetic Circuit

So, obviously, a polynomial of form $f(x_1, \dots, x_k)$ with operation $+,\times$ can be written or flatten into an arithmetic circuit with binary gates of $+,\times$.

By R1CS, a gate means one constraint. So each entry of $A,B,C$, donated by $A_i,B_i,C_i$ where $i \in \lbrack 1,m \rbrack$ must satisfy $\langle A_i, z\rangle \cdot \langle B_i ,z\rangle = \langle C_i,z\rangle$.

In this way, the matrix-vector production checks each matrix entry for $m$ times by $(A_1,\dots, A_m)^\top\cdot z \equiv (\langle A_1,z\rangle,\dots,\langle A_m,z\rangle)^\top$.

The problem then lies in each entry, why we need to satisfy $\langle A_i, z\rangle \cdot \langle B_i ,z\rangle = \langle C_i,z\rangle$.

We let the $\langle C_i,z\rangle$ be the output for this gate, where $\langle C_i,z\rangle \in \lbrace w_i\rbrace$.

The left side of the previous entry equation represents the computation of the gate.

If the gate is an add gate $+$, then $\langle A_i,z\rangle$ can represent the addition, while $\langle B_i,z\rangle$ represents the constant 1. Such addition gate is satisfied. (vise versa for $A,B$)
If the gate is an mult gate $\times$, then $\langle A_i,z\rangle$ represents the left input, while $\langle B_i,z\rangle$ represents the right input. The scalar multiplication does the gate computation.

R1CS to QAP

So far, we have $A,B,C \in \mathbb{F}^{m\times (n+1)}$ and $z \in \mathbb{F}^{n+1}$. If we want to hide $z$ witness better, then we need to transfer R1CS to QAP.

For matrix $M \in \mathbb{F}^{m\times(n+1)}$, we can represent the columns as $M_i(x)$ for $i \in \lbrack 1, n+1\rbrack$ with a corresponding $\lbrace x_i\rbrace$. Then $M = (M_1(x),\dots,M_{n+1}(x))\ |_{x_1,\dots,x_m}$.

If we transform $A,B,C$ to a row of polynomials $A_p,B_p,C_p$, then we have $\langle A_p^\top,z\rangle \cdot \langle B_p^\top,z\rangle =_{\text{polynomial}} \langle C_p^\top,z\rangle$ to be checked.

The detail for such transform lies in Lagrange Polynomial Interpolation.

The polynomial $\langle A_p^\top , z \rangle \cdot \langle B_p^\top ,z\rangle - \langle C_p^\top , z \rangle$ should be able to be divided by $\prod (x - x_i)$ without remainder, which means $\exists H(x), \langle A , z \rangle \cdot \langle B,z\rangle - \langle C , z \rangle = H(x) \prod (x - x_i)$.

It seems more interesting to have QAP than R1CS since the elements are all polynomials.

The equation under polynomial can be rewritten to pure equation, where each entry in $A_p^\top,B_p^\top$ is of order $m-1$ and entries of $C_p^\top$ are of order $2m-2$ (it will require additional samples on $A_p,B_p$).

Check this link.

Schwartz-Zippel Lemma

Consider polynomials over $\mathbb{F}_p$, $P(X_1,\dots,X_n)$. They compute functions $\mathbb{F}_p^n \to \mathbb{F}_p$. There exists $p^{p^n}$ such functions, but there are infinitely many polynomials.

$X_1,\dots,X_n$ has $p^n$ and output $p$ probabilities. Thus $p^{p^n}$ such functions exists.

In this way, we can have multiple non identical polynomials representing the same function. The differentiation of these polynomials is zero function.

E.g., $X^p - X$ for $\mathbb{F}_p$.

Given any $P \in \mathbb{F}_p \lbrack X_1,\dots,X_n\rbrack$ can reduce each $X^m$ to $X^{m\mod p}$. So these individual degrees (for each variable) are smaller than or equal to $p - 1$.

Thus the total reduced degree (the largest degree of monomial) is $p^n$, and the reduced polynomial number is $p^{p^n}$.

In this way, we let $P \in \mathbb{F}_p\lbrack X_1,\dots,X_n\rbrack$ be reduced nonzero polynomial of total degree $\le d < p$. We also let $\alpha_1,\dots,\alpha_n \overset{R}{\leftarrow} \mathbb{F}_p$. Then $\Pr\lbrack P(\alpha_1,\dots,\alpha_n) = 0\rbrack \le \dfrac{d}{p}$.

If $p = 2$ then $\Pr\lbrack P(\alpha_1,\dots,\alpha_n) \neq 0\rbrack \ge \dfrac{1}{2^d}$.

We assume the theorem is true for $n-1$ variables. Let $k$ be the largest individual degree for $X_1$ in any monomial, then $P(X_1,\dots,X_n) = \sum\limits^k_{i=0} X_1^i \cdot q_i(X_2, \dots, X_n)$.

$q_k$ degree is at most $d - k$, by induction $\Pr\lbrack q_k(y_2,\dots,y_n) = 0\rbrack \le \dfrac{d - k}{p}$ for $y_2,\dots,y_n \overset{R}{\leftarrow} \mathbb{F}_p$.

Donate $q_k(y_2,\dots,y_n) = 0$ as event $\varepsilon_1$, $f(X_1) = P(X_1, y_2,\dots,y_n)$. By induction we also have $\Pr\lbrack f(y_1) = 0 | \neg \varepsilon_1\rbrack \le \dfrac{k}{p}$ for $y_1 \overset{R}{\leftarrow} \mathbb{F}_p$.
$$
\begin{aligned}
\Pr\lbrack f(y_1 ) = 0\rbrack &= \Pr\lbrack f(y_1 ) = 0 | \varepsilon_1\rbrack \cdot \Pr \lbrack \varepsilon_1\rbrack + \Pr\lbrack f(y_1 ) = 0 | \neg \varepsilon_1\rbrack \cdot \Pr \lbrack \neg \varepsilon_1\rbrack\newline
&\le \Pr\lbrack f(y_1 ) = 0 | \varepsilon_1\rbrack \cdot \dfrac{d- k}{p} + \dfrac{k}{p} \le \dfrac{d}{p}
\end{aligned}
$$

QAP to LPCP

We denote public inputs to be $\boldsymbol{x}$ and private inputs to be $\boldsymbol{w}$. We have proof to be $\pi = \left[ 1,\boldsymbol{x},\boldsymbol{w} \right]^\top$.

By previous QAP construction, we want to hide the R1CS coefficients so that verifiers cannot guess the proof private inputs.

Denote $t \overset{R}{\leftarrow}\mathbb{F}$ and $A_t,B_t,C_t, H_t$ be the QAP polynomial matrices evaluated on $t$. Then we want to satisfy that $\langle A_t, \pi\rangle \cdot \langle B_t,\pi\rangle - \langle C_t , \pi \rangle = H_t \cdot \prod (t - x_i)$.

We can sample a random query based on $A_p$ and $A$ by following method (denote $A_r$ to be $A_t$ with $t$ evaluated on $r$:
$$
A_r = \begin{bmatrix} 1&r & \ldots & r^{m-1} \end{bmatrix} A_t,
$$
and similarly on $B_r$ and $C_r$.

PCP is quite general since it just specify thats problem in NP scope can be verified with a random sample.

With such method, a soundness amplification can be done by multiple round of random sample verification, the probability of cheating is close to negliable.

Observing that verifier has circuit, and thus can construct from R1CS to QAP and finally these three queries. Almost all the computations are on verifier side.

LPCP Protocol

Noticing that previously $\langle A_r, \pi\rangle \cdot \langle B_r,\pi\rangle - \langle C_r, \pi \rangle = H_r \cdot \prod (r - x_i)$, where $\pmb{\pi} = \lbrack 1, \pmb{x},\pmb{w}\rbrack^\top$. We can separate these queries into $q^L$ and $q^R$, where $q^L$ corresponds to $\lbrack 1, \pmb{x}\rbrack^\top$ and $q^R$ corresponds to $\pmb{w}$.

We just need to send $A_r^R,B_r^R,C_r^R$ to prover, and receive $a = \langle A^R, \pi \rangle, b = \langle B^R, \pi \rangle, c = \langle C^R, \pi \rangle$.

Verifier needs to check if
$$
(\langle A^L, \lbrack 1, x\rbrack \rangle + a) \cdot (\langle B^L, \lbrack 1, x\rbrack \rangle + b) \overset{?}{=} (\langle C^L, \lbrack 1, x\rbrack \rangle + c) + H_r \cdot \prod (r - x_i)
$$

Common Reference String Model (CRS)

The LPCP can become non-interactive, based on trusted setup. One thing we really need is an assumption called CRS.

If we can randomly generate the queries $A_r,B_r,C_r$ on verifier side, we can pre-generate these queries and throw them into CRS.

In this way, the interactive PCP becomes non-interactive PCP, the prover just send over its proof based on CRS.

The CRS hiding relies on linear-only encoding, which is additive homomorphic encryption.

FFT and Polynomial

For previously talked about QAP, we have to interpolate polynomials. We are transforming $A \cdot z$ to $\langle A_p, z\rangle$ where $A_p$ is the vector of polynomials for $A$ over $x_1,\dots,x_m$ for $m$ constraints.

$A_p = (A_{p_0}(x), \dots, A_{p_{n+1}}(x))$, so $A_p|_{x_1,\dots,x_m} = (A_p(x_1),\dots,A_p(x_m))^\top$.

For every column, we get a $A_{p_i}$ for some $i \in \lbrack 1 ,m \rbrack$. Consider for one $A_{p_i}$ (to represent $A$ column $i$) we have
$$
\begin{bmatrix}
1 & x_1 & \dots &x_1^{m-1} \newline
\vdots & \vdots & &\vdots \newline
1 & x_m & \dots &x_m^{m-1}
\end{bmatrix} \cdot
\begin{bmatrix}
a_0 \newline \vdots \newline a_m
\end{bmatrix} =
A\text{ column }i
$$

The basic idea was to use $g(x) = \sum\limits^{m-1}_{i=0} f_i I_i(x)$, where $I_i(x_j) = \begin{cases} 0 &i \neq j \newline 1 & i = j \end{cases}$.

We can have $I_i(x) = \prod\limits_{k \in \lbrack 0, m-1 \rbrack \backslash \lbrace i \rbrace}\dfrac{x - x_k}{x_i - x_k}$.

Still, to evaluate such $\prod (x - x_k)$ will need $O(n^2)$ if we go brute force.

FFT/IFFT and Polynomial Multiplication/Interpolation

If we want the brute force $A(x) \cdot B(x)$ or do interpolation for some $A(x)$, then it is obvious that $O(n^2)$.

We consider $n$ to be the power of 2 for simplicity.

Consider $\omega_n$ be a root of unity, then it has feature $\omega_n^n \equiv 1$. In this way we have

$\omega^{2k}_{2n} = \omega^k_n$.
$\omega_n^{k + \frac{n}{2}} = -\omega_n^k$.

Considering interpolation, where we are considering $\lbrack A(\omega_n^0),\dots,A(\omega_n^{n-1})\rbrack$ for $x = \omega_n^0,\dots,\omega^{n-1}_n$.

We can separate $A=\sum\limits^{n-1}_{i=0}a_i x^i$ into $A = A^{\lbrack 1\rbrack}(x^2) + xA^{\lbrack 2\rbrack}(x^2)$ where

$A^{\lbrack 1\rbrack}(x^2) = \sum\limits_{i = 0,2,\dots,n-2} a_i x^i = \sum\limits_{i = 0,2,\dots,n-2} a_i (x^2)^{\frac{i}{2}}$. So $A^{\lbrack 1\rbrack} = \sum\limits_{i = 0,2,\dots,n-2} a_i x^{\frac{i}{2}}$.
$xA^{\lbrack 2\rbrack}(x^2) = \sum\limits_{i = 1,3,\dots,n-1} a_i x^i = x \sum\limits_{i = 0, 2, \dots, n-2}a_{i+1}x^i = x \sum\limits_{i = 0, 2, \dots, n-2}a_{i+1}(x^2)^{\frac{i}{2}}$. So $A^{\lbrack 2\rbrack} = \sum\limits_{i = 0, 2, \dots, n-2}a_{i+1}x^{\frac{i}{2}}$.

If we want to know $A(\omega_n^k)$, we do $A(\omega_n^k) = A^{\lbrack 1 \rbrack} (\omega^k_{\frac{n}{2}}) + \omega_n^k A^{\lbrack 2\rbrack}(\omega^k_{\frac{n}{2}})$. ($k \in \lbrack 0, \dfrac{n}{2} - 1\rbrack$)

If we want to know $k + \dfrac{n}{2}$ scenario, $A(\omega_n^{k + \frac{n}{2}}) = A^{\lbrack 1 \rbrack} (\omega^k_{\frac{n}{2}}) - \omega_n^k A^{\lbrack 2\rbrack}(\omega^k_{\frac{n}{2}})$.

If we know $A^{\lbrack 1\rbrack},A^{\lbrack 2\rbrack}$ for $\omega_{\frac{n}{2}}^k$ $\forall k \in \lbrack 0, \dfrac{n}{2} - 1\rbrack$, we can solve $\forall k \in \lbrack 0, n-1 \rbrack, A(x)|_{x = \omega_n^k}$ in $O(n)$.

By $T(n) = 2 T(\dfrac{n}{2}) + O(n)$ we have $T(n) = O(n\log n)$.

IFFT is used to get $(a_0,\dots,a_{n-1})$ coefficients by $(A(\omega_n^{0}),\dots,A(\omega_n^{n-1}))$, which is used for fast interpolation.

Since we the required value $\lbrace A(\omega_n^i)\rbrace$ is given by the R1CS, then we don’t need to worry about these computations for $\lbrace A(\omega_n^i)\rbrace$.

IFFT follows a similar step like FFT. We construct polynomial $F(x) = \sum\limits^{n-1}_{i=0} d_i x^i$ where $\lbrace d_i\rbrace = (A(\omega_n^{0}),\dots,A(\omega_n^{n-1}))$.

We use the polynomial to FFT get $(c_0, \dots,c_{n-1})$ for $(F(\omega_n^{0}),\dots,F(\omega_n^{-(n-1)}))$ or $\lbrace F(\omega_n^{-k})\rbrace$.

$c_k = \sum\limits^{n-1}_{i = 0} \lbrack \sum \limits^{n-1}_{j=0} a_j \cdot (\omega_n^i)^j\rbrack \cdot (\omega^{-k}_n)^i = \sum\limits^{n-1}_{i = 0}\lbrack \sum \limits^{n-1}_{j = 0} a_j \cdot (\omega_n^i)^{j-k}\rbrack$. It is obvious that $c_k = n a_k$. Thus $\lbrace a_k \rbrace = \lbrace \dfrac{c_k}{n}\rbrace$.

IFFT takes also $O(n\log n)$ like FFT.

The polynomial multiplication brute force can be considered as $A:a_0,\dots,a_{n-1}, B:b_0,\dots,b_{n-1}$ to $C: c_0,\dots,c_{2n-1}$ which takes $O(n^2)$.

With FFT/IFFT we can finish following process:

$A:a_0,\dots,a_{n-1}, B:b_0,\dots,b_{n-1}$ to $\lbrace A(\omega_{2n}^k)\rbrace, \lbrace B(\omega_{2n}^k)\rbrace$ takes $O(n\log n)$.
$\lbrace A(\omega_{2n}^k)\rbrace, \lbrace B(\omega_{2n}^k)\rbrace$ to $\lbrace C(\omega_{2n}^k)\rbrace$ takes $O(n)$.
$\lbrace C(\omega_{2n}^k)\rbrace$ to $C : c_0 ,\dots,c_{2n-1}$.

So the total cost comes to $O(n\log n)$.