Arkana Logoarkana
F
Frank Mangone
June 25, 2024 · 17 min read · Medium

This is part of a larger series of articles about cryptography. If this is the first article you come across, I strongly recommend starting from the beginning of the series.

In the past few articles, we've covered a lot of building blocks. It's time to finally combine all these lego pieces into a protocol. Hooray!

Remember, the reason we made such an effort to understand these moving parts was to try and build a general framework for zero knowledge proofs — because, as we saw, creating a protocol for a specific application was somewhat impractical, requiring specific research.

We're now ready to introduce a family of knowledge proofs that utilizes every element we've preemptively defined: SNARKs.

Specifically, we'll be looking at a scheme called Plonk. For a full disclosure of the technique, go here. I'll try my best to explain this as precisely as possible. Moreover, this is not the only protocol that qualifies as a SNARK out there. Other mechanisms such as Groth16, Marlin, and Halo exist. I might tackle those in the future, but for now, I'll just leave a couple links here in case you want to follow your curiosity!

Disclaimer

This article is gonna be long, and honestly, complex. As in, probably harder than usual. There's only so much I can do to simplify the explanation for a protocol that is pretty darn complex.

But, if I had to put this entire process into a single phrase, it would be:

In Plonk, we encode the computation of a circuit into a series of polynomials, which represent the constraints imposed by the circuit, for which we can prove statements by using a series of tests, without ever revealing the polynomials themselves — and thus, not leaking the secret inputs.

Getting to fully understand this will be a wild ride. And there are several elements that we'll need to cover. Here's an overview of the plan, which doubles up as a sort of table of contents:

Also, I'm assuming you already checked out the article about polynomial commitment schemes, and also the one about arithmetic circuits. If you haven't, I strongly recommend reading them!

Without further ado, here we go!

What is a SNARK?

The acronym stands for Succint Non-interactive ARguments of Knowledge. In plain english: a mechanism to prove knowledge of something, that's succint (or short, or small) and non-interactive. There are a couple things that we need to dissect here, to better understand the goal of the construction:

  • Succint means that whatever proof we produce must be small in size, and fast in evaluation time.
  • Non-interactive means that upon receiving the proof, the verifier will not need any further interaction with the prover. More on this later.
  • Finally, we say that this is a knowledge proof, but not necessarily a zero knowledge proof (although it can be). In particular, Plonk qualifies as zero knowledge because of how it's constructed.

Fret not — we cannot build our SNARK yet. We'll need to cover a few concepts first.

Revisiting Circuits

In the previous article, we saw how arithmetic circuits were a nice way to represent a computation. And how they allowed the crafting of recipes for validation of statements.

And so, the prover's goal will be to try and convince a verifier that they know some secret value xx — really, a vector of values — such that:

 xFpm / C(x,w)=0\exists \ x \in {\mathbb{F}_p}^m \ / \ C(x,w) = 0

This reads: "there exists some vector xx whose elements are in the finite field Fp\mathbb{F}_p, such that C(x,w)=0C(x, w) = 0, where ww is publicly known.

The prover doesn't want to reveal xx, but also, we don't want the verifier to run the expensive computation of the circuit. And so, what will really happen is that the prover will evaluate the circuit, and then somehow encode both the inputs, and the results for each of the gates — also called the computation trace.

Here's an example evaluation, using the field F113\mathbb{F}_{113}:

A small circuit example
Click to zoom
Remember that we're working with modulo p = 113 in this case

And we could think of this evaluation as the following trace, visualized as a table:

w1w_1x1x_1x2x_2
inputs20135
left inputright inputoutput
gate 0520100
gate 1131000

The fantastic idea behind modern SNARKs, including Plonk, is to encode this trace as polynomials, which will ultimately allow a verifier to check that the computation is valid. Valid here means that a specific set of constraints are followed. Namely:

  • That each gate is correctly evaluated
  • That wires with the same origin have the same value

By wire, I mean the connections between gates, or connections from inputs to gates, in the circuit. We can think of these wires as holding the values of the circuit, instead of the nodes themselves:

A circuit showing wire values
Click to zoom

Here, you can see that some of these wires should hold the same value: W0W_0, W1W_1, and W2W_2 — and also W4W_4 and W5W_5. Additionally, gates should make sense — meaning that, for instance, W0+W1W_0 + W_1 should equal W4W_4.

A circuit can be either a recipe for computation, or a set of constraints to check!

Each of this sets of constraints — wire values, gate constraints, and wiring constraints — , will be encoded in different polynomials.

For example, the entire computation trace can be encoded into a single polynomial PP, where P(xi)P(x_i) corresponds to one wire.

P(xi)=WiP(x_i) = W_i

We have the wire values WiW_i, but we need to choose to which values xix_i we encode them. Using some integers is a perfectly valid option — you know, the set {0,1,2,3,N}\{0,1,2,3…,N\}. But there’s a lot to gain from using the roots of unity of our field Fp\mathbb{F}_p instead.

A toast, confused
Huh?

Roots of Unity

I mentioned this concept before, but dodged having to explain any further like Neo in Matrix.

I think now is good time to zoom in a bit, so that we can better understand the notation coming ahead.

So, yeah, time for some definitions. We call a value ω\omega (the greek letter omega) a kthk-th root of unity of the field Fp\mathbb{F}_p if:

ωk=1\omega^k = 1

In this case, the set HH:

H={1,ω,ω2,ω3,...,ωk1}FpH = \{1, \omega, \omega^2, \omega^3, ..., \omega^{k-1}\} \subseteq \mathbb{F}_p

doubles up as a cyclic multiplicative group, generated by ω\omega:

H=ωH = \langle \omega \rangle

There are two nice things about using elements of this set as the inputs to our encoding polynomial P(X)P(X). First, moving from one root of unity to the next, is as simple as** multiplying by** ω\omega. Likewise, moving backwards is done by multiplying by the inverse of ω\omega. And it wraps all the way around - by definition:

ωk1ω=1\omega^{k-1}\omega = 1 ω1=ωk1\omega^{-1} = \omega^{k-1}

Secondly — and this is the most important part — , using this set allows us to perform interpolation using the most efficient algorithm we know: the Fast Fourier Transform. We won’t dive into how it works (at least not now!), but know that this improves interpolation time dramatically — meaning that proofs are faster to generate (than other methods for interpolation).

Encoding the Circuit

Having said all this, it’s time to get down to business. Time to actually see how the computation trace is encoded.

Let’s denote the number inputs to our circuit by I|I|, and the number of gates by C|C|. With this, we define:

d=3C+Id = 3|C| + |I|

Each gate has three associated values: two inputs, and one output. And this is why we use 3C3|C|. Notice that this magic number dd happens to be exactly the number of values in the computation trace!

We’ll encode all these values into the dthd-th roots of unity:

H={1,ω,ω2,ω3,...,ωd1}H = \{1, \omega, \omega^2, \omega^3, ..., \omega^{d-1}\}

But which root encodes to which wire? We need a plan!

  • The I|I| inputs will be encoded using negative powers of ω\omega. For the input #j\#j, we use:
P(ωj)P(\omega^{-j})
  • The three wires associated to each gate kk, ordered as left input, right input, and output, will be encoded by the following values:
P(ω3k),P(ω3k+1),P(ω3k+2)P(\omega^{3k}), P(\omega^{3k + 1}), P(\omega^{3k + 2})

If we look at our sample trace, we’d get something like this:

P(ω1)=20,P(ω2)=13,P(ω3)=5P(\omega^{-1}) = 20, P(\omega^{-2}) = 13, P(\omega^{-3}) = 5 P(ω0)=5,P(ω1)=20,P(ω2)=100P(\omega^0) = 5, P(\omega^1) = 20, P(\omega^2) = 100 P(ω3)=13,P(ω4)=100,P(ω5)=0P(\omega^3) = 13, P(\omega^4) = 100, P(\omega^5) = 0

With this, all the values in our computation trace are included in the polynomial. We just have to interpolate, and obtain P(X)P(X), which will have degree d1d - 1.

Gate Encoding

Gates present a complication: they can be either addition or multiplication gates. This means that proving that a gate is correctly evaluated, requires us to convey information about its type.

We do this via a selector polynomial S(X)S(X). For gate kk:

S(ω3k)=0 or S(ω3k)=1S(\omega^{3k}) = 0 \ \textrm{or} \ S(\omega^{3k}) = 1

When the gate kk is an addition gate, then S(X)S(X) will take the value 11, and if it is a multiplication gate, then it will take the value 00. To make this simple to write, let’s just define:

α=ω3k\alpha = \omega^{3k}

And then construct the following expression:

S(α)[P(α)+P(ωα)]+(1S(α))P(α)P(ωα)=P(ω2α)S(\alpha)[P(\alpha) + P(\omega \alpha)] + (1 - S(\alpha))P(\alpha)P(\omega \alpha) = P(\omega^2 \alpha)

Don’t let the expression scare you — what’s happening is actually fairly straightforward. You see, either S(α)=0S(\alpha) = 0 or 1S(α)=01 - S(\alpha) = 0. Because of this, only one of the terms containing S(X)S(X) will be active for some gate kk, and in consequence, this expression ties inputs (P(α)P(\alpha) and P(ωα)P(\omega \alpha)) and outputs (encoded in P(ω2α)P(\omega^2 \alpha)) of a gate, along with the gate type.

Wiring Encoding

This one is the trickiest of the lot. As mentioned before, it may happen that some wires correspond to the same value, since they come from the same source, like this:

Shared gate sources
Click to zoom

What this means is that some of the values encoded into P(X)P(X) need to match:

P(ωa)=P(ωb)=P(ωc)=...P(\omega^a) = P(\omega^b) = P(\omega^c) = ...

If we analyze the entire circuit, we’ll end up having a set of this kind of constraints:

{P(ω1)=P(ω1)P(ω2)=P(ω3)P(ω3)=P(ω0)P(ω2)=P(ω4)\left\{\begin{matrix} P(\omega^{-1}) = P(\omega^1) \\ P(\omega^{-2}) = P(\omega^3) \\ P(\omega^{-3}) = P(\omega^0) \\ P(\omega^2) = P(\omega^4) \end{matrix}\right.

This is for the example above. But in general, the equalities may have more than 22 members each

Each one of these must be somehow encoded, of course, into polynomials. They way we do this is quite interesting: for each constraint, we define a polynomial that permutates or rotates the subset of roots whose evaluation should match. So for example, if the condition is:

P(ωp)=P(ωq)=P(ωr)=P(ωs)P(\omega^p) = P(\omega^q) = P(\omega^r) = P(\omega^s)

Then we define the subset:

H={ωp,ωq,ωr,ωs}H' = \{\omega^p, \omega^q, \omega^r, \omega^s\}

And using HH’, we create a polynomial W(X)W(X) that has the following behavior:

W(ωp)=ωqW(ωq)=ωrW(ωr)=ωsW(ωs)=ωp\\ W(\omega^p) = \omega^q \\ W(\omega^q) = \omega^r \\ W(\omega^r) = \omega^s \\ W(\omega^s) = \omega^p
Rotation of powers of the involved roots
Click to zoom
It essentially cycles through the subset. This is the part that gives Plonk its name, actually!

Because W(X)W(X) always returns “the next” element in HH’, and since all the values of P(X)P(X) should be equal for the roots on HH’, the way we prove that the wiring constraint holds is by proving that:

P(x)=P(W(x))xHP(x) = P(W(x)) \forall x \in H'

This has to be done for every element in HH’, thus covering all the equalities in a single constraint.

Okay! That was certainly a lot. Nevertheless, we’ve already covered a lot of ground with this.

At this point, the prover has all these polynomials that encode the entire computation trace. What’s missing then? Only the most important part: convincing a verifier that the trace is correct. Once we understand how this happens, everything will finally be tied together.

Verification Requirements

Of course, the verifier doesn’t know the polynomials we just talked about. In particular, it’s critical that they never learn the full P(X)P(X). The reason for this is that, if they do get ahold of it, then they could easily uncover the secret information xx, by calculating:

P(ωj)P(\omega^{-j})

The ability to never reveal these values by using polynomials is what gives Plonk its zero knowledge properties!

If the verifier cannot know the polynomials, then how can we convince them of anything? Did we hit a dead end? Should we panic now?

Suspicious Pikachu
I suspect you already know this narrative trick

While the verifier cannot ask for the full polynomials, they for sure can ask for single evaluations of them. They should be able to ask for any value of P(X)P(X), S(X)S(X), or the W(X)W(X)'s — meaning they should be able to query any part of the computation trace. Except for the secret inputs, of course.

It’s at this point that our PCSs of choice comes in handy: when requesting the value of P(X)P(X) at some point bb (so, P(b)P(b)), the verifier can check that it’s correctly computed by checking it against a commitment! Noooice.

Why would they do that, though? To check that the constraints hold, that is!

To be convinced that the computation trace is correct, the verifier needs to check that:

  • The output of the last gate is exactly 00 (this is a convention),
  • The inputs (the public ones, or the witness) are correctly encoded,
  • The evaluation of each gate is correct (either addition or multiplication holds),
  • The wiring is correct.

The first one is easy — the verifier just asks for the output at the last gate, which is:

P(ω3C1)=0P(\omega^{3|C| - 1}) = 0

For the other checks though, we’ll need to sprinkle in some magic (as usual, at this point).

Salt bae meme
I never really understood this guy, but hey, it makes for a good meme

The Last Sprinkles of Cryptomagic

We’ll need to sidetrack a little for a moment. Here’s a question: given some polynomial of degree at most dd, and that is not identically 00 (so f(x)0f(x) \neq 0), how likely would it be that for a random input rr (sampled from the integers modulo pp), we obtain that f(r)=0f(r) = 0?

Because the polynomial has degree at most dd, it will have at most dd roots. Then, the probability that f(r)=0f(r) = 0 is exactly the probability that rr happens to be a root of ff:

P[f(r)=0]d/p\mathcal{P}[f(r) = 0] \leq d/p

And provided that pp is sufficiently larger than dd, then this is a negligible value (very close to zero). This is important: if we randomly choose some rr and obtain that f(r)=0f(r) = 0, then we can say with high probability that f(x)f(x) is identically zero (the zero function)!

This is known as a zero test. It doesn’t seem to amount to much, but we’re gonna need this to make our SNARK work.

There are a couple more checks that we can perform, namely an addition check, and a product check. This is already quite a lot of information to digest, so let’s skip those for now.

What interesting is that there are efficient Interactive Oracle Proofs to perform these tests.

Sorry... Interactive WHAT?

Sonic brain melting
*Meltdown*

Interactive Oracle Proofs

I warned you this wasn’t gonna be easy! Just a little bit more, and we’re done.

Interactive Oracle Proofs (IOPs) are essentially a family of mechanisms, by which a prover and a verifier interact with each other, so that the prover can convince the verifier about the truth of a certain statement. Something like this:

Interactive Oracle Proof interaction diagram
Click to zoom

I hope you can already see how this picture looks similar to the one we used to describe Polynomial Commitment Schemes (PCSs)!

And we’ll use this model to perform our tests. For illustrative purposes, let’s describe how a zero test would go.

Imagine you want to prove that some set SS is a collection of roots of a given polynomial P(X)P(X):

S={s0,s1,s2,...,sn1}S = \{s_0, s_1, s_2, ..., s_{n-1}\}

What can you do about it? Well, since the values in SS are roots, you can divide the polynomial P(X)P(X) by what’s called a vanishing polynomial for the set SS:

V(X)=i=0n1(Xsi)V(X) = \prod_{i=0}^{n-1} (X - s_i)

The term vanishing is due to the fact that the polynomial is zero only on the set SS, so they are exactly its roots.

If SS really are the roots of P(X)P(X), then PP should be divisible by V(X)V(X), with no remainder.

Q(X)=P(X)/V(X)Q(X) = P(X) / V(X)

And now, we’re gonna have to use a Polynomial Commitment Scheme to continue. You might have to read that article first!

Essentially, what happens is that the prover commits to both P(X)P(X) and Q(X)Q(X), and then, they request an evaluation at some random sis_i. With these evaluations, they can check whether if the following is true:

Q(si).V(si)P(si)=0Q(s_i).V(s_i) - P(s_i) = 0

If it happens to be zero, then we saw that they can conclude with high probability that this polynomial $Q(X)V(X) - P(X)* is exactly zero! Meaning that:

Q(X)V(X)=P(X)Q(X)V(X) = P(X)

Thus, they can be convinced that, effectively, SS is a set of roots of P(X)P(X). If for some reason they aren’t convinced though, they could ask for another set of evaluations of P(s)P(s) and Q(s)Q(s), and run the check again.

Zero test in action
Click to zoom

Similar IOPs exist for the addition check and the product check.

Also worth mentioning, this does not ensure that the polynomial doesn’t have other roots that are not contained in SS. But we’ll not go down that path this time around!

From Interactive to Non-interactive

But wait... Didn’t we say that SNARKs were non-interactive? Then how is it possible that some interactive protocol is a key part of our construction?

Turns out there’s a way to turn interactivity into non-interactivity: the Fiat-Shamir transform. It sounds more daunting than what it is, trust me.

If we think about it, we may wonder “why are these protocols interactive in the first place?” The reason is that on each query, the verifier samples some random number rir_i from some set. And these values cannot be predicted by the prover — they only become available for them when the verifier chooses to execute a query. This unpredictability is sort of an anti-cheat mechanism.

Instead of waiting for the verifier to send random values, we can simulate this randomness using a well-known cryptographic primitive that also has an unpredictable output: hashing functions!

Let’s not focus on the details though — all you need to know is that the Fiat-Shamir heuristic is a powerful transformation, that can be very useful to turn any interactive protocol into a non-interactive one!

After what can only be categorized as torture, we have all the elements we need. Our excruciating journey is almost over — all that remains is putting the cherry on top of this pile of shenanigans.

Dumbledore crying after drinking from the chalice to obtain Slytherin's Locket
Just end my suffering already

Back to Verification

Okay, let’s see. Remember, we’re trying to convince a verifier that we know xx such that:

 xFpm / C(x,w)=0\exists \ x \in {\mathbb{F}_p}^m \ / \ C(x,w) = 0

And we have to convince them of a few things:

  • Correct inputs
  • Correct gate computation
  • Correct wiring

Let’s start with the inputs. The verifier has the public witness, ww. Now, imagine the verifier encodes this witness into a polynomial v(X)v(X), so that:

v(ωj)=wjv(\omega^{-j}) = w_j

In this scenario, it should be true that values v(x)v(x) and P(x)P(x) match for the roots of unity that encode the public witness:

P(ωj)v(wj)=0P(\omega^{-j}) - v(w_j) = 0

So what do we do? Why, of course, a zero test for the polynomial P(X)v(X)P(X) - v(X) on the inputs that encode the witness!

Brain explosion

Ahá! It’s starting to come together, isn’t it?

Checking the Gates

Likewise, recall that the gates should suffice the expression:

α=ω3k\alpha = \omega^{3k} S(α)[P(α)+P(ωα)]+(1S(α))P(α)P(ωα)=P(ω2α)S(\alpha)[P(\alpha) + P(\omega \alpha)] + (1 - S(\alpha))P(\alpha)P(\omega \alpha) = P(\omega^2 \alpha)

So again, what do you do? A zero test on this massive expression, on every gate. Marvelous. Splendid.

For this of course, the verifier will need a commitment to S(X)S(X).

Checking the Wires

Lastly, we need to check the wire constraints. These were encoded into some polynomials W(X)W(X) that cycled through a set of inputs HH’, and we had to check that:

P(x)=P(W(x))xHP(x) = P(W(x)) \forall x \in H'

For each element in HH’. This is not really efficient to do with zero tests (although it can be done), so there’s an alternative way to do it through the use of a product check. Using this casually dropped expression:

L(Y,Z)=xHP(x)+Y.W(x)+ZP(x)+Y.x+ZL(Y, Z) = \prod_{x \in H'} \frac{P(x) + Y.W(x) + Z}{P(x) + Y.x + Z}

Yeah

The gist of this is that the whole expression should equal 11! If you think about it, it makes perfect sense: all the P(x)P(x) values should be the same, and since W(X)W(X) permutates the elements of HH’, and because the product covers all of HH’, we just get:

L(Y,Z)=xHP(x)+Y.W(x)+ZP(x)+Y.x+ZL(Y, Z) = \prod_{x \in H'} \frac{P(x) + Y.W(x) + Z}{P(x) + Y.x + Z} =xHP(x)+Y.W(x)+ZxHP(x)+Y.x+Z=xHP(x)+Y.x+ZxHP(x)+Y.x+Z=1= \frac{\prod_{x \in H'} P(x) + Y.W(x) + Z}{\prod_{x \in H'} P(x) + Y.x + Z} = \frac{\prod_{x \in H'} P(x) + Y.x + Z}{\prod_{x \in H'} P(x) + Y.x + Z} = 1

There’s more to say about this, like why do we need to incorporate YY and ZZ into the mix? But honestly, I think that’s enough math for today.

In short, and to wrap things up: we use some IOPs to verify the constraints in an arithmetic circuit, which were encoded into polynomials!

Summary

Ooof. That was a lot. I know.

Ross from Friends in the high pitch scene
We know you’re not fine, Ross. It’s ok

Still, I find it amazing how all these things come together to form such a sophisticated protocol: we use polynomial interpolation to encode stuff, polynomial commitment schemes to query for information, interactive oracle proofs to create tests, and the Fiat-Shamir heuristic to turn all this mess into a non-interactive proof. Un-be-lie-va-ble.

The final result, Plonk, achieves the generality we were looking for by allowing any circuit to be used. And since this reveals nothing about xx, then this is really a zero-knowledge SNARK (zkSNARK)!

For the sake of completeness, I’ll say that there are some other details to be considered in order to make sure that the protocol is zero-knowledge, specifically around the challenges. Don’t worry too much though — unless, of course, you’re planning on implement Plonk yourself!

For a more interactive look, you can check this excellent video lesson.

And here’s another stab at explaining the protocol, in case you find it easier to follow!

Hopefully, you can now better understand the brief description of Plonk I provided at the very beginning of the article:

In Plonk, we encode the computation of a circuit into a series of polynomials, which represent the constraints imposed by the circuit, for which we can prove statements by using a series of tests, without ever revealing the polynomials themselves— and thus, not leaking the secret inputs.

Given that we can now craft proofs for arbitrary circuits, I’d like to get more practical. So, next time, we’ll be building a couple circuits for some statements, which can then be the inputs for Plonk, turning them into zkSNARKS. See you soon!