
Cryptography 101: Threshold Signatures

Cryptography · Polynomials · Threshold Signature · Interpolation · Mathematics
Frank Mangone
April 30, 2024 · 14 min read · Medium

This is part of a larger series of articles about cryptography. If this is the first article you come across, I strongly recommend starting from the beginning of the series.

In the previous article, we learned about polynomials and saw a couple of their applications.

However, this new piece of cryptography is seemingly disjoint from everything we've learned so far. Nevertheless, there are cool ways to combine them with previous concepts, such as digital signatures.

This article will be solely dedicated to explaining one such example, where we'll combine the usual signing techniques with polynomials, to create an interesting hybrid scheme. We'll be working in the context of elliptic curves, and use G and H to denote group generators for a group 𝔾.

I'll be loosely basing my explanation on this paper, with some adjustments in the hopes of making things easier to navigate.

Still, a friendly disclaimer: threshold signatures are not simple to explain. There are a lot of nuances that we'll need to clarify. So let's first look at a brief summary of what they are and what they do, before really diving into the mathematics behind them.

Just Enough Signers

Threshold signatures are a type of multisignature — meaning that multiple participants are required to sign a message. But this time, it's not necessary for every participant to do so.

Imagine a group of ten people who have admin privileges for an application. To perform some sensitive action, we require at least three approvals.

This way, we don't have to bother all of the admins at the same time; just a subset of available signers will be enough.

This feels like an upgrade to the multisignature scheme we explored previously, where all signers were required to participate. But in reality, achieving this result involves flexing some more cryptographic muscles.

Schematics showing the high level idea of threshold signatures
A threshold signature where at least 3 out of 4 signers are required

We can divide threshold signatures into three major steps or algorithms (just like most signature schemes):

  • Key generation (KeyGen), which is an algorithm that outputs a shared private and public key pair,
  • Signing, where a message is processed along with the private key and a nonce to obtain a pair (r, s),
  • And verification, the step where the signature is validated against the message, and the public key.

When working with threshold signature schemes, the key generation and signing steps, which were rather straightforward in past examples, will be replaced by more complex processes involving communications between the signers. Luckily, verification remains the same — so our focus will lie on the first two steps.

Notice that the idea of requiring a threshold of signers is very reminiscent of the minimum number of points needed to reconstruct a polynomial via interpolation. And in fact, this is at the core of how threshold signatures work.

Also, in order to keep things clear, we must say that the signers or participants are ordered. Each one will have an identifier or index, ranging from 1 to n, the total number of participants.

Our goals are set, and the introduction is over. Shall we get to the good stuff?

GTA San Andreas 'Ah shit, here we go again' meme

Preliminaries

In threshold protocols, sharing information is key. Ultimately, the ability of a set of signers to share information is what will enable them to produce signatures.

We've already seen that sharing a secret can be achieved with Shamir's Secret Sharing (SSS). The idea was to evaluate a polynomial at many different values, and to share the results as points. And with those points, we could interpolate back the original polynomial.
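To make this concrete, here's a minimal Python sketch of that idea: shares are just evaluations of a random polynomial whose independent term is the secret, and Lagrange interpolation at x = 0 recovers it. The prime modulus and all parameters here are toy choices for illustration, not anything you'd use in practice.

```python
# Toy Shamir Secret Sharing over a prime field (illustrative parameters only).
import random

P = 2**61 - 1  # a prime modulus, chosen arbitrarily for this sketch

def make_shares(secret, t, n):
    """Split `secret` into n shares; any t of them are enough to reconstruct it."""
    # Random polynomial of degree t-1 whose independent term is the secret
    coeffs = [secret] + [random.randrange(P) for _ in range(t - 1)]
    def evaluate(x):
        return sum(c * pow(x, m, P) for m, c in enumerate(coeffs)) % P
    return [(i, evaluate(i)) for i in range(1, n + 1)]

def interpolate_at_zero(points):
    """Lagrange interpolation evaluated at x = 0, i.e. the independent term."""
    total = 0
    for xi, yi in points:
        num, den = 1, 1
        for xj, _ in points:
            if xj != xi:
                num = num * (-xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, P - 2, P)) % P
    return total

shares = make_shares(secret=42, t=3, n=5)
print(interpolate_at_zero(shares[:3]))  # any 3 shares recover 42
```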

But there's a catch. How can any receiver check that the values they receive are correctly calculated? Or, in other words, is there a way to prove that the points are effectively related to the original polynomial?

And you may be wondering: why would the values be incorrect? Why would anyone send a wrong value? There are at least two reasons to consider: errors in communication, and malicious activity. It's possible that an attacker could be trying to break our signature model — we can't necessarily expect everyone to behave correctly, and we must take action to mitigate this possible scenario.

To address this concern, we'll require a new tool.

Verifiable Random Secret Sharing

What we do is ask the sharer to make a commitment. This will bind the sharer to their secret information (their polynomial), so that later they just can't produce invalid values.

This idea is what's behind Verifiable Random Secret Sharing (VRSS), sort of a drop-in replacement for SSS. What we want to do is commit to the coefficients of our polynomial — and for this, we'll require not one, but two of them:

f_i(x) = a_{i,0} + a_{i,1}x + a_{i,2}x^2 + ... + a_{i,t}x^t

f_i'(x) = b_{i,0} + b_{i,1}x + b_{i,2}x^2 + ... + b_{i,t}x^t

Why, you ask? Because commitments need to be hiding. We don't want to reveal the coefficients, so their commitments should be blinded. The coefficients of the second polynomial are in fact these blinding factors!

So by using our group generators, each participant i calculates and broadcasts the commitment values C_{i,m}, one for each of the coefficients in the polynomials:

C_{i,m} = [a_{i,m}]G + [b_{i,m}]H

Cool! Now all that remains is sharing. And for this, everyone will need to evaluate their polynomials. Since every participant has an index j, we can choose to evaluate the share for a target participant at their index, so f_i(j).

What this means is that participant j will receive f_i(j) and f_i'(j) from some other participant i.

VRSS diagram

And upon receiving these values, participant j can validate them as follows:

[f_i(j)]G + [f_i'(j)]H = \sum_{m=0}^{t} [j^m]C_{i,m}

That's just about it! We now have a mechanism to share secret information, in a way that's verifiable. With this tool, we can start building our signature scheme.
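If it helps, here's a rough sketch of a VRSS round under one simplifying assumption: instead of elliptic curve points, it uses exponentiation in a tiny multiplicative group, so [x]G becomes G^x mod p and point addition becomes multiplication. The primes, generators, and helper names are all made up for illustration — in particular, a real setup needs generators G and H whose discrete log relation is unknown to everyone.

```python
# Toy VRSS sketch: Pedersen-style commitments in a small multiplicative group
# mod p instead of an elliptic curve (illustrative, insecure parameters).
import random

Q = 1019           # prime order of the subgroup (scalars live mod Q)
P = 2 * Q + 1      # safe prime: squares mod P form a subgroup of order Q
G, H = 4, 9        # two generators of that subgroup (relative dlog assumed unknown)

def commit_polynomials(t):
    """Sample f and f' with t coefficients each, and commit to the coefficients."""
    a = [random.randrange(Q) for _ in range(t)]   # coefficients of f
    b = [random.randrange(Q) for _ in range(t)]   # blinding coefficients of f'
    C = [pow(G, am, P) * pow(H, bm, P) % P for am, bm in zip(a, b)]
    return a, b, C

def evaluate(coeffs, x):
    return sum(c * pow(x, m, Q) for m, c in enumerate(coeffs)) % Q

def verify_share(j, f_j, fp_j, C):
    """Check [f(j)]G + [f'(j)]H == sum_m [j^m]C_m, written multiplicatively."""
    lhs = pow(G, f_j, P) * pow(H, fp_j, P) % P
    rhs = 1
    for m, C_m in enumerate(C):
        rhs = rhs * pow(C_m, pow(j, m, Q), P) % P
    return lhs == rhs

a, b, C = commit_polynomials(t=3)
j = 5
print(verify_share(j, evaluate(a, j), evaluate(b, j), C))      # True
print(verify_share(j, evaluate(a, j) + 1, evaluate(b, j), C))  # False: tampered share
```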

Key Generation

Since there are multiple parties involved in threshold signatures, each participant will hold an integer d_i as their share (or piece) of a private key.

However, this is not a randomly chosen integer, as per usual. Instead, the participants engage in a process where they interact with each other, ultimately producing their share of the private key. These kinds of protocols are called Distributed Key Generation (DKG) algorithms.

And we can use VRSS to build our algorithm. How convenient!

Key generation procedure visualization
Construction of the share of the private key

The idea is that each participant will receive a share from each of their peers, and they will use these verified values to calculate their private key share:

d_i = \sum_{j=1}^{m} f_j(i)

It's possible that some values do not pass verification; some parties may be disqualified in this case. This is why VRSS is important.

Still, we'll go with the happy path to keep things relatively simple.

The output of the DKG is a piece of a shared private key, d. None of the parties involved in this protocol know this value — only its pieces.

Interpolation of key shares
If we were to interpolate the private key shares, we'd get the shared private key

Finally, we need a public key. For this, each participant broadcasts a value derived from their secret independent (or zeroth) coefficient, a_{i,0}. This secret cannot be disclosed as such, and therefore, it is transmitted as a public key share:

Q_i = [a_{i,0}]G

I think it's fairly odd to see the public key share being calculated like this. I bet you expected to see d_i in there!

There's good reason for it not being used, though. We'll return to this statement later, because we'll need to define a couple things in order to understand what's really happening.

Spiderman making an OK sign

Once all the shares are public, then every participant can calculate the global public key independently:

Q = \sum_{i=1}^{m} Q_i
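Putting the pieces together, a happy-path DKG round could look something like the following sketch — same toy group as before, no verification or disqualification logic, and all names are illustrative:

```python
# Happy-path DKG sketch in the toy multiplicative group: every participant
# shares a random polynomial, and key shares are sums of the shares received.
import random

Q = 1019           # subgroup order (scalars)
P = 2 * Q + 1      # safe prime
G = 4              # generator of the order-Q subgroup

def evaluate(coeffs, x):
    return sum(c * pow(x, m, Q) for m, c in enumerate(coeffs)) % Q

n, t = 5, 3
polys = [[random.randrange(Q) for _ in range(t)] for _ in range(n)]  # f_1 ... f_n

# d_i = sum_j f_j(i): each participant adds up the shares received from everyone
d = {i: sum(evaluate(f, i) for f in polys) % Q for i in range(1, n + 1)}

# Public key shares Q_i = [a_{i,0}]G, aggregated into the global public key
Q_shares = [pow(G, f[0], P) for f in polys]
public_key = 1
for Q_i in Q_shares:
    public_key = public_key * Q_i % P

# Sanity check: the (never reconstructed) private key is the sum of constant terms
shared_d = sum(f[0] for f in polys) % Q
print(pow(G, shared_d, P) == public_key)   # True
```

Interpolating any t of the d_i over the same field would likewise recover the shared private key — which, of course, nobody ever does in the real protocol.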

To wrap things up, here's a brief summary for this step:

Key generation involves parties communicating with each other, generating pieces of a shared private key. No single player knows the shared private key. A public key that is associated with the shared secret is also calculated.

With all the keys in place, all that remains is to sign.

Signing a Message

The first thing that we'll need is a message to sign. This is not trivial though, because everyone needs to agree on a message. We'll not cover how this happens — let's just assume everybody knows the message M being signed.

In ECDSA, a signer would typically choose a random nonce k, and calculate a challenge R = [k]G accordingly, producing a signature like this:

s = k^{-1}(H(M) + d \cdot r) \ \textrm{mod} \ n

But as we've already seen, this is not how threshold cryptography tends to operate. Instead, a group of t signers will communicate with each other in order to produce a signature. And the first thing they'll need is a nonce.

Luckily, we already have a tool to generate a distributed value: DKG! Let's just say that signers execute a round of DKG, each obtaining a share k_i, along with an associated challenge share R_i. These combine into the overall challenge:

R = \sum_{i=1}^{m} R_i

For the computation of the signature, everyone will use the x-coordinate of R, which we'll denote r.

Building the Signature

As you can probably guess by now, signature generation is also done in shares. And again, the idea is that only when a certain number of shares is available, we'll be able to produce a valid signature through the aggregation of these independently calculated parts.

Our requirement is the following: the signature shares s_i should interpolate to the final signature s — which should pass verification when using the public key Q. The most natural thing to do here is to adapt the expression for s so that it uses shares of its components instead:

s_i = k_i^{-1}H(M) + k_i^{-1}d_i \cdot r \ \textrm{mod} \ n

But does this make sense? Here, we have an addition, multiplications, and even modular inverses. It doesn't feel obvious to assume that this will work just like that.

It seems only fair that we examine this expression and check that it works properly. And really, it's not as complicated as you'd imagine. Let's take it slow, one step at a time.

'Lie down, try not to cry, cry a lot' meme
Trust me, it's not that bad!

Interpolating Additions

To start us off, let's say that we have two polynomials a(x) and b(x). If we evaluate them at different values i, we get sets of points of the form (i, a(i)) and (i, b(i)). For convenience, let's just denote them a_i and b_i:

a_i = (i, a(i)) \quad b_i = (i, b(i))

We know that any subset of t of these points allows us to interpolate back the original polynomial. And if we define the independent term of a(x) to be a, we could write that:

a \leftarrow \textrm{interpolate}(a_1, ..., a_t)

As a reminder, in the context of secret sharing, we're usually interested in the independent term. This is why when we say that we interpolate some values, we may refer to the output as just this independent or zeroth coefficient, and not to the entire polynomial. But really, the full output is the entire polynomial!

Similarly, let's assume b(x) has independent term b, and so:

b \leftarrow \textrm{interpolate}(b_1, ..., b_t)

Now, what happens if we add the polynomials a(x) and b(x)?

Diagram representing the addition of two polynomials

We can add term by term, and we end up with a polynomial with the same degree as the originals (t - 1), but where the independent term is g = a + b. Also, since g(x) = a(x) + b(x), all the points that interpolate to g, which are (i, g(i)), can be calculated as a_i + b_i. And so:

g \leftarrow \textrm{interpolate}(a_1 + b_1, ..., a_t + b_t)

And that's awesome! This means that we can essentially add shares of a polynomial, and when interpolating, we'll get as the result the sum of the shared secrets. Neat!
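This is easy to check numerically. Continuing with the toy field and the make_shares / interpolate_at_zero helpers from the first sketch:

```python
# Adding shares point-wise: the sums interpolate to the sum of the secrets.
# Reuses P, make_shares and interpolate_at_zero from the first sketch.
t, n = 3, 5
a_shares = make_shares(100, t, n)
b_shares = make_shares(200, t, n)

sum_shares = [(i, (ai + bi) % P) for (i, ai), (_, bi) in zip(a_shares, b_shares)]
print(interpolate_at_zero(sum_shares[:t]))   # 300 = 100 + 200
```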

Now we can analyze the public key calculation. Remember that the share d_i is calculated as the sum of the f_j(i) values.

Because of this, d_i is essentially a share of a summation of polynomials, whose independent term will be the sum of all the a_{i,0} terms. Which means the result of interpolating all the d_i will yield that summation!

Diagram with the derivation of the public key

And that's the reason why the public key is calculated the way it is. Everything checks out!

\sum_{i=1}^{m} a_{i,0} \leftarrow \textrm{interpolate}(d_1, ..., d_m)

Q = \left[ \sum_{i=1}^{m} a_{i,0} \right ]G = \sum_{i=1}^{m} [a_{i,0}]G

Interpolating Products

Products are slightly trickier. When multiplying a(x) and b(x), the resulting polynomial h(x) will have double the degree. And because of this, we need twice as many points for interpolation:

h \leftarrow \textrm{interpolate}(a_1 \cdot b_1, ..., a_{2t-1} \cdot b_{2t-1})

And this is not really optimal, because now we need more values to interpolate, which means more communication between peers.

Regardless of this inconvenience, the good news is that this behaves the way we expect it to: when we multiply h(x) = a(x)b(x), the independent terms are also multiplied, and our values a_i·b_i will also interpolate to h = ab!

Also worth mentioning: when multiplying shares by a constant, the resulting interpolation will also be multiplied by it:

Multiplication by constant

So if we take our shares to be (i, k·a_i), then interpolation will yield k·a. Pretty straightforward, right?

Mr Bubz meme
Mr Bubz ain't that happy about it
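Here's the same kind of numerical check for products and constants, again reusing the helpers from the earlier toy sketches:

```python
# Multiplying shares point-wise doubles the degree (needs 2t-1 points),
# while multiplying by a constant costs nothing extra. Same helpers as before.
t, n = 3, 5
a_shares = make_shares(7, t, n)
b_shares = make_shares(11, t, n)

prod_shares = [(i, ai * bi % P) for (i, ai), (_, bi) in zip(a_shares, b_shares)]
print(interpolate_at_zero(prod_shares[:2 * t - 1]))   # 77, using 2t-1 = 5 shares
print(interpolate_at_zero(prod_shares[:t]))           # only t shares: (almost surely) wrong

const_shares = [(i, 3 * ai % P) for i, ai in a_shares]
print(interpolate_at_zero(const_shares[:t]))          # 21 = 3 * 7
```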

Alright, we only have one more case to analyze. The pain and suffering will end promptly, I promise.

Interpolating Inverses

Really, all we're missing is handling that wretched modular inverse. What we want is to produce values k_i^{-1} that, when interpolated, yield k^{-1}. This is going to take a couple of steps. Yeah.

  • First off, everyone will run a round of VRSS to produce shares α_i. Of course, these shares interpolate like this:
\alpha \leftarrow \textrm{interpolate}(\alpha_1, ..., \alpha_m)
  • Next, every participant will calculate and broadcast:
\mu_i = \alpha_i \cdot k_i
  • Since μ_i is the result of a multiplication, upon receiving 2t-1 shares, anyone can interpolate this value:
\mu \leftarrow \textrm{interpolate}(\mu_1, ..., \mu_{2t-1})
  • Finally, each participant calculates k_i^{-1} in this fashion:
k_i^{-1} = \mu^{-1} \cdot \alpha_i

How does this wizardry work, you ask? Well, consider this: μ^{-1} acts as a constant term when interpolating the k_i^{-1} values. And because of this, we end up with:

\mu^{-1} \cdot \alpha \leftarrow \textrm{interpolate}({k_1}^{-1}, ..., {k_{2t-1}}^{-1})

\mu^{-1} \cdot \alpha = k^{-1} \cdot \alpha^{-1} \cdot \alpha = k^{-1}

And just like magic, we have built values that interpolate to the inverse of k!

Hagrid from Harry Potter
You're a wizard now, Harry
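A quick numerical walkthrough of the trick, once more with the toy helpers from before. Here α and k are shared directly with make_shares just to keep the sketch short — in the real protocol they'd come from VRSS and DKG rounds:

```python
# The inverse trick, step by step, over the toy field from the first sketch.
t, n = 3, 5
k = 12345                                 # the shared nonce (normally from a DKG)
alpha = 67890                             # fresh shared randomness (normally from VRSS)
k_shares = make_shares(k, t, n)
alpha_shares = make_shares(alpha, t, n)

# Everyone broadcasts mu_i = alpha_i * k_i; 2t-1 of them interpolate to mu = alpha * k
mu_shares = [(i, ai * ki % P) for (i, ai), (_, ki) in zip(alpha_shares, k_shares)]
mu = interpolate_at_zero(mu_shares[:2 * t - 1])

# Each participant locally computes their inverse share k_i^{-1} = mu^{-1} * alpha_i
mu_inv = pow(mu, P - 2, P)
kinv_shares = [(i, mu_inv * ai % P) for i, ai in alpha_shares]

# These shares interpolate to k^{-1}
print(interpolate_at_zero(kinv_shares[:t]) == pow(k, P - 2, P))   # True
```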

That's all we're going to need! Let's check back on our signature calculation with our freshly-found conclusions.

Back to Signing

If we could reconstruct every shared secret, then the signature calculation would happen as in standard ECDSA:

s = k^{-1}H(M) + k^{-1}d \cdot r \ \textrm{mod} \ n

But in practice, we don't want that to happen — and we only have shares. And so we asked ourselves whether this:

s_i = {k_i}^{-1}H(M) + {k_i}^{-1}d_i \cdot r \ \textrm{mod} \ n

also interpolated to s. And the answer is a resounding yes, because we're only dealing with additions, products, and inverses — and we already know how these behave.

Perhaps the only problem here is that since we're dealing with a product of shares (the k_i^{-1}·d_i·r term), we'll require around 3t-2 shares to interpolate. But leaving that aside, we're sure that interpolating the s_i values will yield the expected value of s!

Different protocols may make use of various techniques to try and mitigate the need for extra points for interpolation — and ideally, we'd want to keep that number as close to tt as possible. Also, the fewer communication steps are needed, the better.

To wrap things up, when each participant calculates their share s_i, they simply broadcast it. And when enough shares are available, anyone can interpolate, produce, and output s.
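As a final sanity check, here's the whole signing formula run over the toy field. The hash, r, private key, and nonce are just placeholder field elements, and the k_i^{-1} shares are generated directly on a degree t-1 polynomial (as if the inverse trick had already run), so 2t-1 shares happen to be enough in this simplified setup:

```python
# End-to-end check of the signature-share formula over the toy field.
t, n = 3, 7
HM, r = 555, 777                                   # stand-ins for H(M) and r
d, k = 4242, 9999                                  # shared private key and nonce
d_shares = make_shares(d, t, n)
kinv_shares = make_shares(pow(k, P - 2, P), t, n)  # as if the inverse trick already ran

# s_i = k_i^{-1} H(M) + k_i^{-1} d_i r
s_shares = [(i, (ki * HM + ki * di * r) % P)
            for (i, ki), (_, di) in zip(kinv_shares, d_shares)]

# The k_i^{-1} * d_i product doubles the degree, so more than t shares are needed here
s = interpolate_at_zero(s_shares[:2 * t - 1])
print(s == pow(k, P - 2, P) * (HM + d * r) % P)    # True
```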

And there you have it! Simple, right?

Obi-Wan Kenobi, confused

I'm joking, of course — this is anything but simple. But the general idea is what's really the key takeaway:

During signing, parties again communicate with each other, generating pieces of a shared signature. When enough pieces are available, the final signature can be constructed; with fewer pieces than needed, it's simply not possible.

Verification happens as usual, because the output of the signature is simply the pair (r, s).

Summary

I reckon this is the most technically-packed article I've written so far. I tried my best to keep it simple, but there are some things that we just can't avoid explaining. At the very least, I hope this sheds some light on some aspects that, from my experience, are not usually explained in detail.

🔥 Important: There's actually a pretty big vulnerability in the process I described, where private key shares are leaked when sharing s_i.

This is addressed in the paper I used as a guide, and the solution is actually quite simple. So please, don't go using this article to construct your threshold signatures — and maybe refer to the actual paper instead!

Designing these kinds of Multi Party Computation protocols, where there's communication between parties, requires considering that malicious activity may exist. So in general, these protocols are filled with disqualification rounds, proofs of correctness, maybe even some encryption, and other fun stuff. Really, they are a consolidation of many different techniques, and tend to be complex.

Multi Party Computation is, in general, quite complex.

This is really the simplest scheme I could find in terms of what tools are needed, and is presented mostly in the hopes of illustrating the main components of these techniques. And this means that the more tools we can accrue, the easier it will be to craft more complex schemes.

Having said that, our journey will continue by presenting yet another extremely powerful tool, that will unlock new types of functionality: pairings. See you on the next one!