Arkana Logoarkana

Cryptography 101 Asides: RSA Explained

CryptographyRSAMathematicsModular Arithmetic
F
Frank Mangone
March 31, 2024 · 6 min read · Medium

This is part of a larger series of articles about cryptography. If this is the first article you come across, I strongly recommend starting from the beginning of the series.

RSA encryption is one of the most widely used encryption algorithms out there — and it relies solely on the additive group of integers modulo nn. It's a great example of how much can be done without the need for elliptic curves, or any other more complex constructions.

This aside is dedicated to explaining the working principles and nuances of the RSA algorithm.

The Subjacent Problem

A lot of cryptographic mechanisms operate on the principle that it's really hard to perform a certain operation unless you know some secret key. In encryption, that's exactly the idea: an encrypted message is undecipherable without knowledge of said secret key.

How is this achieved, then? By crafting a problem that's really difficult to solve, unless you have some extra, secret information. RSA hinges on one of these problems: the factorization of large numbers into their prime factors. This is: given a (large) number nn, expressing it as:

n=p.qn = p.q

Where pp and qq are large prime numbers. If you know pp and qq, calculating nn is trivial — and so is calculating pp if you know nn and qq.

How Big is Big Enough?

Of course, this is all worthless if the problem is easy to solve. For example, if n=7×11=77n = 7×11 = 77, then factorization really only takes an instant. As the prime numbers become larger though, factorization starts taking more and more time.

The recommended size for the prime factors pp and qq is between 10241024 and 20482048 bits. For reference, this is how a 1024-bit prime number looks like:

170154366828665079503315635359566390626153860097410117673698414542663355444709893966571750073322692712277666971313348160841835991041384679700511912064982526249529596585220499141442747333138443745082395711957231040341599508490720584345044145678716964326909852653412051765274781142172235546768485104821112642811

Yeah, it's pretty big.

Estimating how long it would take to find the prime factors of a large number say, 2048-bit long, is hard. Mainly because it hasn't really been done!

But still, it's theorized that it would take anywhere from hundreds to thousands of years, even with powerful hardware. Unless Quantum Computing becomes available, that is — and that's a story for another time.

Preliminaries

How do we use the previous problem to encrypt data? Doing so will require a couple definitions. In particular, we'll need to know about Euler's totient function, and its properties.

Euler's totient function

For any natural number nn, this function (denoted as φ(n)\varphi(n)) counts how many natural numbers lower than nn (not counting 0) are coprime to it.

Being coprime means that the numbers share no prime factors. Another way to say this is that the greatest common denominator of coprime numbers is 11.

For example, 66 and 2525 are coprime.

Notice that for a prime number pp, every number lower than it is a coprime (because since pp is prime, it's only prime factor is pp!). So we can write:

φ(p)=p1\varphi(p) = p - 1

And also, it is true that for n=p.qn = p.q, if pp and qq are prime, then:

φ(n)=(p1)(q1)\varphi(n) = (p - 1)(q - 1)

An Important Property

Why do we care about the totient function? Mostly because of Euler's theorem, which states that if aa and nn are coprime, then:

aφ(n) mod n=1a^{\varphi(n)} \ \textrm{mod} \ n = 1

And if we multiply both sides by aa, this is what we get:

aφ(n)+1 mod n=aa^{\varphi(n) + 1} \ \textrm{mod} \ n = a

Apparently, there's a certain magical number φ(n)+1\varphi(n) + 1 which seems to allow us to recover the original value aa!

Batman thinking
I think I know where this is going...

The Algorithm

If we substitute aa in the previous equation by our message mm, then we have a great primer for an encryption mechanism, because we have an operation on mm that allows us to recover mm!

What we need to do, is split the process into two steps. And to do this, we simply factorize φ(n)+1\varphi(n) + 1.

This is not your usual factorization, though. What really happens is that we choose some random large number ee, provided that e<φ(n)e < \varphi(n), and that ee is coprime with φ(n)\varphi(n). Then, we calculate another number dd such that:

e.d mod φ(n)=1e.d \ \textrm{mod} \ \varphi(n) = 1

This is really just the modular multiplicative inverse of ee, modulo φ(n)\varphi(n).

And calculating dd in such a way ensures that the following is true (which is fairly simple to prove, so I leave that task to you):

me.d mod n=mm^{e.d} \ \textrm{mod} \ n = m

And the interesting part is, with knowledge of ee and nn, it's not easy to calculate φ(n)\varphi(n) because for that, you'd need the prime factors of nn! Which is a really hard problem! For this reason, ee can be made public — and will indeed be the public key in RSA.

The Steps

All that remains is to separate into two steps. We use the public key ee to calculate a ciphertext:

me mod n=cm^e \ \textrm{mod} \ n = c

Without knowledge of dd, this cannot be converted back into mm. So as long as dd is kept secret, then only whoever holds this value can decrypt cc. And because of this, dd is going to be our private key.

To decrypt, we simply do the following:

cdmod n=me.dmod n=mc^d \textrm{mod} \ n = m^{e.d} \textrm{mod} \ n = m

And voilà! The message is decrypted!

Note that you can also digitally sign with this scheme, by using dd to produce a signature, and ee to verify it. Neat, huh?

Summary

And there you have it! That's how RSA encryption works.

Dumbledore slow claps
Dumbledore approves

There's not much more to say about it. Still, there are a couple points we haven't touched on — namely, how to calculate modular multiplicative inverses, or how to generate large prime numbers. We won't cover them here, but of course, these are key components for RSA encryption as well.

Nevertheless, there are some particularly sensitive points in RSA that can become very big pitfalls for the entire cryptosystem, as is fantastically explained here. The theory is really fine and dandy, but implementing this seemingly simple scheme on your own can make it very vulnerable.

Dumbledore slightly concerned
Shocked

This is one of the reasons why RSA is mostly considered obsolete, and has been replaced by Elliptic Curve Cryptography (ECC) for a lot of applications.