1. Introduction
Shor’s algorithm [1] for prime factorization is a major breakthrough in quantum computing: it allows breaking RSA-based [2] as well as elliptic curve-based cryptosystems exponentially faster than the best-known classical algorithms. Thus, it is a severe threat to the established security infrastructure. However, at the time of its publication, the quantum resources required to run Shor’s algorithm (and thus to break today’s security) were expected to become available only in some decades (if at all). This created awareness of the problem itself but positioned it as a problem to be dealt with in the far future. These days it seems likely that the required quantum resources will be available in the next few years. Thus, today’s security infrastructure is in jeopardy, forcing policy makers and industry to urgently focus on this problem.
Prime factorization can be reduced to finding the period of the modular exponential function (section 3.1). Quantum computers can quickly determine global properties of functions, such as their periods [3], which is exploited in Shor’s algorithm by using the quantum Fourier transformation (section 3.2). Effectively, this way discrete logarithms are computed (section 2.5). Computing discrete logarithms also allows the computation of private keys used in elliptic curve cryptography (section 2.6). Thus, a security scheme is needed that cannot be broken by computing discrete logarithms and that has not been cracked (yet).
Lattice-based cryptography is such a scheme. A lattice is the set of all integer combinations of linearly independent vectors (section 4.1) or, equivalently, the set of all integer solutions of systems of linear equations. While linear equations can be solved efficiently with classical methods, adding a small random error term to the equation system results in a hard problem (section 4.8) referred to as Learning with Errors. It is this problem that is the underpinning of algebraic lattice cryptography (chapter 5). Furthermore, instead of solving the Learning with Errors problem in vector spaces, modules over rings of residue classes of polynomials are used (section 5.3).
Based on these structures, an asymmetric encryption/decryption mechanism can be defined (section 5.4) as well as a means for signatures (section 5.5). These two mechanisms are recognized to be quantum-safe, i.e. neither classical algorithms nor quantum algorithms are known that can break these security schemes. Because of this, they are standardized as the basis for a future quantum-safe infrastructure (section 6.4).
The importance of becoming quantum-safe has been recognized by several government organizations, which initiated efforts to move today’s infrastructure to a post-quantum secure one (section 6.3). This way, vital data like health data, military secrets, etc. should be kept confidential. To protect confidential data, companies must become active too, and the time needed for corresponding efforts can be estimated by Mosca’s inequality (section 6.2). The corresponding transition is backed up by open-source software implementing the standardized post-quantum security algorithms (chapter 7).
1.1. Structure of the article
The article is structured as follows: in the first part, we give an overview of fundamental concepts and algorithms underlying today’s security infrastructure. The second part roughly sketches a few actions of policy makers. While the first part (sections 2 to 5) contains timeless information, the much shorter second part (sections 6 and 7) is somewhat volatile, presenting information that might be outdated in a few years and that has to be digested by the reader with corresponding care.
2. Classical asymmetric cryptography
In this section, we first recall a few definitions and facts from number theory related to modular arithmetic [4]. Next, we provide the basics of RSA and elliptic curves, which are the underpinnings of today’s asymmetric cryptography [5]. In particular, their relation to discrete logarithms is presented.
2.1. Modular arithmetic
A couple of prominent algorithms in cryptography like RSA (section 2.4) or Diffie–Hellman (section 2.6) are based on modular arithmetic. In this section, we sketch its basics.
For x ∈ ℝ let ⌊x⌋ denote the greatest integer less than or equal to x (also called the floor of x). Then, for a ∈ ℤ and n ∈ ℕ (i.e. especially n ≠ 0), the remainder of a divided by n is given by a − ⌊a/n⌋ ⋅ n, and this remainder is denoted by “a mod n” (pronounced “a modulo n”). This is captured by the following
Definition 1: The function
mod: ℤ × ℕ+ → ℕ0, (a, n) ↦ a − ⌊a/n⌋ ⋅ n (1)
is called modulo function.
The definition directly implies the following note:
Note 1: Let a ∈ ℤ and n ∈ ℕ+. Then:
r = a mod n ⇔∃ k ∈ ℤ : a = kn + r ∧ 0 ≤ r < n (2)
In case a mod n = b mod n, a and b give the same remainder when being divided by n. This is an important relation between the two numbers a and b that deserves a separate definition:
Definition 2: Let a, b ∈ℤ and n ∈ℕ+. Then:
a ≡ b (mod n) :⇔ a mod n = b mod n (3)
In this case, a and b are called congruent modulo n.
Congruence “≡” is an equivalence relation, i.e. it is reflexive, symmetric, and transitive [4]. Note that while “a ≡ b (mod n)” is a Boolean statement (i.e. it is true or false for given a, b, n), “a mod n” is a natural number (i.e. the remainder of a divided by n).
Definition 3: Let m ∈ℤ and n ∈ℕ+. The notation n | m is defined as follows:
n | m: ⇔∃ k ∈ℤ : m = kn (4)
n is called a divisor of m (aka n divides m aka n is a factor of m). The notation n ∤ m means that n does not divide m. In case n | m with n ≠ 1 and n ≠ m, n is called a proper divisor of m.
The next note follows immediately from the definitions before:
Note 2:
a ≡ b (mod n) ⇔ n | (a − b) (5)
Proving the congruence of two numbers (especially of large numbers) can be very tedious. The following often simplifies related computations:
Principle (Reduce First, Then Compute): Assume that (a1a2 + a3a4) mod n has to be computed, and let ai ≡ bi mod n for 1 ≤ i ≤ 4. Then:
(a1a2 + a3a4) mod n = (b1b2 + b3b4) mod n
The proof follows directly from the definitions.
This principle allows us to reduce an arithmetic expression that might be cumbersome to compute (in our case a1a2 + a3a4) first and then use an arithmetic expression that might be easier to compute (in our case b1b2 + b3b4) to determine the remainder divided by n.
Example [4]: (39 ⋅ 99 + 95 ⋅ 64) mod 19 should be computed. Now, it is 39 ≡ 1 mod 19, 99 ≡ 4 mod 19, 95 ≡ 0 mod 19, and 64 ≡ 7 mod 19. Thus, we can substitute 39 by 1, 99 by 4, 95 by 0, and 64 by 7 and obtain
(39 ⋅ 99 + 95 ⋅ 64) ≡ (1 ⋅ 4 + 0 ⋅ 7) ≡ 4 mod 19 = 4
which is much simpler than computing (39 ⋅ 99 + 95 ⋅ 64) first and then finding the remainder divided by 19.
Another example is computing 127⁴ mod 5. It is 127 ≡ 2 mod 5. Thus, it is
127⁴ ≡ 127 ⋅ 127 ⋅ 127 ⋅ 127 ≡ 2 ⋅ 2 ⋅ 2 ⋅ 2 = 16 ≡ 1 mod 5
In our final example, we want to compute 129537 mod 9. This can be done by using the decimal representation of 129537 and observing 10 ≡ 1 mod 9:
129537 = 1 ⋅ 10⁵ + 2 ⋅ 10⁴ + 9 ⋅ 10³ + 5 ⋅ 10² + 3 ⋅ 10¹ + 7 ⋅ 10⁰
≡ 1 ⋅ 1⁵ + 2 ⋅ 1⁴ + 9 ⋅ 1³ + 5 ⋅ 1² + 3 ⋅ 1¹ + 7 ⋅ 1⁰
≡ 1 + 2 + 9 + 5 + 3 + 7
≡ 27
≡ 0 mod 9
= 0
Thus, the principle “reduce first then compute” is a fundamental principle in modular arithmetic to make computations easier.
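The principle can be checked directly in a few lines of Python, using the numbers of the examples above:

```python
# "Reduce first, then compute": reducing operands modulo n before evaluating
# an expression gives the same remainder as evaluating first and reducing last.
n = 19
full = (39 * 99 + 95 * 64) % n      # evaluate first, then reduce
reduced = (1 * 4 + 0 * 7) % n       # operands already reduced mod 19
print(full, reduced)                # 4 4

print(127 ** 4 % 5)                 # 1, as in the second example
print(129537 % 9)                   # 0, as in the digit-sum example
```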
2.2. Diophantine equations
Several algorithms in cryptography require solving linear Diophantine equations, i.e. equations of the form
ax + by = c (6)
with given a, b, c ∈ ℤ and where x, y ∈ ℤ (i.e. the solution of the equation) have to be determined. Deciding whether or not a linear Diophantine equation has a solution requires determining the greatest common divisors.
Definition 4: n ∈ ℕ+ is called a common divisor of a, b ∈ ℤ iff both, n | a and n | b. The greatest common divisor of a and b is denoted by gcd(a, b). a and b are called coprime iff gcd(a, b) = 1.
The next Theorem provides a criterion for the solvability of a linear Diophantine equation.
Theorem 1: Let a, b, c ∈ ℕ. Then: (∃ x, y ∈ ℤ : ax + by = c) ⇔ gcd(a, b) ∣ c
Proof: [4].
The famous Euclidean Algorithm that is the basis for computing solutions of linear Diophantine equations is given by
Theorem 2: Let a, b ∈ ℕ. Define r0 := a, r1 := b and ri+1 := ri−1 mod ri for i ≥ 1. Then there exists an N ∈ ℕ such that rN = 0 and rN−1 = gcd(a, b).
Proof: [4].
The (extended) Euclidean Algorithm is often applied by using a certain schema when writing down the computation steps (see [4] section 3.2 for the details).
As an example, we apply this schema to compute the greatest common divisor gcd(63, 17) of the numbers 63 and 17 (the numbering of the equations is needed in the next example):
63 = 3 ⋅ 17 + 12 (d)
17 = 1 ⋅ 12 + 5 (c)
12 = 2 ⋅ 5 + 2 (b)
5 = 2 ⋅ 2 + 1 (a)
2 = 2 ⋅ 1 + 0
Thus, gcd(63,17) = 1, i.e. 63 and 17 are coprime.
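The remainder sequence of Theorem 2 translates into a short Python function; a minimal sketch reproducing the computation above:

```python
def euclid(a, b):
    """Greatest common divisor via the remainder sequence r_{i+1} = r_{i-1} mod r_i."""
    while b != 0:
        a, b = b, a % b   # (r_{i-1}, r_i) -> (r_i, r_{i-1} mod r_i)
    return a

print(euclid(63, 17))     # 1, i.e. 63 and 17 are coprime
```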
Within Theorem 1, the direction “⇐” is called Bézout’s Identity. Because of its importance, it is stated as a separate lemma.
Lemma 1: Let a, b ∈ ℕ and let g = gcd(a, b). There exist x, y ∈ ℤ such that
g = ax + by (7)
The factors x and y are called Bézout coefficients for (a, b) and can be determined based on the Euclidean algorithm. For this purpose, the equations of the computations performed according to the Euclidean algorithm are applied “from bottom to top”, beginning with the last equation, i.e. the one containing the gcd (see [4] section 3.2 for the details).
As an example, we use the computation of gcd(63, 17) before: Using equation (a) results in
1 = 5 − 2 ⋅ 2
Based on equation (b), it is 2 = 12 − 2 ⋅ 5, which is substituted in the equation before (and so on):
1 = 5 − 2 ⋅ (12 − 2 ⋅ 5) = 5 ⋅ 5 − 2 ⋅ 12
= 5 ⋅ (17 − 1 ⋅ 12) − 2 ⋅ 12 = 5 ⋅ 17 − 7 ⋅ 12 (using (c))
= 5 ⋅ 17 − 7 ⋅ (63 − 3 ⋅ 17) = 26 ⋅ 17 − 7 ⋅ 63 (using (d))
Thus, the Bézout coefficients for (63, 17) are (−7, 26).
This procedure can be used to solve an equation important for RSA (section 2.4 (5)) following the proof of
Note 3: Let a, m ∈ ℕ be coprime, i.e. gcd(a, m) = 1. Then, there exists an x ∈ ℤ such that ax ≡ 1 mod m.
Proof: Because of lemma 1, there exist x, y ∈ ℤ such that 1 = ax + my. This implies that ax − 1 = − my. According to definition 3, this means m | (ax − 1). With note 2 we get ax ≡ 1 mod m. ∎
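Both the Bézout coefficients and the modular inverse of Note 3 can be computed with the extended Euclidean algorithm; a sketch:

```python
def egcd(a, b):
    """Return (g, x, y) with g = gcd(a, b) = a*x + b*y (Bézout's identity)."""
    if b == 0:
        return a, 1, 0
    g, x, y = egcd(b, a % b)
    return g, y, x - (a // b) * y

print(egcd(63, 17))       # (1, -7, 26), the Bézout coefficients from the example

# Modular inverse as in Note 3: if gcd(a, m) = 1, then a * x ≡ 1 (mod m)
a, m = 157, 2668
inv = egcd(a, m)[1] % m
print(inv, (a * inv) % m) # 17 1
```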
2.3. Factorization
One of the key algorithms in cryptography is based on the factorization of numbers (see section 2.4). Because of this, we recall the basics of factorization. Factorization means determining the decomposition of an integer into prime numbers.
Definition 5: A number p ∈ ℕ with p > 1 and no proper divisor is called a prime number. The set of all prime numbers is denoted by ℙ.
Every natural number greater than 1 can be represented as a product of prime numbers, and this representation is unique up to the order of the prime numbers in the product: this is the famous Fundamental Theorem of Arithmetic. Determining the prime numbers of this representation of a number is referred to as its factorization.
Theorem 3: Let n ∈ ℕ with n > 1. There exist p1 ,…, pk ∈ ℙ (pi ≠ pj for i ≠ j) and n1 ,…, nk ∈ ℕ such that
n = p1^n1 ⋅ ⋯ ⋅ pk^nk (8)
There are many known classical algorithms to compute the factorization of a given number. None of these known algorithms is efficient in the sense that it can compute the factorization in polynomial time. But it has not been proven that a polynomial algorithm cannot exist, nor has it been proven that such an algorithm does exist. Thus, it is open whether or not factorization can be done efficiently. In what follows, we will make use of the Lemma of Euclid.
Lemma 2: Let p ∈ ℙ and a, b ∈ ℕ. If p | ab then p | a or p | b.
Assume n ∈ ℕ with n | ab and let n = p1^n1 ⋅ ⋯ ⋅ pk^nk be the factorization of n. Because n = m ⋅ pi for a suitable m ∈ ℕ, it is pi | ab for each 1 ≤ i ≤ k. According to Euclid’s lemma before, it is pi | a or pi | b for 1 ≤ i ≤ k. Thus, each such pi is a common divisor of n as well as of a or b. This proves the following corollary:
Corollary 1: Let n ∈ ℕ with n | ab. Then: ∃ m ∈ ℕ : m | n ∧ (m | a ∨ m | b) .
Especially, if n is not prime, m is a proper divisor of n.
Note, that in general n | ab does not imply that n | a or n | b. For example: 6 | 12 = 3 ⋅ 4 but neither 6 | 3 nor 6 | 4; but of course, 3 is a common divisor of 6 and 3, and 2 is a common divisor of 6 and 4.
2.4. Rivest–Shamir–Adleman cryptosystem
Rivest–Shamir–Adleman (RSA) described a public-key cryptosystem that is based on the assumed hardness of factorization of natural numbers [2]. Public and private keys are determined as follows:
- Choose two large prime numbers p, q ∈ ℙ
- Compute n = pq
- Compute φ(n) = (p − 1)(q − 1)
- Choose a small g ∈ ℕ, with φ(n) and g being coprime (i.e. gcd(φ(n), g) = 1)
- Determine the solution d of the equation dg ≡ 1 mod φ(n) (see Note 3)
- Destroy p, q,φ(n)
(d, n) is the public key, and g is the private key.
The following examples are from [2]. Choose the prime numbers p = 47 and q = 59 (which are obviously not “large” but fine for the example). Then n = p ⋅ q = 2773 and φ(n) = 46 ⋅ 58 = 2668. The private key is chosen as g = 157, which is valid because gcd(2668, 157) = 1, i.e. g and φ(n) are coprime. Next, the equation d ⋅ 157 ≡ 1 mod 2668 is solved according to Note 3 by computing the Bézout coefficients for (2668, 157) as exemplified in the example after Lemma 1: the solution is d = 17. Thus, (17, 2773) is the public key.
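This key generation can be replayed in a few lines of Python; the three-argument pow with exponent −1 (Python 3.8+) computes exactly the modular inverse of Note 3:

```python
p, q = 47, 59                 # the toy primes from the example
n = p * q                     # modulus
phi = (p - 1) * (q - 1)       # (p-1)(q-1)
g = 157                       # private key; gcd(phi, g) = 1
d = pow(g, -1, phi)           # solves d * g ≡ 1 (mod phi)
print(n, phi, d)              # 2773 2668 17
```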
A message μ is encrypted as follows:
1. μ is mapped to a natural number m:
   a. E.g. each letter is substituted by a two-digit number like Blank=00, A=01, B=02, etc., and the numbers corresponding to the letters of the message are concatenated.
   b. The resulting number m must be less than n. If m ≥ n, the digits of m are split into blocks m1 ,…, mk such that mi < n.
2. Each block is encrypted as ci = mi^d mod n
3. The encrypted message is the concatenation c = c1⋯ck
Example: Let μ = ITS ALL GREEK TO ME. By mapping this message to a natural number as sketched in (1a) and considering (1b) we get m = 0920 1900 0112 1200 0718 0505 1100 2015 0013 0500, i.e. m has been split into blocks of digits such that the corresponding numbers are less than n = 2773. The first block corresponds to the number 920; thus, according to (2) the encryption of this block results in 920^17 mod 2773 = 948; the second block corresponds to 1900, i.e. 1900^17 mod 2773 = 2342, etc. Finally, the encrypted message is according to (3) the concatenation c1⋯ck of the encrypted blocks.
Decryption is achieved by
4. decrypting each block ci of the encrypted message c separately
5. by computing mi = (ci)^g mod n
6. and concatenating the decrypted blocks resulting in the decrypted message m = m1⋯mk
Example: In the encrypted message from above, it is c1 = 948, i.e. according to (5) it is m1 = 948^157 mod 2773 = 920. With (1a) 920 corresponds to IT (i.e. 09 = I, 20 = T), and so on. Concatenating the decrypted blocks results in the original message.
Cracking a private key means to compute g. While g is secret, i.e. unknown, d as part of the public key is known. To compute g, the equation dg ≡ 1 mod φ(n) must be solved which requires in addition to d to know φ(n). This, in turn, means that φ(n) = (p − 1) ⋅ (q − 1) must be known. Because the RSA procedure is well-known, it is further known that n is the product of two prime numbers. Consequently, n must “just” be factorized to get φ(n) and then solve dg ≡ 1 mod φ(n). Since factorization is assumed to be hard, RSA is considered hard to crack, i.e. RSA is considered to be secure (ignoring Shor’s algorithm for now).
2.5. Modular exponential function and discrete logarithm
Factorization is closely related to the following function and its inverse, respectively:
Definition 6: For a given n ∈ ℕ choose an arbitrary number a ∈ ℕ with 0 < a < n.
The function
expa : ℕ0 → ℕ0 with expa(x) = a^x mod n (9)
is called modular exponential function with basis a (and modulo n).
As an example, consider the first part of the value table of the modular exponential function with n = 5 and basis 2, i.e. exp2(x) = 2^x mod 5:
x       | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | …
exp2(x) | 1 | 2 | 4 | 3 | 1 | 2 | 4 | 3 | …
As can be seen, it is exp2(x + 4) = exp2(x). I.e. the function exp2( ⋅ ) with n = 5 seems to be periodic with period 4. These terms are defined next.
Definition 7: A function f : ℕ0 → ℕ0 is called periodic :⇔ ∃ t ∈ ℕ ∀ x ∈ ℕ0 :
f (x + t) = f (x). The smallest such number t is called the period of f.
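The value table of exp2 modulo 5 and its period can be reproduced directly:

```python
n, a = 5, 2
table = [pow(a, x, n) for x in range(12)]   # value table of exp_a modulo n
print(table)                                # [1, 2, 4, 3, 1, 2, 4, 3, 1, 2, 4, 3]

# the period is the smallest t > 0 with a^t ≡ 1 (mod n)
period = next(t for t in range(1, n + 1) if pow(a, t, n) == 1)
print(period)                               # 4
```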
The following is a well-known fact (a corollary of Fermat’s Little Theorem) about the periodicity of the modular exponential function:
Lemma 3: The modular exponential function expa(x) = a^x mod n is periodic if and only if gcd(n, a) = 1.
Let gcd(n, a) = 1; thus, expa(x) is periodic with period p ∈ ℕ. It is then expa(0) = expa(0 + p) = expa(p), which in turn is equivalent to a^p mod n = a^0 mod n = 1 mod n. By Definition 2, this is equivalent to a^p ≡ 1 (mod n), and by Note 2 we know that n | (a^p − 1), i.e. n is a divisor of a^p − 1. This is summarized by
Note 3: Let a, n ∈ ℕ be coprime. expa(x) = a^x mod n has period p ∈ ℕ if and only if a^p ≡ 1 (mod n), i.e. n | (a^p − 1). ∎
Let n | (a^p − 1) and assume p ∈ ℕ is even. Then, n | (a^(p/2) − 1)(a^(p/2) + 1), and according to Corollary 1, there is an m ∈ ℕ with m | n and m | (a^(p/2) − 1) or m | (a^(p/2) + 1). Thus, n and (a^(p/2) − 1) have a common divisor, or n and (a^(p/2) + 1) have a common divisor. This implies that gcd((a^(p/2) − 1), n) or gcd((a^(p/2) + 1), n) is a divisor of n.
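This chain of observations can be traced for a toy modulus in Python; n = 15 and a = 2 are illustrative choices, not taken from the article:

```python
from math import gcd

n, a = 15, 2
# period of exp_a: the smallest p with a^p ≡ 1 (mod n)
p = next(t for t in range(1, n + 1) if pow(a, t, n) == 1)
print(p)                                  # 4, which is even
d1 = gcd(pow(a, p // 2) - 1, n)           # gcd(2^2 - 1, 15) = 3
d2 = gcd(pow(a, p // 2) + 1, n)           # gcd(2^2 + 1, 15) = 5
print(d1, d2)                             # 3 5 - both proper divisors of 15
```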
This means that if the period of expa, 0 < a < n, can be determined, and the period is even, a divisor of n can easily be computed. And by iteration, the factorization of n results. Before discussing the frequency of this situation, we give the following
Definition 8: The smallest number y ∈ ℕ with a^y ≡ x mod n (and 0 < a < n) defines a map
loga : ℕ0 → ℕ0 with x ↦ y (10)
called discrete logarithm (of x) with basis a (and modulo n).
As an example, it is 2⁴ = 16 ≡ 5 mod 11, which implies log2 5 = 4 (modulo 11). The usual equations like expa(loga z) = z and loga(expa y) = y relate the discrete logarithm and the modular exponential function.
Computing the period p of expa means determining p ∈ ℕ with a^p ≡ 1 (mod n) (see Note 3), i.e. computing the discrete logarithm p = loga 1. This means that factorization - and, thus, cracking RSA - can be reduced to computing discrete logarithms! But computing discrete logarithms is assumed to be classically hard, just like factorization. Similarly, cracking cryptography based on elliptic curves can be reduced to computing discrete logarithms (see section 2.6).
Remember that the assumption we made in order to determine the period p ∈ ℕ of expa, 0 < a < n, is that p is even. Under this assumption, gcd((a^(p/2) − 1), n) or gcd((a^(p/2) + 1), n) is a divisor of n. First of all, the question is how often the period is even. Next, the question is whether gcd((a^(p/2) − 1), n) or gcd((a^(p/2) + 1), n) are proper divisors of n. The first observation is:
Note 4: gcd((a^(p/2) − 1), n) ≠ n.
Proof: Let gcd((a^(p/2) − 1), n) = x. Then, we find y, z ∈ ℕ with zx = a^(p/2) − 1 and yx = n. Thus, a^(p/2) − 1 = zx = z ⋅ n/y = n ⋅ z/y. Assume x = n; thus, y = 1. This means that a^(p/2) − 1 = nz ⇒ a^(p/2) = nz + 1 ⇒ a^(p/2) ≡ 1 mod n. According to Note 3, this means that the period of expa is p/2 - which is a contradiction. This proves the claim. ∎
It may still be the case that gcd((a^(p/2) − 1), n) = 1, i.e. that gcd((a^(p/2) − 1), n) is not a proper divisor of n. In case gcd((a^(p/2) − 1), n) = 1, Corollary 1 implies that gcd((a^(p/2) + 1), n) ≠ 1. Still, it may be that gcd((a^(p/2) + 1), n) = n, and, thus, gcd((a^(p/2) + 1), n) may not be a proper divisor of n.
Consequently, the bad situation gcd((a^(p/2) − 1), n) = 1 and gcd((a^(p/2) + 1), n) = n may occur. However, according to Nielsen and Chuang (see [3], Theorem A4.13) the following is known:
Note 5: For more than half of the numbers 0 < a < n with gcd(n, a) = 1, the period p of expa is even, and gcd((a^(p/2) − 1), n) ≠ 1 or gcd((a^(p/2) + 1), n) ≠ n. ∎
Thus, after a couple of attempts a number a will be found such that expa has an even period and gcd((a^(p/2) − 1), n) or gcd((a^(p/2) + 1), n) is a proper divisor of n.
How the period of expa can be determined efficiently is the subject of section 3.
2.6. Elliptic curve cryptography
Elliptic curve cryptography can also be broken by efficient computations of discrete logarithms on quantum computers. In this section, we briefly introduce the underpinnings of elliptic curve cryptography and sketch its relationship to discrete logarithms. For more details about elliptic curves as well as the proofs of the facts given in this section, see [6].
Note that an elliptic curve is not an ellipse. However, the name results from computing the arc length of an ellipse by means of integrals. With proper parametrization, the integrands of the corresponding integrals contain polynomial functions as used in the definition of the following set C:
Definition 9: Let a, b ∈ ℝ with 4a³ + 27b² ≠ 0. Then C := {(x, y) ∈ ℝ² ∣ y² = x³ + ax + b} ∪ {𝒩} is called an elliptic curve.
The reason why 𝒩 (“the infinitely far point”) is added will be clear soon. The condition 4a3 + 27b2 ≠ 0 excludes repeated roots that would result in singularities like intersections or cusps of the curve. Figure 1 depicts three elliptic curves to give an impression of their possible shapes.
Figure 1: Sample Shapes of Elliptic Curves.
An elliptic curve C is obviously symmetric about the x-axis, i.e. for each point P ∈ C on the curve, its reflection across the x-axis is again a point on the curve, which is denoted by P⁻¹ ∈ C. This allows us to define a multiplication ∘ : C × C → C on a curve (Figure 2) as follows.
Figure 2: Operations on an Elliptic Curve.
- Let P ≠ Q ∈ C be two distinct points on the curve C and let the line gPQ through P and Q intersect the curve C in the point Z: gPQ ∩ C = {P, Q, Z}. The reflection Z⁻¹ of Z across the x-axis is defined as the product of P and Q: P ∘ Q := Z⁻¹.
- Let R ∈ C be a point, let gR be the tangent to C at R, and let W be the intersection of the tangent with C: gR ∩ C = W. The reflection W⁻¹ of W across the x-axis is defined as the product of R with itself: R² := W⁻¹.
- Let S ∈ C be a point whose tangent does not intersect C. For this purpose, the curve C is extended by a point at infinity 𝒩, and the product of S with itself is this point: S² := 𝒩.
- Let T ∈ C and T⁻¹ ∈ C its reflection across the x-axis. The line connecting T and T⁻¹ is parallel to the y-axis and has no intersection with the curve. The product of T and T⁻¹ is defined as the infinitely far point 𝒩: T ∘ T⁻¹ := 𝒩.
With respect to this operation ∘, the reflection P⁻¹ of a point P ∈ C across the x-axis is the inverse element of P, and the infinitely far point 𝒩 is revealed as the neutral element of the operation. The use of elliptic curves in cryptography is based on this operation and the corresponding theorem:
Theorem 4: Let C be an elliptic curve. Then (C, ∘) is a commutative group with the neutral element 𝒩. ∎
For use of an elliptic curve in a cryptosystem the parameters a, b ∈ ℝ of the curve are published as well as a point G ∈ C called the generator. Each participant has a private key s ∈ ℕ and G^s =: P ∈ C as the corresponding public key. The following example shows how elliptic curve encryption works:
A message μ is encrypted as follows:
- μ is mapped to a point M ∈ C, e.g.:
- μ is considered as a bit string m ∈ ℕ
- Compute y with y² = m³ + am + b
- Then M = (m, y) ∈ C
- Choose a random k ∈ ℕ
- Compute E1 = G^k and E2 = M ∘ P^k
- G is the published generator, s is the private key to compute P = G^s
- The encrypted message is χ(m) = (E1, E2)
An encrypted message χ(m) is decrypted as follows:
- Compute M = E2 ∘ (E1)^(−s)
- It is (E1)^s = (G^k)^s = (G^s)^k = P^k
- Thus, it is E2 ∘ (E1)^(−s) = M ∘ P^k ∘ (E1)^(−s) = M ∘ P^k ∘ P^(−k) = M
- The decrypted message is m = π1(M), the first coordinate of M
Note that in practice Elliptic Curve Cryptography (ECC) is not used for encryption of persistent data but for key exchange and signatures. For example, elliptic curve key exchange (a.k.a. ECC Diffie–Hellman) works as follows:
- As before, let G ∈ C be the generator of the curve
- Alice has G^sA =: PA ∈ C as the public key, with sA ∈ ℕ as the private key
- Bob has G^sB =: PB ∈ C as the public key, with sB ∈ ℕ as the private key
- Alice and Bob exchange PA, PB
- Alice computes KA = (PB)^sA
- Bob computes KB = (PA)^sB
- It is KA = (PB)^sA = (G^sB)^sA = (G^sA)^sB = (PA)^sB = KB
- Thus, Alice and Bob share the same secret key K (= KA = KB)
- Alice and Bob can now use the shared secret key for symmetric encryption

A public key is of the form P = G^s with the publicly known generator G ∈ C. This means that determining the private key s (“cracking elliptic curve cryptography”) means computing s = logG P, where the exponential expG is computed based on “∘” as multiplication. Consequently, cracking elliptic curve cryptography also reduces to computing discrete logarithms.
Note that elliptic curves relevant for cryptography are curves whose coefficients are from a finite field 𝔽p instead of from ℝ. The principles sketched before remain valid, i.e. this complication is irrelevant for a first treatment and understanding.
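The group operation and the key exchange above can be sketched for such a curve over a finite field; the parameters below (y² = x³ + 2x + 2 over 𝔽₁₇ with generator G = (5, 1)) are a common textbook toy choice, not taken from this article, and are far too small to be secure:

```python
# Toy elliptic-curve Diffie-Hellman over F_17 (illustrative, insecure parameters)
p_f, a, b = 17, 2, 2                      # curve y^2 = x^3 + 2x + 2 over F_17

def ec_add(P, Q):
    """The group operation '∘'; None represents the point at infinity N."""
    if P is None: return Q
    if Q is None: return P
    if P[0] == Q[0] and (P[1] + Q[1]) % p_f == 0:
        return None                        # T ∘ T^{-1} = N
    if P == Q:                             # tangent case R ∘ R
        lam = (3 * P[0] ** 2 + a) * pow(2 * P[1], -1, p_f) % p_f
    else:                                  # chord case P ∘ Q
        lam = (Q[1] - P[1]) * pow(Q[0] - P[0], -1, p_f) % p_f
    x = (lam ** 2 - P[0] - Q[0]) % p_f
    y = (lam * (P[0] - x) - P[1]) % p_f
    return (x, y)

def ec_pow(k, P):
    """k-fold product P ∘ P ∘ … ∘ P (i.e. P^k) via double-and-add."""
    R = None
    while k:
        if k & 1:
            R = ec_add(R, P)
        P = ec_add(P, P)
        k >>= 1
    return R

G = (5, 1)                                 # published generator
sA, sB = 3, 7                              # private keys of Alice and Bob
PA, PB = ec_pow(sA, G), ec_pow(sB, G)      # exchanged public keys
KA, KB = ec_pow(sA, PB), ec_pow(sB, PA)    # both equal G^(sA*sB)
print(KA == KB, KA)                        # True (6, 3)
```

The shared key is identical on both sides because the group is commutative, exactly as in the derivation KA = (G^sB)^sA = (G^sA)^sB = KB above.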
3. Shor’s algorithm
The algorithm of Shor [1] for factorization is a hybrid algorithm, i.e. it consists of classical computations as well as quantum computations. The same paper describes a modification of the algorithm for computing discrete logarithms. Overall, both algorithms provide an exponential speedup compared with the best-known classical algorithms for solving these two problems. In this section, we very briefly sketch how numbers are factorized with Shor’s algorithm.
3.1. The outer classical part
The classical computations can be split into an “outer” classical part that surrounds both the quantum part (section 3.2) and an “inner” classical part (section 3.3). The outer part computes greatest common divisors and decides whether to iterate the overall algorithm or to stop with a divisor of the number to be factorized.
Let n ∈ ℕ be the number to be factorized (n can be assumed to be odd because otherwise its factor “2” is immediately known). Randomly, a number a ∈ ℕ, 1 < a < n is chosen. In the case of gcd(a, n) ≠ 1, a divisor of n has been found and the algorithm stops. The algorithm continues with gcd(a, n) = 1, and according to Lemma 3, the function expa(x) = a^x mod n is periodic.
Next, the quantum part (section 3.2) and the inner classical part (section 3.3) are used to determine the period p of expa(x) = a^x mod n. If it turns out that p is odd, the algorithm starts over again with a different 1 < a < n because we need p even to apply the representation a^p − 1 = (a^(p/2) − 1)(a^(p/2) + 1). Then, according to Note 5, after a couple of attempts, a number a will be found such that p is even, and gcd((a^(p/2) − 1), n) or gcd((a^(p/2) + 1), n) is a proper divisor of n: the algorithm stops with a divisor.
3.2. The quantum part
The details of Shor’s algorithm especially its quantum part can be found in most text books on quantum computing like [3]. Here, we briefly sketch the key facts (ignoring some details to simplify the treatise).
First, the integers 0 ≤ x < N (with N = 2^m and n² ≤ 2^m < 2n²) are prepared in a two-part quantum register by applying a Hadamard transformation:
(1/√N) ∑_{x=0}^{N−1} |x⟩ |0⟩ (11)
Next, an oracle is used to compute the value table of the modular exponential function of these numbers:
(1/√N) ∑_{x=0}^{N−1} |x⟩ |a^x mod n⟩ (12)
Note the similarity with the value table after Definition 6; as discussed there, based on a value table of a function its periodicity can be determined “by inspection”. Such an inspection can be done by applying a quantum Fourier transform on the | α ⟩-part of the quantum register, and a succeeding measurement of the | β ⟩-part. The resulting state of the | α ⟩-part is:
∑_{y=0}^{N−1} ((1/√(MN)) ∑_{j=0}^{M−1} e^{2πi(x0+jp)y/N}) |y⟩ (13)
where x0 is the smallest x with a^x mod n equal to the measured value of the | β ⟩-part, and M is the number of such x in the register.
The quantum part ends with measuring the |α⟩-part of the register resulting in the value y.
3.3. The inner classical part
An extensive explanation of the inner classical part is given in Barzen and Leymann [7].
Note that the period p is already reflected in the amplitude of the state | y ⟩ reached after the Fourier transform (equation (13)): how can it be derived from this equation?
According to the Born rule of quantum physics, the probability Prob(y) to measure a particular y is the squared modulus of the amplitude of the associated state | y ⟩, i.e. based on equation (13):
Prob(y) = (1/(MN)) |∑_{j=0}^{M−1} e^{2πi jpy/N}|² (14)
(the global phase factor e^{2πi x0 y/N} has modulus 1 and drops out).
The sum in equation (14) is a geometric sum; thus, the well-known formula for computing the value of such a sum can be used, and with θ := 2πpy/N we get
Prob(y) = (1/(MN)) ⋅ sin²(Mθ/2) / sin²(θ/2) (15)
A detailed analysis of equation (15) reveals (see [7], section 3, for the details) that “with high probability” a number k with 0 < k < p − 1 can be found such that
|y/N − k/p| ≤ 1/(2N) (16)
Equation (16) is the pre-condition to apply Legendre’s Theorem on continued fractions [7], which states that in this situation k/p is a convergent of y/N. Note that y is known because it has been measured, and N has been chosen at the beginning of the quantum part, i.e. y/N is known. Thus, the continued fraction representation [a0 ; a1 ,…, al] of y/N can be derived, and one of the convergents gu/hu with 1 ≤ u ≤ l will be a “very good” approximation of k/p, i.e. the corresponding denominator hu will be a candidate for the period p (note that the actual value of k is not needed because we are only interested in the denominator p). Whether or not this candidate hu is really the period must be checked explicitly.
Consequently, the inner classical part basically consists of a continued fraction analysis of y/N that produces the period p “with high probability”. In case a period is not found, the overall algorithm starts over again with a different 1 < a < n.
3.4. Quantum resource requirements
Shor’s algorithm can efficiently break cryptosystems that are based on factorization or elliptic curves. While its classical parts can be performed on any classical computer today, its quantum part assumes a quantum computer that has error-corrected qubits and operations without errors.
Different authors discuss implementation variants of the quantum part of the factorization algorithm and state the number of logical qubits (i.e. error-corrected qubits) and faithful operations [8]. As a very rough average, for RSA2048 the number of logical qubits needed is about 10⁴ and the number of operations is about 10¹¹ (and the operations require low gate error rates of about 0.1%). A quantum computer with such resources is referred to as a crypto-relevant quantum computer (CRQC). At the time this paper was written (2024), no such quantum computer was available. Note that the quantum part computing elliptic curve discrete logarithms requires fewer quantum resources [9].
Considering that the number of physical qubits needed to implement a single logical qubit is estimated to be about 1,000 [10], and considering the corresponding gate errors, a quantum computer that can successfully perform the quantum part of Shor’s algorithm seems to be far off. However, recent advancements in quantum error correction indicate that orders of magnitude fewer physical qubits are needed to realize a logical qubit [11,12]. Thus, a crypto-relevant quantum computer may be available sooner than assumed a few years ago.
To be able to solve problems on today’s error-prone quantum computers [13], so-called variational quantum algorithms are used [14]. These algorithms consist of a parameterized quantum circuit that produces a quantum state that is measured; this measurement result is evaluated by the classical part of the algorithm, which optimizes the parameters, and the optimized parameter values are used by the parameterized quantum circuit again; this loop is performed until the overall procedure converges. Hereby, the quantum part runs only for a short period of time to avoid the errors on the quantum computer piling up, and only a few qubits are used - i.e. variational quantum algorithms are suitable for today’s Noisy Intermediate-Scale Quantum (NISQ) computers [15].
Yan, et al. claim that their variational quantum algorithm can break RSA2048 with less than 400 qubits, although the number of operations of the quantum part of their algorithm is still too large [16]. Khattar and Yosri implemented this algorithm but could not scale it beyond 70-bit numbers, where the algorithm always failed in their experiments [17]. Thus, it is still open whether variational quantum algorithms can break security based on near-term quantum computers.
3.5. A note on symmetric encryption
Shor’s algorithm can break cryptographic schemes based on factorization or elliptic curves, which are asymmetric schemes. Symmetric schemes like AES 256 can be cracked by means of a quantum computer too via a brute-force attack [18]: This is because Grover’s quantum algorithm supports an unstructured search with a quadratic speedup [3].
But Grover’s algorithm, too, requires an error-corrected quantum computer. Wang, et al. presented a variational quantum algorithm to realize a brute-force attack on AES-like cryptographic schemes [19]. They also suggested a (non-variational) quantum circuit that can break AES-128 with about 400 logical qubits [20].
Thus, symmetric cryptographic schemes are vulnerable to quantum attacks too. In contrast to asymmetric schemes, a symmetric scheme can be hardened against such a brute-force attack by doubling the key size used: since an attack based on the Grover algorithm has quadratic speedup, doubling the key size requires the same effort on a quantum computer as has to be spent on a classical computer for a key of the original size. Under certain practical assumptions, less than doubling the key size suffices [21].
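The key-doubling argument can be checked with a few lines of arithmetic. The sketch below (function and parameter names are ours) compares Grover’s square-root query count for 128- and 256-bit keys:

```python
import math

def grover_queries(key_bits):
    # Grover searches a space of 2^key_bits keys with ~sqrt(2^key_bits) oracle queries
    return math.isqrt(2 ** key_bits)

# AES-128: classical brute force needs ~2^128 trials, Grover only ~2^64.
# Doubling the key size to 256 bits pushes Grover back to ~2^128 queries,
# i.e. the classical effort for the original 128-bit key.
print(grover_queries(128) == 2 ** 64)   # -> True
print(grover_queries(256) == 2 ** 128)  # -> True
```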
4. Lattice-based cryptography
Cryptographic schemes that are based on lattices are less known than schemes based on factorization or elliptic curves. But they have properties that make them candidates for quantum-resistant security mechanisms. In this chapter, we sketch the basics of lattice-based cryptography. Basics about lattices can be found in the lecture notes of Haase, et al. [22]; in addition, Peikert [23] and Silberman [24] discuss lattice problems and corresponding cryptography. Also, Zheng provides proofs for many of the facts given here [25].
4.1. Basics about Lattices
A lattice is a set of points in a vector space that is built by taking all linear combinations with integer coefficients of a given set of linear independent vectors.
Definition 10: Let v1,…,vn ∈ ℝn be linearly independent. The set

Λ < v1,…,vn > := {a1v1 + … + anvn | a1,…,an ∈ ℤ}

is called a lattice in ℝn with basis {v1,…,vn}.
In the literature a generalized definition can be found with k independent vectors v1,…,vk ∈ ℝn; k is then called the rank of the lattice, and for k = n, the lattice is called full rank. In what follows this generalization is not needed; thus, Definition 10 suffices. Figure 3 shows a lattice in ℝ2.
Figure 3: A Sample Lattice.
It is obvious that a lattice is an abelian subgroup of ℝn. Furthermore, a lattice is discrete: for each lattice Λ, an ε > 0 can be found such that for all x ≠ y ∈ Λ it is ‖x − y‖ ≥ ε, i.e. two points can be separated by disjoint neighborhoods. In fact, being a discrete abelian subgroup of ℝn is equivalent to being a lattice in ℝn [22]:
Note 6: Λ ⊂ ℝn is a lattice ⇔ Λ is a discrete abelian subgroup of ℝn. ∎
The following observation is important for lattice-based cryptography:
Note 7: Let M ∈ GL(n, ℝ) (i.e. M is an invertible n × n matrix). Then, the set {x ∈ ℤn | Mx = 0} is a lattice in ℝn.
Proof: Let m1,…,mn denote the columns of M, i.e. M = (m1…mn). Because M is invertible, m1,…,mn ∈ ℝn are linearly independent. Thus,
{M x | x ∈ ℤn} = {∑xi mi | xi ∈ ℤ} = Λ < m1 ,…, mn >
is a lattice according to Definition 10.
It is obvious that {x ∈ ℤn | Mx = 0} is an abelian subgroup of ℝn. As a lattice, {Mx | x ∈ ℤn} is discrete. Obviously, a subset of a discrete set is discrete too, i.e.
{x ∈ ℤn | Mx = 0}⊆{Mx | x ∈ℤn}
is discrete. Note 6 proves the claim. ∎
4.2. Nearly orthogonal bases
A lattice has many different bases: let ℬ = {b1,…,bn} be a basis of the lattice Λ and let B be the matrix with columns b1,…,bn. For an n × n matrix 𝒰 with integer coefficients (i.e. 𝒰 ∈ ℤn×n) and with det 𝒰 = ±1 (a so-called unimodular matrix), the columns c1,…,cn of the matrix C = B ⋅ 𝒰 form a basis 𝒞 = {c1,…,cn} of Λ (see Figure 4 for an example).
Figure 4: Two Bases of a Lattice.
The two bases shown in Figure 4 have quite different behavior in terms of computing solutions for the lattice problems discussed below: a basis with very long vectors and very small angles between them typically requires higher computational effort than a basis with short vectors that are pairwise nearly orthogonal [24].
The ideal basis of the lattice ℤn is an orthonormal basis 𝒪 = {o1,…,on}. It spans the unit cube, the volume of which is given by the determinant | det O | = 1, where O = (o1…on) is the matrix with columns oi. Now, an arbitrary lattice Λ is in general not generated by an orthonormal basis, but by some basis ℬ = {v1,…,vn}. If {v1,…,vn} were orthogonal (but not necessarily normalized), the volume of the cuboid spanned by {v1,…,vn} would simply be ‖v1‖ ⋅ … ⋅ ‖vn‖. In case the vectors are not orthogonal, they span a parallelepiped with volume | det B |, where B = (v1…vn) is the matrix with columns vi. Now, the “goodness” of the basis ℬ can be measured by comparing the volume ‖v1‖ ⋅ … ⋅ ‖vn‖ of the hypothetical cuboid spanned by orthogonal vectors of lengths ‖vi‖ with the volume | det B | of the parallelepiped spanned by ℬ:

Definition 11: Let v1,…,vn ∈ ℝn be linearly independent, ℬ = {v1,…,vn} and B = (v1…vn) the matrix with columns vi. The number

δ(ℬ) := (‖v1‖ ⋅ … ⋅ ‖vn‖) / | det B |

is called the (orthogonality) defect of ℬ.

For any matrix M = (m1…mn) it is always | det M | ≤ ‖m1‖ ⋅ … ⋅ ‖mn‖ (Hadamard’s inequality). Thus, | det B | ≤ ‖v1‖ ⋅ … ⋅ ‖vn‖ and consequently δ(ℬ) ≥ 1. If ℬ is orthogonal, then | det B | = ‖v1‖ ⋅ … ⋅ ‖vn‖, i.e. δ(ℬ) = 1, and vice versa: this explains the name “orthogonality defect”.
The Lattice Reduction Problem asks to determine a basis ℬ of Λ with minimal δ(ℬ); this problem is NP-hard. In practice, a good basis (not necessarily one with minimal orthogonality defect) of a lattice is chosen and becomes the private key, and a corresponding bad basis - which is computed based on a randomly chosen unimodular matrix - becomes the corresponding public key (see section 4.6).
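The orthogonality defect of Definition 11 is easy to compute for 2×2 examples. The sketch below (function name ours) compares the standard basis of ℤ2 with a bad basis of the same lattice obtained via a unimodular transformation:

```python
import math

def defect(v1, v2):
    # orthogonality defect of a 2D basis: (|v1| * |v2|) / |det B|, cf. Definition 11
    det = v1[0] * v2[1] - v1[1] * v2[0]
    return math.hypot(*v1) * math.hypot(*v2) / abs(det)

# standard basis of Z^2 vs. the basis B * U for the unimodular U = [[1, 3], [0, 1]]
print(defect((1, 0), (0, 1)))  # -> 1.0 (orthogonal: minimal defect)
print(defect((1, 0), (3, 1)))  # -> ~3.16 (same lattice Z^2, but a "worse" basis)
```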
Next, we present three basic lattice problems that are hard to solve and which are central to cryptography.
4.3. Shortest vector
For a given lattice Λ < v1,…,vn > and a norm ‖⋅‖, the Shortest Vector Problem asks to find a non-zero vector of the lattice that has minimal length (Figure 5a).
Figure 5a: The Shortest Vector of a Lattice.
Figure 5b: The Vector of a Lattice Closest to a Given Vector.
Definition 12: Let Λ < v1,…,vn > be a lattice and ‖⋅‖ be a norm on ℝn. With λ(Λ) := min { ‖x‖ | x ∈ Λ, x ≠ 0 }, the Shortest Vector Problem (SVP) asks to determine an x* ∈ Λ such that ‖x*‖ = λ(Λ).
The importance of this problem results from [26].
Theorem 4: SVP is NP-hard for the Euclidean norm ‖⋅‖2 and NP-complete for the maximum norm ‖⋅‖∞. ∎
4.4. Closest vectors
For a given lattice Λ < v1,…,vn >, a norm ‖⋅‖, and a vector w ∈ ℝn, the Closest Vector Problem asks to find a vector of the lattice that is closest to w (Figure 6).
Figure 6: Decrypting by Finding the Closest Vector to a Given Vector Representing a Message.
Definition 13: Let Λ < v1,…,vn > be a lattice, ‖⋅‖ be a norm on ℝn, and w ∈ ℝn. With dist(Λ, w) := min { ‖x − w‖ | x ∈ Λ }, the Closest Vector Problem (CVP) asks to determine an x* ∈ Λ such that ‖x* − w‖ = dist(Λ, w).
As before, the importance of this problem results from work by Micciancio [26] and Manohar [27].
Theorem 5: CVP is NP-complete.∎
4.5. Shortest independent vectors
The Shortest Independent Vectors Problem is a bit more comprehensive: a basis of the lattice has to be found whose longest vector has minimal length across all bases of the lattice.
Definition 14: Let Λ < v1,…,vn > be a lattice, and let 𝔅 be the set of all bases of Λ. With σ(Λ) := min { max { ‖b1‖,…,‖bn‖ } | {b1,…,bn} ∈ 𝔅 }, the Shortest Independent Vectors Problem (SIVP) asks to determine a basis {s1,…,sn} of Λ with max { ‖s1‖,…,‖sn‖ } = σ(Λ).
Again, the importance of this problem results from [26].
Theorem 6: SIVP is NP-complete. ∎
4.6. Example: Goldreich-Goldwasser-Halevi cryptosystem
We sketch a public key cryptosystem that is based on the Closest Vector Problem [28,29]. Let ℬ = {b1,…,bn} be a good basis of the lattice Λ (i.e. δ(ℬ) ≈ 1), B be the matrix with columns b1,…,bn, and let 𝒰 be a unimodular matrix (see section 4.2). ℬ is the private key of the cryptosystem. Next, the columns of B̂ := 𝒰B are a basis ℬ̂ of the lattice Λ with δ(ℬ̂) ≫ 1 (i.e. ℬ̂ is a bad basis). ℬ̂ is the public key of the cryptosystem.
The sender encrypts a message m as follows:
- The message m is mapped to a lattice point m̂ = mB̂ ∈ Λ (e.g. m is split into n blocks of characters, and each block is mapped to an integer that is used as a coefficient of the linear combination of the basis vectors) - see m̂ ∈ Λ in Figure 6.
- A small error vector e ∈ ℝn is randomly selected.
- The encrypted message is c = m̂ + e.
The receiver decrypts c by finding the lattice vector closest to c as follows:
- Compute c ⋅ B−1 = (m̂ + e) ⋅ B−1 = m̂B−1 + eB−1 = m𝒰BB−1 + eB−1 = m𝒰 + eB−1
- Round the result according to [30] to remove eB−1, i.e. m𝒰 is obtained
- Finally, m = m𝒰 ⋅ 𝒰−1
Note that there is a well-known attack (Nguyen’s attack) on this cryptosystem [29]; thus, it should not be used in practice.
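The encryption/decryption steps above can be illustrated with a tiny 2-dimensional sketch. The basis, unimodular matrix, message, and error vector below are our own toy choices, and we use the row-vector convention of the decryption formula:

```python
from fractions import Fraction as F

# Row-vector convention: lattice points are m * B for integer row vectors m.
def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def vecmat(v, M):
    return [sum(x * y for x, y in zip(v, col)) for col in zip(*M)]

def inv2(M):
    # exact inverse of a 2x2 matrix via Fractions
    det = F(M[0][0] * M[1][1] - M[0][1] * M[1][0])
    return [[M[1][1] / det, -M[0][1] / det], [-M[1][0] / det, M[0][0] / det]]

B = [[2, 1], [1, 3]]           # good (private) basis, rows are basis vectors
U = [[1, 1], [0, 1]]           # unimodular matrix (det = 1)
B_pub = matmul(U, B)           # bad (public) basis U * B

m = [2, -1]                    # message as integer coefficient vector
e = [F(1, 10), F(-2, 10)]      # small error vector
c = [x + err for x, err in zip(vecmat(m, B_pub), e)]   # ciphertext c = m * B_pub + e

# decryption with the good basis: round c * B^-1, then undo U
mU = [round(x) for x in vecmat(c, inv2(B))]
recovered = [int(x) for x in vecmat(mU, inv2(U))]
print(recovered)  # -> [2, -1], the original message
```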
4.7. Average case vs. worst case
Cryptography requires that randomly selected instances of problems are hard to solve [23], which is referred to as average-case hard. In contrast, the theory of algorithms typically considers hardness in terms of the existence of instances of problems that are hard to solve, which is referred to as worst-case hard. For example, computing a prime factor of a number is worst-case hard but not average-case hard: randomly selecting a number will result in an even number with probability 0.5, and then the prime factor 2 is immediately known.
The average-case hardness of certain lattice problems has been proven based on the worst-case hardness of some other lattice problems [31]. As a consequence, cryptographic mechanisms can be designed that are provably average-case hard unless all (!) instances of some other lattice problem can easily be solved. Learning with Errors is such a provably average-case hard problem (see next section).
4.8. Learning with errors
A ring R is a set with two binary operations denoted by “+” and “⋅”. In a ring, the pair (R, +) is an abelian group, the multiplication “⋅” is associative, and there is a multiplicative identity (or unity) “1”. Both operations combined obey the laws of distributivity. The set of integers ℤ is the typical example of a ring. In a ring, elements typically have no multiplicative inverse, i.e. for an element a ∈ R there need not be an element b ∈ R such that b ⋅ a = 1. If every element of the ring has a multiplicative inverse (and “⋅” is commutative), the ring is called a field (like the real numbers ℝ or the complex numbers ℂ). See [32] for the details.
Let R be a finite ring, a1,…,an ∈ Rn and t1,…,tn ∈ R. Consider an unknown linear function f : Rn → R with f(ai) ≈ ti, i.e. f(ai) equals ti up to a small error.
The Learning With Errors (LWE) problem [33] is to determine f based on the set of training pairs {(a1, t1), …, (an, tn)} ⊂ Rn × R.
This problem can be reformulated as follows: choose a fixed s ∈ Rn (“the secret”), and pick uniformly random small elements (“the errors”) e1,…,en ∈ R. Then compute ti = ⟨ai, s⟩ + ei. Given {(a1, t1), …, (an, tn)} ⊂ Rn × R, find s. If A ∈ Rn×n is defined as the matrix with rows a1T,…,anT, t = (t1,…,tn)T ∈ Rn and e = (e1,…,en)T ∈ Rn, then the LWE problem is seen as the problem of finding a solution s of the linear equation system

A ⋅ s + e = t (19)
In case no errors occur (i.e. e = 0) Gaussian elimination solves the linear equation system A s = t in polynomial time. But when errors are to be considered, the LWE problem turns out to be surprisingly difficult: Regev proves that if the LWE problem can be solved efficiently then solutions of the shortest vector problem SVP can be approximated by an efficient quantum algorithm [33]. Since even the approximate SVP is assumed to be hard, the LWE problem is hard too [34,35].
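The contrast between the error-free and the noisy case can be demonstrated with a small script (parameters and helper names are ours): with e = 0, Gaussian elimination over ℤq recovers the secret from n samples; with nonzero errors, the very same elimination yields a wrong vector.

```python
import random

q, n = 97, 4                   # toy parameters; real LWE uses much larger dimensions
rng = random.Random(7)
s = [rng.randrange(q) for _ in range(n)]            # the secret

def sample(noisy):
    # one LWE sample: (a, <a, s> + e) over Z_q with a small error e
    a = [rng.randrange(q) for _ in range(n)]
    e = rng.choice([-1, 1]) if noisy else 0
    return a, (sum(x * y for x, y in zip(a, s)) + e) % q

def solve(rows):
    # Gaussian elimination over Z_q (q prime, so every pivot is invertible)
    M = [list(a) + [t] for a, t in rows]
    for i in range(n):
        piv = next((r for r in range(i, n) if M[r][i] % q), None)
        if piv is None:
            return None                             # singular system, caller retries
        M[i], M[piv] = M[piv], M[i]
        inv = pow(M[i][i], -1, q)
        M[i] = [x * inv % q for x in M[i]]
        for r in range(n):
            if r != i and M[r][i]:
                M[r] = [(x - M[r][i] * y) % q for x, y in zip(M[r], M[i])]
    return [row[-1] for row in M]

def attack(noisy):
    guess = None
    while guess is None:                            # redraw until the matrix is invertible
        guess = solve([sample(noisy) for _ in range(n)])
    return guess

print(attack(False) == s)  # -> True: without errors the secret falls out directly
print(attack(True) == s)   # -> False: the errors shift the solution away from s
```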
4.9. Learning with errors as a lattice problem
In cryptographic applications the ring R used in the LWE problem is ℤq for some prime number q ∈ ℙ: ℤq consists of the set of numbers {0,1, …, q − 1}. Two elements of ℤq are added and multiplied modulo q, i.e. for a, b ∈ ℤq it is a + b : = a + b mod q and a ⋅ b : = a ⋅ b mod q. With these operations, ℤq is a finite commutative ring [4].
With R substituted by ℤq, A ∈ ℤqn×n and s, t, e ∈ ℤqn, the LWE problem in the shape of equation (19) can be written as

(A | En | −t) ⋅ (s, e, 1)T ≡ 0 mod q (20)

where En ∈ ℤqn×n is the identity matrix, and “|” denotes concatenation. According to Note 6, the set of integer solutions

Λ = { x ∈ ℤ2n+1 | (A | En | −t) ⋅ x ≡ 0 mod q }

is a lattice. Obviously, (s, e, 1)T ∈ ℤ2n+1 is such a solution, i.e. a lattice point. Especially, it can be shown that (s, e, 1)T ∈ Λ solves the shortest vector problem SVP for Λ, i.e. solving the LWE problem means solving the SVP:
Note 8: The Learning with Errors problem is a lattice problem. ∎
Thus, cryptographic schemes based on the LWE problem belong to the domain of lattice-based cryptography. Micciancio and Regev provide examples of lattice-based schemes for encryption and signature [34]. We sketch a corresponding encryption schema in section 5.4.
5. Algebraic lattice cryptography
Key sizes in LWE get large (see section 4.6 for an example); they grow in the order of n2 [35]. Large keys are not acceptable in several application domains. Reducing these key sizes to grow linearly in n can be achieved by “adding more algebraic structure” to the lattice. More concretely, the group ℤq is substituted by the ring of polynomials ℤq[X]/(Xn + 1) without losing the hardness property of regular LWE [35]. Going even further, the ring of polynomials ℤq[X]/(Xn + 1) is extended to the module (ℤq[X]/(Xn + 1))k.
In this chapter, we provide the basics about polynomials and ℤq[X]/(Xn + 1), and also about modules ([32] for more algebraic depth). We sketch the use of these algebraically extended lattices (also called algebraic lattices) in cryptography (Micciancio and Regev in [34] and Regev in [35] provide further details). Based on such lattices Kyber as an encryption mechanism and Dilithium as a signature mechanism are sketched.
5.1. Polynomial rings
When substituting ℤq by the ring ℤq[X]/(Xn + 1), the resulting LWE problem is called the ring-LWE problem [36]. Beyond that, when the ring ℤq[X]/(Xn + 1) is substituted by the module (ℤq[X]/(Xn + 1))k, the corresponding LWE problem is called module-LWE.
The advantage of moving from ring-LWE to module-LWE is as follows [37]: if the security level has to be increased in ring-LWE, the degree n of the polynomials must be increased, which in turn increases the computational effort involved. In module-LWE, the degree of the polynomials can remain the same; only the rank k of the module has to be enlarged. The advantage is that any optimization in manipulating the polynomials remains applicable.
Thus, we must understand the basics of polynomials and how to perform computations with them.
Definition 15: Let R be a commutative ring. A polynomial in X over R is an expression p = a0 + a1X + … + anXn with ai ∈ R; each ai is called a coefficient of the polynomial. With an ≠ 0 the number n is called the degree of the polynomial: n = deg p. R[X] denotes the set of all polynomials in X over R.
Polynomials can be added and multiplied: For p = a0 + a1X + … + anXn and q = b0 + b1X + … + bmXm in R[X] we define the sum p + q of the two polynomials as

p + q = ∑ riXi with ri = ai + bi (21)

and the product p ⋅ q of the two polynomials as

p ⋅ q = ∑ riXi with ri = ∑j+k=i ajbk (22)
With these two operations on R[X], the following can be proven:
Lemma 4: (R[X], +, ⋅) is a commutative ring. ∎
In applications of algebraic lattices like Kyber (see the section below), messages are considered as bit strings and are transformed into polynomials as follows: for a message m = b0…bn with bi ∈ {0, 1} and bn ≠ 0, the polynomial from ℤ[X] representing m is P(m) = b0 + b1X + ⋯ + bnXn. When adding and multiplying polynomials, their coefficients become large, especially when their degree (which corresponds to the number of bits of a message) is large - see formula (22). To avoid inefficiency in the corresponding computations with large numbers, all computations are taken mod q for a prime number q ∈ ℙ. Thus, the set of polynomials ℤq[X] is used, which ensures that all coefficients of polynomials are less than q.
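Formulas (21) and (22) translate directly into code. The following sketch (with a toy modulus q = 7 of our choosing) represents a polynomial as its list of coefficients, lowest degree first:

```python
q = 7  # toy modulus; all coefficients are taken mod q

def poly_add(p, a):            # formula (21): coefficient-wise addition
    n = max(len(p), len(a))
    p, a = p + [0] * (n - len(p)), a + [0] * (n - len(a))
    return [(x + y) % q for x, y in zip(p, a)]

def poly_mul(p, a):            # formula (22): r_i = sum of a_j * b_k with j + k = i
    res = [0] * (len(p) + len(a) - 1)
    for j, x in enumerate(p):
        for k, y in enumerate(a):
            res[j + k] = (res[j + k] + x * y) % q
    return res

# (1 + 2X) * (3 + X^2) = 3 + 6X + X^2 + 2X^3, coefficients already below 7
print(poly_mul([1, 2], [3, 0, 1]))  # -> [3, 6, 1, 2]
```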
5.2. Residue classes of polynomials
Also, when multiplying polynomials the degree of the resulting polynomial gets large (i.e. the degrees are summed up - see equation (22)): this results in inefficiencies too. Consequently, the degree of polynomials is limited by using residue classes of polynomials. If f is an irreducible polynomial (i.e. f is not the product of two nonconstant polynomials), then the set of all residue classes ℤq[X] / f is again a commutative ring. The polynomial 1 ⋅ X0 is the identity element in ℤq[X] / f. Furthermore, if deg f = n, then each residue class in ℤq[X] / f has a unique representative g with deg g ≤ n − 1. Thus, ℤq[X] / f contains q^n elements, i.e. ℤq[X] / f is a finite ring. All hardness properties of the regular LWE problem are inherited by the ring-LWE problem over the ring ℤq[X] / f. In summary:
Lemma 5: Let f ∈ ℤq[X] be irreducible. Then, ℤq[X] / f is a finite commutative ring with unity and card ℤq[X] / f = q^(deg f). ∎
Computing the residue class of a polynomial g ∈ ℤq[X] requires dividing g by f and taking the remainder of the division as the result. We briefly recall polynomial division by means of the following example:
- Let f = x2 + 1 be an irreducible polynomial in ℤ[X]
- Let p = 4x5 − x4 + 2x3 + x2 − 1 ∈ ℤ[X]
- By computing (4x5 − x4 + 2x3 + x2 − 1) ÷ (x2 + 1) and determining the remainder of this division, p mod f ∈ ℤ[X] / (X2 + 1) is computed
- The computation is in analogy to the division of numbers (Figure 7).
Figure 7: Example of a Polynomial Division.
- Determine the factor of x2 of the denominator to result in the highest term 4x5 of the numerator of the division: this is 4x3, the first summand of the result of the division.
- Multiply 4x3 by the denominator x2 + 1 which results in 4x5 + 4x3
- Subtract 4x5 + 4x3 from the numerator
- The highest term of the remaining polynomial is −x4
- Determine the factor of x2 of the denominator to result in this highest term −x4 of the remaining numerator: this is −x2, the second summand of the result of the division
- Multiply −x2 by the denominator x2 + 1 which results in −x4 − x2 …
- …
- The remainder of this polynomial division is 2x − 3
- Thus, (4x5 − x4 + 2x3 + x2 − 1) ≡ (2x − 3) ∈ℤ[X]/(X2 + 1)
Note that the coefficients of the polynomials in the example are from ℤ. Computations in ℤq[X]/(X2 + 1) must perform all computations involving coefficients of the polynomials mod q.
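The long division of Figure 7 can be reproduced mechanically. The helper below (our own sketch, restricted to monic divisors such as X2 + 1) returns quotient and remainder over ℤ, with coefficients listed lowest degree first:

```python
def poly_divmod(num, den):
    # polynomial long division by a monic divisor (leading coefficient 1);
    # coefficient lists are low-to-high: [c0, c1, ...] = c0 + c1*x + ...
    assert den[-1] == 1
    num = list(num)
    quot = [0] * max(len(num) - len(den) + 1, 0)
    while len(num) >= len(den):
        shift = len(num) - len(den)
        factor = num[-1]           # divisor is monic, so the factor is the leading term
        quot[shift] = factor
        for i, d in enumerate(den):
            num[i + shift] -= factor * d
        num.pop()                  # the leading term is now zero by construction
        while num and num[-1] == 0:
            num.pop()
    return quot, num

# the worked example: (4x^5 - x^4 + 2x^3 + x^2 - 1) mod (x^2 + 1)
quot, rem = poly_divmod([-1, 0, 1, 2, -1, 4], [1, 0, 1])
print(rem)   # -> [-3, 2], i.e. the remainder 2x - 3
print(quot)  # -> [2, -2, -1, 4], i.e. the quotient 4x^3 - x^2 - 2x + 2
```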
5.3. Modules of polynomials
As mentioned above, increasing the security level of a cryptographic mechanism based on ring-LWE over ℤq[X] / f requires increasing the degree n of the polynomials, which in turn increases the involved computational effort. To avoid this, the ring ℤq[X] / f is substituted by (ℤq[X] / f )k; the latter algebraic structure is called a module.
Definition 16: A module M over a ring R is a commutative group (M, +) with an operation ⋅ : R × M → M of R on M such that the laws of distributivity are satisfied, i.e. (r1 + r2) ⋅ m = r1m + r2m and r ⋅ (m1 + m2) = r ⋅ m1 + r ⋅ m2.
If instead of a ring R a field 𝕂 is used in the definition, the module over 𝕂 becomes a vector space over 𝕂. Thus, modules are generalizations of vector spaces. But in general, modules are quite different from vector spaces: e.g. not every module has a basis, and if a module has a basis, then different bases may have different cardinalities.
The cartesian product Rk of a ring R is a module over the ring R: elements of Rk are added component-wise, and multiplication of an element of Rk with an element of R is done component-wise too - just like vectors in the coordinate space 𝕂k are added and multiplied with scalars from 𝕂. Thus, based on Lemma 5 we get
Lemma 6: Let f ∈ ℤq[X] be irreducible, k ∈ ℕ. Then, (ℤq[X] / f )k is a finite module with card (ℤq[X] / f )k = q^(k ⋅ deg f). ∎
(ℤq[X] / f )k is a finite module because card ℤq[X] / f = q^(deg f) according to Lemma 5; thus card (ℤq[X] / f )k = (q^(deg f))k = q^(k ⋅ deg f) < ∞.
The notion of a lattice can be generalized to the module (ℤq[X] / f )k. For this purpose, ℤq[X] / f is considered as a set of tuples by mapping a polynomial a0 + a1X + … + amXm to the tuple (a0, a1,…, am, 0,…,0) ∈ ℤq^(deg f), i.e. the tuple consisting of the coefficients of the polynomial. This mapping is performed for each component polynomial pi of (p1,…,pk) ∈ (ℤq[X] / f )k, and the resulting tuples are concatenated. This way, each element of (ℤq[X] / f )k becomes an element of ℤq^(k ⋅ deg f), where lattices (of rank k ⋅ deg f) can be defined.
In order to be able to formulate the lattice problems introduced in sections 4.3, 4.4, and 4.5, a norm is needed on (ℤq[X] / f )k. But the canonical scalar product ⟨(ai), (bi)⟩ = ∑aibi of the corresponding coefficient tuples induces the canonical norm ‖(ai)‖ = √(∑ai2). Thus, lattice problems in (ℤq[X] / f )k result.
Figure 8 depicts this situation: The module which is the foundation of an algebraic lattice is a cartesian product of a finite polynomial ring. Accordingly, the elements of the module are tuples of polynomials. Computations in this module are performed mod q w.r.t. the coefficients of the polynomials, and mod f w.r.t. the polynomials themselves. Different algebraic lattice-based cryptographic mechanisms (like Kyber or Dilithium) use different prime numbers q, different irreducible polynomials f, and different ranks k.
Figure 8: Geometric Situation of Algebraic Lattices.
5.4. Kyber
Kyber is an algebraic-lattice-based asymmetric cryptographic mechanism based on the module-LWE problem described above. The specification of Kyber can be found in [38]: it specifies all algorithms used for performing the calculations with polynomials, for rounding, for sampling error vectors, etc. Kyber uses Rq := ℤq[X] / (Xn + 1) with n = 256 and q = 3329.
5.4.1. Computing Keys: The private and public keys are determined as follows:
- A randomly chosen s ∈ Rqk with small coefficients becomes the private key. Here, Kyber is using k ∈ {2, 3, 4} as the key length. This key length varies the lattice rank k ⋅ n and, thus, the implied security level. For example, with k = 3 the lattice rank becomes 768 and the corresponding variant KYBER768 has the security level of AES192.
- Next, a matrix A ∈ Rqk×k is randomly chosen. The module-LWE requires a small randomly generated error vector e ∈ Rqk, and t = A s + e is computed. The pair (A, t) becomes the public key.
5.4.2. Encryption: For each message m to be encrypted the following is performed:
- First, the following elements are randomly generated: a so-called randomizer vector r ∈ Rqk as well as an error vector e1 ∈ Rqk and a single error polynomial e2 ∈ Rq, both with small coefficients.
- The message m is considered as a bit string m = b0…bn−1 and is represented as a polynomial P(m) = b0 + b1X + ⋯ + bn−1Xn−1 (i.e. bi ∈ {0,1}); in this sense, m and P(m) are used interchangeably.
- P(m) is “scaled” by multiplying it by ⌊q/2⌉, i.e. ⌊q/2⌉ ⋅ P(m) is computed. As a reminder, ⌊x⌉ is the integer closest to x, e.g. ⌊2.4⌉ = 2 or ⌊2.5⌉ = 3. Note that the coefficients of the latter polynomial are 0 or ⌊q/2⌉ because the coefficients of P(m) are 0 or 1.
- Then, u = ATr + e1 and v = tTr + e2 + ⌊q/2⌉ ⋅ P(m) are computed.
- The encrypted message is (u, v).
5.4.3. Decryption: The encrypted message (u, v) is decrypted as follows:
- First, m̃ = v − sTu is computed by using the secret key s.
- Note that m̃ is not yet ⌊q/2⌉ ⋅ P(m), but it contains some noise: m̃ = v − sTu = tTr + e2 + ⌊q/2⌉ ⋅ P(m) − sT(ATr + e1) = eTr + e2 − sTe1 + ⌊q/2⌉ ⋅ P(m).
- Since e, e1, e2 are "small" by definition, the noise eTr + e2 − sTe1 has "small" coefficients too. Thus, the coefficients of m̃ are close to the coefficients of ⌊q/2⌉ ⋅ P(m), which are 0 or ⌊q/2⌉ (see 5.4.2).
- Next, the following “rounding” of the coefficients of m̃ is performed, resulting in ρ(m̃):
- All coefficients of m̃ close to ⌊q/2⌉ are set to “1”.
- All coefficients of m̃ close to 0 (or close to q, which is congruent to 0) are set to “0”.
- Finally, ρ(m̃) = P(m) is (with very high probability) the original message.
There is a small probability that ρ(m̃) is not the original message m, but this failure probability is extremely small (see [39], Theorem 1): for example, for k ⋅ n = 512 the failure probability is 2^−139.
[39] also explains the reasons for the hyper-parameters chosen for Kyber in more detail. The rounding mechanism is in fact more sophisticated than presented here and is described in [30].
5.4.4. Kyber by example: In what follows, we describe Kyber stepwise by example (remark: there is no guarantee that the computations [involving lots of polynomial multiplications, frequently taking coefficients mod 7 and polynomials mod (X4 + 1)] are error-free!). We use small hyperparameters to highlight the essential processing: we set n = 4, q = 7, and k = 2. Thus, our finite polynomial ring is R7 = ℤ7[X] / (X4 + 1).
First, we need to choose a private key and compute a corresponding public key:
- As private key we choose a small s ∈ R72.
- A ∈ R72×2 is chosen randomly.
- In order to compute t = A s + e we need a small error vector e ∈ R72 and choose e = (x2, x)T.
- The computation results in t (considering the coefficients mod 7 and the polynomials mod (X4 + 1)).
Thus, the public key is (A, t).
The message to encrypt is μ = ITS ALL GREEK TO ME (see section 2.4). For simplicity, we only consider the first letter, i.e. m = 9 and we get m = 1001 in binary representation. Next, m must be mapped to a polynomial by means of the map P resulting in
P(m) = x3 + 0x2 + 0x + 1 = x3 + 1;
scaling this polynomial by ⌊q / 2⌉ = ⌊7 / 2⌉ = 4 gives m = 4x3 + 4 (as before, we identify m and P(m)).
- To encrypt m we choose r = (x2, 1)T ∈ R72 as randomizer, an error vector e1 = (x + 1, 1)T ∈ R72, and as error polynomial e2 = x3 + x ∈ R7.
- Then we compute u = ATr + e1 considering mod 7 w.r.t. the coefficients and mod (X4 + 1) w.r.t. the polynomials:
- Next, we compute v = tTr + e2 + m (again based on the residue classes for the coefficients and the polynomials):
- Thus, the encrypted message is:
In order to decrypt the message, the recipient must first compute m̃ = v − sTu based on the private key s. Computing the coefficients mod 7 and finally rounding them by ρ (i.e. coefficients close to ⌊q/2⌉ = 4 are set to “1”, and coefficients close to 0 or close to q = 7 are set to “0”) delivers ρ(m̃) = P(m), and the original message m = 1001 results.
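The whole worked example can be reproduced with a short script. Since the concrete private key s and matrix A are not shown above, the values below are our own assumptions (chosen small); e, r, e1, and e2 are the values from the text. With these choices, decryption recovers m = 1001:

```python
Q, N = 7, 4                     # toy parameters: coefficients mod 7, ring Z_7[X]/(X^4+1)

def padd(a, b): return [(x + y) % Q for x, y in zip(a, b)]
def psub(a, b): return [(x - y) % Q for x, y in zip(a, b)]

def pmul(a, b):
    # schoolbook multiplication, then reduce using X^4 = -1
    res = [0] * (2 * N)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            res[i + j] += x * y
    return [(res[i] - res[i + N]) % Q for i in range(N)]

def dot(u, v):
    acc = [0] * N
    for a, b in zip(u, v):
        acc = padd(acc, pmul(a, b))
    return acc

def matvec(M, v): return [dot(row, v) for row in M]
def transpose(M): return [list(col) for col in zip(*M)]

# polynomials are coefficient lists [c0, c1, c2, c3] for c0 + c1*X + c2*X^2 + c3*X^3
# ASSUMED private key s = (x^2 + 1, -x^2 + x - 1); the article's concrete s is not shown
s = [[1, 0, 1, 0], [6, 1, 6, 0]]
# ASSUMED matrix A (it cancels exactly during decryption, so any choice works)
A = [[[2, 0, 0, 3], [0, 5, 1, 0]],
     [[1, 4, 0, 0], [3, 0, 1, 2]]]
e = [[0, 0, 1, 0], [0, 1, 0, 0]]            # error vector (x^2, x) from the text
t = [padd(p, f) for p, f in zip(matvec(A, s), e)]   # public key: (A, t), t = A s + e

# encryption of m = 1001, i.e. P(m) = x^3 + 1, scaled by round(q/2) = 4
r, e1 = [[0, 0, 1, 0], [1, 0, 0, 0]], [[1, 1, 0, 0], [1, 0, 0, 0]]
e2, scaled_m = [0, 1, 0, 1], [4, 0, 0, 4]
u = [padd(p, f) for p, f in zip(matvec(transpose(A), r), e1)]   # u = A^T r + e1
v = padd(padd(dot(t, r), e2), scaled_m)                         # v = t^T r + e2 + m

# decryption: m_noisy = v - s^T u, then round: near 4 -> 1, near 0 or 7 -> 0
m_noisy = psub(v, dot(s, u))
bits = [1 if 2 <= c <= 5 else 0 for c in m_noisy]
print(bits)  # -> [1, 0, 0, 1]: the bits of the original message m = 1001 (b0 first)
```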
5.4.5. Kyber as key encapsulation mechanism: Like ECC Diffie-Hellman discussed in section 2.6, Kyber is used as a key encapsulation mechanism (KEM): the asymmetric encryption/decryption protocol described before is mainly used to agree on a symmetric key. Once the symmetric key is agreed on, the information exchanged is encrypted and decrypted based on this shared secret.
5.5. Dilithium
We sketch the essence of Dilithium in this section. Note that the details are more subtle than sketched (e.g. “rounding” means to use high-order bits of the coefficients of the affected polynomials); the details can be found in [40]. Richter, et al. summarize Dilithium as well as other post-quantum cryptography algorithms [41].
Like Kyber, Dilithium is an algebraic-lattice-based cryptographic mechanism. But instead of encryption/decryption, it is about signatures. Dilithium also uses Rq = ℤq[X] / (Xn + 1) with n = 256, but q is much bigger in Dilithium than in Kyber: q = 2^23 − 2^13 + 1 = 8380417.
5.5.1. Key generation: Key generation works like in Kyber: a uniformly random s ∈ Rqm with small coefficients becomes the private key. Next, a uniformly random A ∈ Rqk×m as well as a uniformly random e ∈ Rqk with small coefficients is chosen; based on the latter two, t = A s + e is computed. The pair (A, t) is the public key. m and k are chosen in Dilithium in order to result in different security levels [42], i.e. (m, k) = (4, 4) for security level 2, (m, k) = (6, 5) for security level 3, and (m, k) = (8, 7) for security level 5.
5.5.2. Signature generation: Signatures are computed as follows:
- For each message μ to sign, freshly choose a uniformly random r1 ∈ Rqm with small coefficients.
- Compute w = ρ(A r1).
- ρ is the algorithm rounding coefficients to 0 or ⌊q/2⌉.
- Compute c = Ψ(μ∥w) ∈ Rq.
- Ψ is a hash function based on the function SHAKE 256 (which is a variant of the SHA functions [43]) as well as SHA-256 itself [44].
- The hash function maps its argument to Bh ⊂ ℤq[X] / (Xn + 1), which is the set of polynomials with coefficients in {−1, 0, +1} and exactly h coefficients not being “0”. Dilithium is using h = 60, resulting in more than 2^256 potential polynomials.
- Compute r2 = r1 + c ⋅ s.
- The signature is (r2, c). It is sent together with the message μ to the recipient. The recipient can verify the signature as discussed in section 5.5.3. [45] proves that breaking Dilithium amounts to solving a so-called self-target Module Short Integer Solution Problem (MSISP), i.e. a lattice problem: given a message μ, find y = (r1, c, r2)T with small r1 ∈ Rqm, c ∈ B60, and r2 ∈ Rqk such that Ψ(μ∥(A | s | Ek) ⋅ y) = c ∈ B60. Here, (A | s | Ek) is the matrix consisting of the matrix A (part of the public key) followed by a single column s (the private key) followed by the identity matrix Ek; the similarity to equation (20) already roughly indicates that this is a lattice problem.
5.5.3. Signature verification: When the signed message (μ, (r2, c)) is received, the following computations are performed to verify the signature (the correctness of this verification procedure is shown in the next section):
- Compute w′ = ρ(A r2 − t c).
- (A, t) is the public key.
- (r2, c) is the signature to verify.
- Compute c′ = Ψ(μ∥w′).
- μ is the message claimed to be signed.
- In case c′ = c, the signature is valid; otherwise the signature is invalid.
5.5.4. Correctness of signature verification: The following shows that the signature verification procedure is correct:

ρ(A r2 − t c) =(a) ρ(A(r1 + cs) − (As + e)c) =(b) ρ(Ar1 + Acs − Asc − ec) =(c) ρ(Ar1 − ec) =(d) ρ(Ar1) =(e) w

The first equality (a) results from substituting r2 = r1 + cs and t = As + e. (b) is a simple multiplication, followed by canceling the two terms in the middle in (c). (d) is valid because e has been chosen with small coefficients while c ∈ B60 has coefficients from {−1, 0, +1}, i.e. they are small too; thus, ec has small coefficients, which implies that the rounding ρ is not affected by ec. (e) uses the definition of w. The verification succeeds.
Assume that r2 has not been calculated with the correct private key s (i.e. the substitution in (a) does not hold), i.e. the signature is bogus. But t as part of the public key has been computed with the correct private key s. Consequently, the two terms in the middle of (b) would not cancel out, and the equation in (c) would not hold. Thus, ρ(Ar2 − tc) would result in a value w′ ≠ w. This implies that the second step of the verification process results in c′ ≠ c: the verification fails.
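The sign/verify loop can be sketched in strongly simplified form. The sketch below (all parameter choices ours) works over plain integer vectors mod q instead of polynomial modules, replaces Ψ by a SHAKE-256-based stand-in producing a scalar challenge, and uses a rejection loop so that the rounding step ρ (here: high-order bits) is not disturbed by the noise term c·e, mirroring step (d) of the correctness argument:

```python
import hashlib
import random

q = 2**23 - 2**13 + 1          # Dilithium's modulus
ALPHA = 2**13                  # rounding base: "high bits" of a coefficient
k, m = 3, 3                    # toy dimensions (Dilithium uses e.g. (6, 5))
rng = random.Random(42)

def high_bits(v):
    # the rounding rho: keep only the high-order bits of each coefficient
    return tuple((x % q) // ALPHA for x in v)

def matvec(A, x):
    return [sum(a * b for a, b in zip(row, x)) % q for row in A]

def H(mu, w):
    # stand-in for the hash Psi: maps (message, w) to a small challenge c
    d = hashlib.shake_256((mu + str(w)).encode()).digest(4)
    return (int.from_bytes(d, "big") % 3) - 1        # c in {-1, 0, 1}

# key generation: uniform A, small s and e, t = A s + e
s = [rng.randrange(-2, 3) for _ in range(m)]
e = [rng.randrange(-2, 3) for _ in range(k)]
A = [[rng.randrange(q) for _ in range(m)] for _ in range(k)]
t = [(x + y) % q for x, y in zip(matvec(A, s), e)]

def sign(mu):
    while True:
        r1 = [rng.randrange(-2**10, 2**10) for _ in range(m)]
        w = high_bits(matvec(A, r1))
        c = H(mu, w)
        r2 = [x + c * y for x, y in zip(r1, s)]
        w2 = high_bits([(u - c * v) % q for u, v in zip(matvec(A, r2), t)])
        if w2 == w:            # rejection: retry until rounding absorbs c*e
            return r2, c

def verify(mu, sig):
    r2, c = sig
    w2 = high_bits([(u - c * v) % q for u, v in zip(matvec(A, r2), t)])
    return H(mu, w2) == c

print(verify("attack at dawn", sign("attack at dawn")))  # -> True
```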
6. Why should I care?
There is a whole spectrum of opinions and estimations about when powerful quantum computers will be available (and some even argue that such computers cannot be built at all). The first question to ask is: “Powerful for which purpose?” It is important to note that different application areas require different characteristics of a quantum computer (in terms of the number of qubits, decoherence time, gate fidelity, etc.) to qualify as “powerful”. In our context, a quantum computer that can threaten today’s cryptographic schemes by being able to execute Shor’s algorithm, for example, is referred to as a Cryptographically Relevant Quantum Computer (CRQC) (see section 3.4).
Scholten, et al. estimate that quantum computers powerful enough for several application areas will be available “in the late 2020s” and that cryptographically relevant quantum computers are (a bit) further out [46]. However, CRQCs already threaten security now (see section 6.1). The time a particular company has to counter this threat can be roughly estimated (see section 6.2). Awareness of this threat is growing in society (section 6.3), and standards for a quantum-safe infrastructure are being created (sections 6.4 and 6.5).
6.1. “Harvest now, decrypt later”
Cryptographically relevant quantum computers not only threaten security once they are available - they already do so now. An often discussed attack is referred to as “Harvest Now, Decrypt Later” (HNDL): here, an attacker steals data that is encrypted based on an asymmetric scheme, stores it, and decrypts it once a CRQC is available. If data that must be kept confidential and that is persistently stored is encrypted symmetrically, such data at rest is considered quantum-safe (in the sense of section 3.5). However, data at rest that is asymmetrically encrypted is in danger.
Data in motion (e.g. messages or documents exchanged) is typically encrypted based on a symmetric session key that has been agreed upon at the start of the session based on asymmetric encryption. If this message exchange is captured and stored, the data in motion can be decrypted at a later point in time once a CRQC is available: the asymmetric keys will be broken, allowing the derivation of the symmetric session key and, thus, the decryption of the data itself.
6.2. Mosca’s inequality
Thus, companies should take proper actions to protect themselves from attacks by a CRQC. These actions include migrating to a cryptographic infrastructure that is quantum-safe; in Figure 9, the time needed for this migration is called TMigrate. During this period, and especially at its end, the company very likely generates data that must be kept confidential for a company-specific time, called TConfi in Figure 9. This means that a company must ensure that encrypted data cannot be decrypted for a period of TMigrate + TConfi.
Figure 9: Graphics of Mosca’s Inequality.
The time needed to develop a CRQC is denoted by TCRQC in the figure. Once TCRQC has passed, data is at risk and can be decrypted: a “crypto collapse” may occur. While TConfi is typically well-known to each company itself, and TMigrate can be estimated with the company’s project management techniques, TCRQC is out of the control of a company. Roadmaps of quantum computer vendors can be observed and tracked to derive TCRQC. Assessments of consulting firms can also be considered.
Once the numbers are available, the following inequality known as Mosca’s Inequality [47] can be evaluated:
TMigrate + TConfi > TCRQC (24)
In case inequality (24) holds, a company may get into trouble: the sum of the time data must be kept secret and the time needed to move towards a post-quantum cryptographic infrastructure exceeds the time needed for (some) vendors to provide a cryptographically relevant quantum computer. Thus, the data of the corresponding company is at risk. If possible, the migration time TMigrate may be shortened in order to make TMigrate + TConfi ≤ TCRQC hold. But in case TConfi > TCRQC already holds, the company is in trouble.
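The evaluation of inequality (24) is simple arithmetic once the three estimates are available. The numbers below are hypothetical planning figures (in years); each organization must substitute its own estimates.

```python
def mosca_at_risk(t_migrate: float, t_conf: float, t_crqc: float) -> bool:
    """Evaluate Mosca's inequality (24): the organization is at risk
    if T_Migrate + T_Conf > T_CRQC."""
    return t_migrate + t_conf > t_crqc

# Hypothetical estimates in years:
#   5 years to migrate, data confidential for 10 years, CRQC in 12 years.
print(mosca_at_risk(5, 10, 12))   # 5 + 10 > 12 -> at risk
print(mosca_at_risk(2, 5, 12))    # 2 + 5 <= 12 -> within the safety margin
```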
6.3. Awareness
Considerations like the ones outlined in the previous section also apply to governments and policymakers in general. As a result, the US government, for example, enacted a law [48] “…to encourage the migration of Federal Government information technology systems to quantum-resistant cryptography…”. In reaction to this law, a strategy to migrate to post-quantum cryptography has been worked out [49], and the cost of migrating the IT systems of the US government has been estimated at more than 7 billion US$ by 2035.
Similarly, the European Commission recommends a coordinated implementation roadmap for the transition to post-quantum cryptography for its member states [50], which is “…necessary for Europe to look for stronger safeguards, ensuring the protection of sensitive communications and the long-term integrity of confidential information, i.e., by switching to Post-Quantum Cryptography as swiftly as possible”. Furthermore, the intent of this roadmap is “…to foster the transition to Post-Quantum Cryptography for the protection of digital infrastructures and services…”.
DigiCert and the Ponemon Institute published a study in 2023 about how industrial organizations worldwide are addressing post-quantum safety [51]. About 1500 practitioners from the US, the EU, and the Asia-Pacific region were surveyed who claimed to be acquainted with their organization’s work in the area of post-quantum cryptography. The result is that 61% of the practitioners are very concerned that their organizations are not prepared to address the security threats posed by quantum computing. As many as 74% are concerned about “harvest now, decrypt later” attacks. Only 23% of the respondents have a strategy towards post-quantum safety, and the same percentage (23%) have no plan to build such a strategy at all.
In summary, it seems that government organizations take post-quantum security more seriously than industrial organizations.
6.4. Standardization
Standards are important for post-quantum security: communication partners must agree on a joint protocol to secure their data exchange, for example. The National Institute of Standards and Technology (NIST) is in the process of standardizing quantum-safe algorithms [52,53]. The process consists of several phases: first, submissions of quantum-safe algorithms were solicited. The submitted algorithms were evaluated, and positively evaluated algorithms became candidates for standards. These candidates were then turned into specifications, which were reviewed by a global community. Those specifications that did not fail the review became Federal Information Processing Standards (FIPS).
In particular, Kyber and Dilithium have been standardized [54,55]; note that NIST officially uses the names Module-Lattice-Based Key-Encapsulation Mechanism (ML-KEM) and Module-Lattice-Based Digital Signature Algorithm (ML-DSA) in its specifications, which reflects the mathematical underpinnings of the algorithms. Other algorithms are also under standardization by NIST or remain candidates for future standardization. Richter et al. give an overview of the mathematics of the algorithms finally considered by NIST for standardization [41].
To support crypto-agility (next section), NIST identified 14 candidates for standardization in a second round [56]. These candidates are signature algorithms that will be analyzed according to the phases sketched before.
6.5. Crypto-agility
The algorithms that are currently standardized are believed to be quantum-safe, i.e. there are no known classical or quantum algorithms that can crack them. But there is no guarantee (in the sense of a mathematical proof) that they cannot be broken in the future. In fact, there is some evidence that a couple of them will eventually be broken [57], and one of the NIST candidates has already been broken [58]. However, based on the best knowledge of the majority of the cryptographic community, the standards currently being rolled out are post-quantum-safe.
Because being post-quantum-safe is volatile in this sense, any cryptographic infrastructure that is considered post-quantum-safe as a whole must cope with situations in which one or more of the algorithms considered quantum-resistant today is cracked. It must be possible to easily remove the corresponding algorithms from the infrastructure so that they are no longer used. Furthermore, it should be possible to substitute these algorithms with others that are still considered quantum-resistant at that point in time. Thus, the infrastructure must be crypto-agile [59].
Figure 10 depicts the high-level architecture of such an agile cryptographic infrastructure. It is plugin-based, i.e. the implementations of the algorithms are plugins that can be easily exchanged. Also, more plugins can be added whenever implementations of new post-quantum-safe algorithms appear.
Figure 10: Plugin-Based Cryptographic Infrastructure.
Furthermore, algorithms can be combined, i.e. a classical algorithm C is not necessarily replaced by its quantum-safe counterpart Q; instead, both algorithms are used together (a.k.a. hybrid cryptography). For example, a document may be encrypted with the classical AES algorithm as well as with the quantum-safe Kyber algorithm. This is an example of the C → Q approach, where the classical algorithm C is applied first and the corresponding quantum-safe algorithm Q is applied to the result of the preceding step. Obviously, Q → C is also a valid approach. Classical and quantum-safe algorithms might also be applied in parallel, i.e. a C∥Q approach is followed. For example, a signature may be computed with a classical algorithm as well as with a quantum-safe algorithm, and finally both signatures are associated with the signed object. Figure 11 depicts the C → Q approach for signing a message: first, the message is signed classically by using the Elliptic Curve Digital Signature Algorithm (ECDSA), and the resulting object is then signed by means of Dilithium. The advantage of such combined approaches is that they are at least as secure as the current classical algorithm C, while allowing trust to be gained in the quantum-safe algorithm Q and its implementations.
Figure 11: Signing a Message Using the C→Q Approach.
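The composition behind the C → Q signing approach can be sketched as follows. Keyed HMACs serve here as stand-ins for both ECDSA and Dilithium (the keys and function names are illustrative only); what matters is the structure: the quantum-safe signature covers the message together with the classical signature.

```python
import hashlib
import hmac

# Stand-in keys; in reality these would be an ECDSA and a Dilithium key pair.
CLASSICAL_KEY = b"ecdsa-stand-in-key"
PQ_KEY = b"dilithium-stand-in-key"

def sign_classical(msg: bytes) -> bytes:
    """Stand-in for ECDSA signing."""
    return hmac.new(CLASSICAL_KEY, msg, hashlib.sha256).digest()

def sign_pq(msg: bytes) -> bytes:
    """Stand-in for Dilithium signing."""
    return hmac.new(PQ_KEY, msg, hashlib.sha256).digest()

def sign_c_then_q(msg: bytes) -> tuple:
    sig_c = sign_classical(msg)       # step 1: classical signature C
    sig_q = sign_pq(msg + sig_c)      # step 2: Q signs the result of step 1
    return sig_c, sig_q

def verify_c_then_q(msg: bytes, sig_c: bytes, sig_q: bytes) -> bool:
    return (hmac.compare_digest(sig_c, sign_classical(msg))
            and hmac.compare_digest(sig_q, sign_pq(msg + sig_c)))

sig_c, sig_q = sign_c_then_q(b"hello")
```

An attacker would have to break both layers: forging only the classical signature invalidates the quantum-safe signature over the combined object, and vice versa.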
6.6. Potential impact in practice
A frequently asked question concerns the impact of the threat posed by quantum computers on one’s own practice. Not surprisingly, there is no generally applicable answer to this complex question. Several firms offer corresponding services. Also, publications like the ones by DigiCert and the Ponemon Institute [51], Macaulay and Henderson [59], or the National Cybersecurity Center of Excellence [60] (to name but a few) may provide initial guidance.
However, the first step is to understand the problems and associated technologies (the article at hand may help). Next, vulnerable areas (e.g. data in transit, data stored) have to be identified; Mosca’s inequality (see section 6.2) may give indications about the urgency to act. Based on this, the impact on the current IT infrastructure should be analyzed; available standards (see section 6.4) are helpful here. Likely, the emerging architecture should be agile (see section 6.5). Prototypes based on open-source implementations (see chapter 7) will be helpful.
7. In brief: Sample open source activities
Several of the algorithms that are being standardized by NIST (and others) are available as open-source implementations. Other implementations that use this code, for example in communication protocols, are available as open source too. In this chapter, we sketch a particular stream of open-source activities that has significant industry support; other activities are not covered.
7.1. Open quantum safe
A consortium named the Post-Quantum Cryptography Alliance (PQCA), whose members include AWS, Cisco, Google, and IBM, “aims to support the transition to quantum-resistant cryptography” [61]. Amongst other activities, the PQCA runs a project called Open Quantum Safe (OQS) that builds implementations of post-quantum algorithms and usages thereof [62].
OQS provides a library called liboqs (see next section) for algorithm implementations. These implementations are also used to build quantum-safe implementations of TLS, SSH, etc. [65]. The latter are built by providing branches of OpenSSL and OpenSSH, for example, that make use of liboqs. The quantum-safe branch of TLS, for example, has been used to demonstrate how to enable web servers like Apache httpd or Nginx to become quantum-safe.
Figure 12 shows how HTTPS can become quantum-safe: HTTPS is HTTP over TLS, so HTTPS can be turned into a quantum-safe protocol by making TLS quantum-safe. Classically, TLS by default uses a mechanism based on an encrypted secret random number, or it can be set up to exploit Diffie-Hellman (refer to section 2.6). By using Kyber as the key encapsulation mechanism, TLS becomes quantum-safe, and this implies the quantum-safeness of HTTPS too.
Figure 12: Making HTTPS Quantum-Safe.
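The interface a KEM like Kyber offers to TLS consists of three operations: key generation, encapsulation, and decapsulation. The toy below illustrates only this interface shape and the resulting handshake flow; its XOR-based “math” (and the fact that the public key equals the secret key) is insecure and merely stands in for the real lattice operations.

```python
import os

def keygen() -> tuple:
    """Generate a key pair. Toy only: here pk == sk, unlike any real KEM."""
    sk = os.urandom(32)
    pk = sk
    return pk, sk

def encapsulate(pk: bytes) -> tuple:
    """Pick a fresh shared secret and wrap it under the public key."""
    shared_secret = os.urandom(32)
    ciphertext = bytes(a ^ b for a, b in zip(shared_secret, pk))
    return ciphertext, shared_secret

def decapsulate(sk: bytes, ciphertext: bytes) -> bytes:
    """Unwrap the ciphertext with the secret key to recover the secret."""
    return bytes(a ^ b for a, b in zip(ciphertext, sk))

# TLS-style handshake: the client encapsulates against the server's public
# key; both sides end up with the same symmetric session secret, which then
# protects the actual HTTP traffic.
server_pk, server_sk = keygen()
ct, client_secret = encapsulate(server_pk)
server_secret = decapsulate(server_sk, ct)
```

In a quantum-safe TLS handshake, this keygen/encapsulate/decapsulate exchange (with ML-KEM in place of the toy) replaces the classical key agreement step; everything after the shared secret is established stays symmetric and hence unchanged.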
7.2. Open source activities
A collection of implementations of quantum-safe algorithms is provided by OQS under an MIT license as an open-source C library on GitHub [64]. Wrappers for other programming languages, for example Java, Python, and C++, are provided. Besides implementations of Kyber and Dilithium (or ML-KEM and ML-DSA, respectively), the library also contains implementations of other key encapsulation mechanisms and signature schemes.
It is explicitly mentioned on the website of liboqs that the library is of a prototypical nature only, i.e. it should not be used in production environments and it should not be used to protect sensitive data. In contrast, the goals of the Post-Quantum Cryptography Coalition (PQCC) [65] (whose founding members include members of the PQCA) explicitly include the creation of corresponding production-quality code and ensuring cryptographic agility (refer to section 6.5).
7.3. Sandwich
Sandwich is an open-source implementation that provides cryptographic agility and enables post-quantum-safeness by using liboqs [66]. It defines a unified API for choosing cryptographic algorithms and protocols at runtime. Thus, no code needs to be modified to exchange algorithm implementations (instead, appropriate configuration files are passed at runtime).
8. Conclusion
In this paper, we described the origins of the need for the efforts subsumed under the term post-quantum security. The relevant number-theoretic background was provided as the underpinning of security schemes based on factorization. The algebra of elliptic curves was sketched and security schemes on top of it were described. We showed that both schemes in fact rely on the hardness of computing discrete logarithms. Shor’s algorithm was explained in brief, showing that security schemes relying on discrete logarithms are broken as soon as sufficiently powerful quantum computers become available (which may very well be within the next couple of years).
Lattice-based cryptography is seen as a remedy. We introduced lattices and the fundamentally hard problems that provide the basis for corresponding security schemes. To avoid the huge keys implied by the high security levels required in practice, algebraic lattices were introduced, and two corresponding cryptographic schemes called Kyber and Dilithium were discussed.
Standardization efforts in this domain were reported, and related open-source activities were mentioned. The “harvest now, decrypt later” attack was described and its impact on organizations was discussed. Crypto-agility was emphasized as an important principle.
Author contributions
Writing – original draft, F.L.; Writing – review & editing, J.B. All authors have read and agreed to the published version of the manuscript.