You could have broken SIDH
In October 2017, I gave a presentation titled You could have invented Supersingular Isogeny Diffie–Hellman, intended as a gentle and relatively down‑to‑earth introduction to SIDH. It was the first time I spoke publicly during my PhD. Back in those days, I was trying very hard — and failing similarly hard — to put a dent in the security of SIDH.
Now, almost 5 years later, SIDH has finally given in: A devastating attack due to Wouter Castryck and Thomas Decru [paper] breaks all proposed SIKE parameter sets in mere hours on a laptop (and this has been sped up further since). Fundamentally the same attack strategy was semi‑independently1 discovered by Luciano Maino and Chloe Martindale [paper]. Both of these initial versions of the attack came short of completely breaking all variants of SIDH, spurring some hopes and dreams that the scheme might be salvageable after all by carefully choosing new parameters. However, just a few days after the initial announcement, Damien Robert realized how to fully weaponize the trick that was implicit in Castryck–Decru and Maino–Martindale's application of Kani's theorem. His approach relies on some extremely heavy computer‑algebra machinery and might actually be less practical for parameters of cryptographic interest, but it is conceptually enlightening and leads to an unconditional polynomial‑time attack on all previously proposed SIDH variants [paper].2
I've heard people characterize the attack as a stroke of luck in which a bunch of researchers finally stumbled upon the right obscure 25‑year‑old SIDH‑killer theorem by pure chance, prompting the scheme to spontaneously combust in a violent manner. I don't believe this interpretation is entirely correct: While the eventual slayers of SIDH did discover the crucial technical tool for the attack in a somewhat roundabout and surprising way,3 especially Damien's incarnation of the attack really does bear a glaring resemblance to older SIDH almost‑attacks. In the following, benefiting from the virtue of hindsight, I'll try to explain how we've been missing the forest for the trees.
The problem
Here's a short summary of the SIDH public‑key problem, omitting details and liberally applying polynomial‑time equivalences: The public data are elliptic curves \(E_0\) and \(E\) connected by a secret isogeny \(\varphi\colon\; E_0\to E\) of known smooth degree \(A\), and a computationally efficient representation of the restriction \(\varphi\rvert_{E_0[B]}\) of \(\varphi\) to the \(B\)‑torsion subgroup \(E_0[B]\) where \(B\) is a smooth integer coprime to \(A\). Parameters are chosen such that computing in \(E_0[AB]\) and \(E[AB]\) is efficient.
The goal is to recover the kernel of \(\varphi\). It is not overly hard to see that this reduces to evaluating \(\varphi\) on the \(A\)‑torsion: Since \(A\) is smooth, the matrix of \(\varphi\) on \(E_0[A]\) can be computed by evaluating \(\varphi\) on a basis of \(E_0[A]\) and computing a few two‑dimensional discrete logarithms, and deducing the kernel from the matrix is just linear algebra modulo \(A\).
Naïve approaches to evaluating \(\varphi\) on the \(A\)‑torsion given its restriction to the \(B\)‑torsion are doomed to fail: As groups, we have a direct‑product decomposition \( E_0[AB] \cong E_0[A] \times E_0[B] \), which suggests one cannot deduce anything about \(\varphi\rvert_{E_0[A]}\) from \(\varphi\rvert_{E_0[B]}\) using group‑theoretic means alone. Computing an explicit equation for \(\varphi\) from its restriction to the \(B\)‑torsion is information-theoretically possible and boils down to a polynomial-interpolation problem, but the output is exponentially big, hence this cannot be efficient. (More about how to not break SIDH can be found in my paper with Chloe!)
Torsion‑point attacks
Until a little more than three weeks ago, the only nontrivial4 attack on SIDH variants was the family of "torsion‑point attacks"5 pioneered by Christophe Petit [paper]. The core idea is to construct an endomorphism \(\tau\) on \(E\) whose definition incorporates information from \(\varphi\), such that the degree of the endomorphism shares a large divisor with \(B\). Evaluating the endomorphism \(\tau\) on the \(B\)‑torsion is possible using the given restriction \(\varphi\rvert_{E_0[B]}\), and in some cases this suffices to recover \(\tau\) explicitly. Unwrapping the definition of \(\tau\) then reveals \(\varphi\).
Concretely, Christophe suggested using endomorphisms \(\tau\) of \(E\) defined as \[ \tau = \varphi \circ \vartheta \circ \widehat\varphi + [d] \,\text, \] where \(\vartheta\) is an endomorphism of \(E_0\) and \(d\in\mathbb Z\) defines the scalar multiplication \([d]\) on \(E\). Let's suppose for the moment that we can find a pair \((\vartheta,d)\) such that the degree of \(\tau\) equals \(B\). Note that we don't really "have" \(\tau\) since \(\varphi\) is unknown, but there is something we can do: Using the given restriction \(\varphi\rvert_{E_0[B]}\), we can apply some linear algebra to deduce \(\widehat\varphi\rvert_{E[B]}\), and hence we can evaluate \(\tau\rvert_{E[B]}\) as \[ \tau\rvert_{E[B]} = \varphi\rvert_{E_0[B]} \circ \vartheta \circ \widehat\varphi\rvert_{E[B]} + [d] \,\text. \] Computing \(\tau\) then proceeds as outlined earlier: Evaluate on the \(B\)‑torsion, compute discrete logarithms (easy as \(B\) is smooth), recover \(\ker\tau\) and use it to construct \(\tau\) explicitly. Then evaluate \[ \tau-[d] \] on the \(A\)‑torsion and use discrete logarithms and linear algebra to recover \[ \ker(\tau-[d]) \cap E[A] \,\text. \] Since \(\tau-[d] = \varphi\circ\vartheta\circ\widehat\varphi\), this subgroup clearly contains \(\ker\widehat\varphi\), and one can show that in cases of interest the index is small enough to identify \(\ker\widehat\varphi\) inside the larger kernel.
The major issue with Christophe's approach is that we simply can't expect there to be suitable pairs \((\vartheta,d)\) when \(A\) and \(B\) are of comparable size: Going back and forth via \(\varphi\) means the degree of \(\tau\) will be at least \(A^2\), and things get much worse if we also analyze the condition that we'd like the degree of \(\tau\) to share a large factor with \(B\).
In a more recent paper we made some appreciable progress regarding these restrictions, including a slightly different strategy to construct a map playing the role of \(\tau\), and bumping the right‑hand side from \(B\) to \(B^2\) by exploiting the dual isogeny. With these tricks, the subset of the parameter space breakable by torsion‑point attacks was beginning to creep up on "standard" SIDH, but the desperately needed additional improvements to cover the final stretch simply didn't emerge.
The solution
Now, here's the thing: Damien's version of the attack follows almost exactly the same strategy, augmented with one brilliant trick: \[ \textbf{more dimensions} \implies \textbf{more endomorphisms} \text. \] Below, I will first show how to plug this idea into the torsion‑point attack framework outlined above, then briefly discuss the differences to the published attack descriptions.
In a nutshell, the core idea behind the new attacks is that considering products of elliptic curves provides us with a lot more flexibility writing down endomorphisms. For example, supposing for the moment that we have no knowledge of any non‑scalar endomorphisms on \(E\), the only endomorphisms we can write down on \(E\) are scalar multiplications \([u]\), whose degrees are integer squares \(u^2\). Passing to the product \(E\times E\) helps tremendously in that we immediately acquire plenty of endomorphisms given by linear combinations between the components: Using matrix notation, the endomorphism \[ M = \begin{pmatrix}u & v \\ -v & u\end{pmatrix} \] maps \((P,Q)\in E\times E\) to \(([u]P+[v]Q, [-v]P+[u]Q) \in E\times E\). Now, a simple computation reveals that \[ \widehat M\cdot M = M\cdot \widehat M = (u^2 + v^2)\cdot\mathrm{Id} \,\text, \] where \(\widehat{\phantom m}\) is the conjugate‑transpose, which shows that \(M\) defines a \((u^2+v^2)\)‑endomorphism.6 Thus, passing from \(E\) to \(E\times E\) has rewarded us with the ability to immediately write down endomorphisms whose degrees are sums of two squares!7
As you may have guessed, there is no reason to stop here: Adding more components gives us more freedom to cover more and more degrees. The obvious question for the correct dimension to use is answered by a remarkable fact of classical number theory:
Theorem. Every positive integer is a sum of four integer squares.What this entails in our context is that the fourth power \(E^4 = E\times E\times E\times E\) has easily constructible endomorphisms of any given degree.8
With this in mind, the rest of the torsion‑point attack is (relatively) easy. Recall that we were trying to find an endomorphism \(\vartheta\) of \(E_0\) and an integer \(d\) such that \[ \tau = \varphi\vartheta\widehat\varphi + [d] \] has degree \(B^2\). Now, we would like to replace \(\tau\colon\; E\to E\) by \(T\colon\; E^4\to E^4\), in such a way that we can pull off the torsion-point attack on the four‑dimensional variety instead. To do so, we will write \[ T = \varphi\vartheta\widehat\varphi + \Delta \] where \(\varphi\vartheta\widehat\varphi\) acts diagonally on the four components and \(\Delta\) is another endomorphism. The main condition for this to work is that \(T\) must be a \(B^2\)‑endomorphism, which incurs a few restrictions on valid choices of \(\Delta\).
Here's one possible construction: Assume with some loss of generality that the starting curve has a trace‑zero9 endomorphism \(\vartheta\) of "small" degree \(C\), such that \(A^2C \leq B^2\).10 Assume further, this time without loss of generality, that \(D := B^2 - A^2C\) can be written as a sum of three integer squares,11 say \(D = u^2 + v^2 + w^2\). Now, we apply the magic trick of replacing \(E\) by \(E^4\): On this product, we can easily verify that the matrix \[ \Delta = \begin{pmatrix} u & v & 0 & w \\ v & -u & w & 0 \\ 0 & w & u & -v \\ w & 0 & -v & -u \end{pmatrix} \] defines a symmetric \(D\)‑endomorphism, such that \(T = \varphi\vartheta\widehat\varphi + \Delta\) is a \(B^2\)‑endomorphism on \(E^4\) as desired.
The rest of the attack proceeds exactly as before: Evaluate \(T\) on the \(B\)‑torsion to recover the \(B\)‑torsion part of its kernel, Evaluate \(\widehat T\) on the \(B\)‑torsion to recover the \(B\)‑torsion part of its kernel, reconstruct \(T\) explicitly from these two partial kernels, evaluate \(T-\Delta\) on the \(A\)‑torsion to recover the part of its kernel contained in the \(A\)‑torsion, and finally deduce the kernel of \(\widehat\varphi\).
The actual published attacks can be understood as somewhat more optimized implementations of this approach: Constructing an endomorphism on the "mixed" product \(E_0\times E\) is preferable to using powers of \(E\) only, as it allows for endomorphisms of the form \[ \begin{pmatrix} \alpha_0 & \widehat\varphi \\ -\varphi & \widehat\alpha \\ \end{pmatrix} \,\text, \] whose degree involves only a single power of \(A\) rather than the \(A^2\) we'd obtained above by first reducing to the special case of a secret endomorphism \(\varphi\vartheta\widehat\varphi\) of \(E\). Castryck–Decru and Maino–Martindale indeed use this product \(E_0 \times E\), which (similarly to the "old" torsion‑point attacks!) was causing the attack to run into some trouble with norm equations in the general case. The key insight from Damien Robert is that taking four copies each of \(E_0\) and \(E\) makes all these problems vanish at once, known endomorphisms or not.
The downside is that \(E_0^4\times E^4\) is not exactly a convenient object to work with: While algorithms for isogenies of abelian varieties are known whose complexity is polynomial in \((\log(q),\,\log(n),\,\ell)\) where \(q\) is the base‑field cardinality, \(n\) the isogeny degree, and \(\ell\) its largest prime factor, the complexity remains exponential in the dimension, contributing a massive constant factor to the cost.
End
There is no real take‑home message from all of this: It's just that I certainly found it helpful to view the recent developments through the lens of previous, less impactful torsion‑point attacks, and I hope others will feel the same. Who knows what other "failed" attack ideas are just one clever trick away from wreaking havoc!12
Finally, a disclaimer: I don't have a very deep understanding of isogenies in higher dimension and some things I said here might well be wrong. Proceed with caution.
- ↑ Luciano had been working with Wouter and Thomas on a related topic.
- ↑ By now, two new variants of SIDH designed to withstand these attacks have been proposed: They are based on secret degrees and masked auxiliary points.
- ↑ As is common in science!
- ↑ Non‑nontrivial attacks here refer to "pure" isogeny search, ignoring the additional input \(\varphi\rvert_{E_0[B]}\).
- ↑ I mildly dislike the name as it doesn't actually mean anything: Over a finite field, every point is torsion anyway. To me, this is fundamentally about exploiting knowledge of the restriction of a secret isogeny to a subgroup. (Indeed, note how my description of the SIDH problem above doesn't even mention the auxiliary points!)
- ↑ The notation "\(\ell\)‑endomorphism" is shorthand for an endomorphism which is a \((\ell,...,\ell)\)‑isogeny.
- ↑ It has also rewarded us with a truckload of nontrivial mathematical and algorithmic questions concerning the computation of isogenies on that product, but I digress.
- ↑
Indeed, a \((u^2+v^2+w^2+x^2)\)‑endomorphism
is given by the matrix
\(\scriptsize\begin{pmatrix}u&v&w&x\\-v&u&-x&w\\-w&x&u&-v\\-x&-w&v&u\end{pmatrix}\).
Note that dimension four is somewhat special: It is not in general straightforward, or even possible, to write down such a matrix for a given sum of squares. - ↑ The trace‑zero condition means \(\vartheta+\widehat\vartheta = 0\), which implies that the degrees of \(\vartheta\) and integer endomorphisms will combine additively.
- ↑ Note that \(A\) and \(B\) can be swapped if needed. The "old" SIKE starting curve \(E_0\colon\; y^2 = x^3 + x\) has the order‑\(4\) automorphism \(\iota\colon\; (x,y)\mapsto (-x,\sqrt{-1}\cdot y)\). The "new" SIKE starting curve \(E_6\colon\; y^2 = x^3 + 6x^2 + x\) is a \(2\)‑isogeny neighbour of \(E_0\), hence has a degree‑\(4\) endomorphism induced by \(\iota\).
- ↑ The fact that the the choice is restricted to sums of only three squares is a result of the condition that \(\Delta\) must be given by a symmetric matrix in order to turn \(T\) into a \(B^2\)‑endomorphism. Note that we can reduce to the case that \(D\) is a sum of three squares by varying \(C\), \(A\), or \(B\). [Every integer which is not a sum of three squares is of the form \(4^k(8i+7)\).]
- ↑ This seems like a good moment to recommend that readers submit all their perceived failures to a venue like CFAIL.
Thanks: To Damien Robert for pointing out that \(\Delta\) must be symmetric for this formulation of the attack to work (else \(T\) will not be a \(B^2\)‑endomorphism as I'd mistakenly claimed earlier); to Fré Vercauteren for his insightful remarks including on the sum‑of‑three‑squares situation; and to Tanja Lange for her help rectifying some instances of unclear exposition.