ECDSA White-Box Implementations: Attacks and Designs from CHES 2021 Challenge

Abstract. Despite the growing demand for software implementations of ECDSA secure against attackers with full control of the execution environment, scientific literature on ECDSA white-box design is scarce. The CHES 2021 WhibOx contest was thus held to assess the state of the art and encourage relevant practical research, inviting developers to submit ECDSA white-box implementations and attackers to break the corresponding submissions. In this work, attackers (team TheRealIdefix) and designers (team zerokey) join to describe several attack techniques and designs used during this contest. We explain the methods used by the team TheRealIdefix, which broke the most challenges, and we show the efficiency of each of these methods against all the submitted implementations. Moreover, we describe the designs of the two winning challenges submitted by the team zerokey; these designs represent the ECDSA signature algorithm by a sequence of systems of low-degree equations, which are obfuscated with affine encodings and extra random variables and equations. The WhibOx contest has shown that securing ECDSA in the white-box model is an open and challenging problem, as no implementation survived more than two days. In this context, our designs provide a starting methodology for further research, and our attacks highlight the weak points future work should address.


Introduction
Cryptographic techniques are primarily designed to be secure in a context where the confidentiality of secret keys is ensured with black-box access to the algorithm: only inputs and outputs are available to the attacker. Confidence in security is built from detailed studies, carefully defined security notions, and security proofs. Such a strong level of confidence is now a standard expectation. However, real-life scenarios for implementations might jeopardize initial assumptions, where attackers have access to additional information via side channels (e.g., timing or power consumption) or can modify the algorithm execution and exploit faulty results. This is called the grey-box model. Developers have to put countermeasures in place to reach the originally expected security level.
In the context of mobile applications (contactless payments, cryptocurrency wallets, streaming services) or connected objects, devices often lack secure storage to protect secret keys, and their generally open execution environment exposes a large attack surface. This hostile environment is captured by the white-box model, which assumes an attacker having control of every aspect of the implementation: execution flow, memory content and addresses. The first white-box implementations were proposed in the early 2000s by Chow et al. [CEJvO02, CEJv03], and the field has continuously developed since then, with design proposals [BG03, BCD06, XL09, Kar11, DFLM18, RW19, SEL21, BCC21], attacks [BGEC04, GMQ07, WMGP07, MGH09, DWP10, DRP13, LRD+14, AMR19, GRW20] and efforts to define security notions [SWP09, DLPR14, AABM20].
The industry shows a growing interest in white-box cryptography owing to the widespread usage of security-related applications on connected devices. The WhibOx contest, attached to the CHES conference, has been held biennially since 2017 to encourage practical experiments both from the designer and attacker perspectives. It lasts several months, inviting coders to post white-box implementations and attackers to break them. Participants can remain anonymous and silent about any detail on their work. The first two editions in 2017 and 2019 focused on white-box implementations of AES and exhibited the community's strong interest in this subject. Some candidates survived all attacks in the second edition in 2019, showing a certain maturity for this algorithm. In 2021, organisers changed the target and decided to consider the ECDSA signature algorithm, whose white-box implementation is of substantial interest to the industry but virtually lacks scientific literature.
From May 17th to August 22nd, 2021, 97 candidate implementations were submitted for scrutiny by 37 (teams of) attackers. All challenges were broken within 35 hours, suggesting the difficulty of achieving a secure white-box implementation of ECDSA. Thus, studying the attacks would help to discern weak points inside the implementations. Besides, the analysis of the design of the most resistant challenges, which successfully defeated most attackers, would also give directions for future designs.
Contributions. In this paper, teams TheRealIdefix (who broke the most challenges) and zerokey (who proposed the two winning challenges) join to present how they proceeded during the contest. On the attack side, we describe a strategy to achieve efficient attacks. As reverse engineering is a time-consuming task, automated attacks are desirable. We consider different attack paths against ECDSA white-boxes: the ones inherited from traditional cryptanalysis, the extensions of attacks in the grey-box model, and the logical attacks of the software. We discuss the feasibility of automating each attack path and provide detailed information regarding which attacks succeeded (or failed) on each candidate. Our results show that, with few exceptions, these automated attacks were sufficient to fully recover the private key. On the design side, we describe the methodology we used to build the two winning challenges, Challenges 226 and 227. It includes modifying the implicit framework [RVP22] originally proposed for block ciphers, applying techniques from multivariate public-key cryptosystems, and obfuscating the resulting C code with a C obfuscator. Our design thus turns the ECDSA signature algorithm into a sequence of systems of low-degree equations which are obfuscated with large affine encodings and additional variables and equations. Finally, we show how to break Challenge 226 with automated attacks and how to break Challenge 227 once reverse-engineered.
Outline. The paper is organized as follows. Section 2 outlines the rules of the WhibOx 2021 contest. Section 3 recalls the ECDSA algorithm and the state-of-the-art regarding white-box implementations. Section 4 presents the different methods that have been used by the team TheRealIdefix to break various implementations and some statistics regarding the success rate of these methods. Section 5 discloses the designs of Challenges 226 and 227 proposed by the team zerokey, and Section 6 concludes this paper.

Rules of the WhibOx 2021 Contest
Designers were required to post challenges computing ECDSA signatures on the NIST P-256 curve under a hard-coded, freely chosen key, and accepting as input any 256-bit message digest $e = H(m)$. Notice that the cryptographic hash function $H$ is excluded from the intended white-box implementation of ECDSA and the message $m$ is also not provided. At the same time, attackers were encouraged to extract the private keys. In addition, acceptance of submitted implementations was conditioned on some requirements:
• the public key corresponding to the embedded private key, as well as a proof of knowledge of the private key, had to be provided,
• submissions had to be source code in portable C,
• linking to external libraries was forbidden, except for the GNU Multi Precision library [Gt20],
• the signature algorithm had to be deterministic,
• the execution time was limited to 3 seconds, the program size to 20 MB, and the RAM usage to 20 MB as well.
There was an elaborate system with scoreboards to reward designers and attackers. A challenge gains strawberries as time goes by till broken. Challenges with a higher performance score (measured in terms of execution time, code size, and RAM usage) gain strawberries faster. Eventually, the challenge with the highest number of strawberries wins the competition. Accordingly, when submitting a matching private key to the system, attackers receive bananas, the number of which is determined by the number of strawberries of the challenge at the time of the break. More detailed information can be found on the contest website [CHE].

ECDSA
In 1992, Vanstone introduced a variant of DSA based on elliptic curves. The resulting public-key signature algorithm is called Elliptic Curve Digital Signature Algorithm (ECDSA) [Van92]. Its parameters are an elliptic curve $E$ over a field $\mathbb{F}_q$, a point $G$ of prime order $n$, and a cryptographic hash function $H$. The private key $d$ is randomly drawn from $[1, n-1]$, and the public key consists of the point $Q = [d]G$, where $[d]G$ corresponds to the scalar multiplication of the point $G$ by the scalar $d$. The ECDSA signature is described in Algorithm 1, where $R_x$ and $R_y$ denote the coordinates of the point $R$.
Algorithm 1: ECDSA signature
Input: the message m
Output: the signature (r, s)
1 e ← H(m)
2 k ← random element of [1, n − 1]
3 R ← [k]G
4 r ← R_x mod n
5 s ← k^{-1}(e + rd) mod n
6 if r = 0 or s = 0 then
7     Go to step 2
8 end
9 Return (r, s)

Note that the key $d$ is not the only sensitive value in that scheme. Indeed, the recovery of the nonce $k$ allows the computation of $d$ from the signature $(r, s)$ and the message $m$:
$$d = (ks - H(m))\,r^{-1} \bmod n. \tag{1}$$
The nonce must not only remain secret but also differ for each execution of the algorithm. Indeed, an efficient way to recover its value is to find another signature $(r', s')$ of a different message $m' \neq m$ using the same nonce, that is with $k' = k$. In that case, we also have $r' = r$, so the adversary may compute
$$k = (H(m) - H(m'))\,(s - s')^{-1} \bmod n. \tag{2}$$

In the black-box model, the security of ECDSA is based on the difficulty of the Elliptic Curve Discrete Logarithm Problem (ECDLP), i.e., on the difficulty of computing the scalar $k$ (resp. $d$) from the points $G$ and $R = [k]G$ (resp. $Q = [d]G$). To ensure that this problem is difficult to solve, there are several standards to define elliptic curves, e.g. [Loc10, Sta10, JOR11, FIP13]. However, there is a gap between the security of ECDSA in theory and that of ECDSA implementations. Many grey-box attacks have been described in the literature (see for example [FV12]). Some of them directly target the key $d$ while others aim at recovering some information on the nonce $k$. As explained previously, the knowledge of the nonce allows an adversary to compute the secret key.

Recovering a few bits of the nonces associated with different signatures may be enough for an attacker. Indeed, this allows the construction of a system of equations that can be solved using lattice-based algorithms [BH19, JSSS20] or Bleichenbacher's FFT-based approach [ANT+20]. These bits could, for example, be recovered via side-channel analysis if the implementation is not protected or simply guessed if the nonce is not drawn uniformly at random. These attacks show that it is already complicated to achieve a secure implementation of ECDSA in the grey-box model, and of course, things get worse in the white-box context.
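The two key-recovery relations of this section, $d = (ks - e)r^{-1} \bmod n$ for a known nonce and $k = (e - e')(s - s')^{-1} \bmod n$ for a reused nonce, can be checked with a few lines of modular arithmetic. The following is only a minimal sketch: the helper names are ours, and `toy_r` is a hypothetical stand-in for $r = ([k]G)_x \bmod n$ (the algebra only needs $r$ to be a fixed function of $k$), so no curve arithmetic is performed.

```python
import hashlib

# Order n of the NIST P-256 base point (public curve parameter).
N = 0xFFFFFFFF00000000FFFFFFFFFFFFFFFFBCE6FAADA7179E84F3B9CAC2FC632551

def toy_r(k: int) -> int:
    """Hypothetical stand-in for r = ([k]G)_x mod n: a fixed function of k."""
    digest = hashlib.sha256(k.to_bytes(32, "big")).digest()
    return int.from_bytes(digest, "big") % N or 1

def sign(e: int, d: int, k: int):
    """Toy signature: s = k^{-1} (e + r d) mod n with the stand-in r."""
    r = toy_r(k)
    s = pow(k, -1, N) * (e + r * d) % N
    return r, s

def key_from_nonce(e: int, r: int, s: int, k: int) -> int:
    """Eq. (1): d = (k s - e) r^{-1} mod n."""
    return (k * s - e) * pow(r, -1, N) % N

def nonce_from_reuse(e1: int, s1: int, e2: int, s2: int) -> int:
    """Eq. (2): k = (e1 - e2) (s1 - s2)^{-1} mod n, valid when both signatures share k."""
    return (e1 - e2) * pow((s1 - s2) % N, -1, N) % N
```

An attacker who observes two signatures with the same $r$ would first call `nonce_from_reuse` and then feed the result to `key_from_nonce`.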

White-box Implementation of ECDSA
The white-box model assumes that the attacker has total access to the executable: he can read and modify it at will. He also has access to all the memory used during execution, so a white-box designer does not only have to protect his implementation against grey-box attacks but also against an adversary who can dump the memory and search for sensitive values such as $k$ or $d$. The first technique to prevent secret data from appearing in plain was introduced by Chow et al. in [CEJv03]. Their idea is to embed the key into the algorithm and to perform each operation with the help of look-up tables protected by carefully crafted encodings. Informally, the algorithm is split into low-level operations, and each operation $\mathrm{op}$ is replaced by $f'^{-1} \circ \mathrm{op} \circ f$, where $f$ and $f'$ are bijections called respectively input and output encodings. The drawback of this technique is that the required memory drastically increases with the algorithm's complexity. Using it to secure operations as complex as scalar multiplications or inversions while remaining efficient is thus a real challenge.
Another challenge in white-box cryptography is the impossibility of relying on any external source of randomness. An attacker could simply disable such a source and fix its output to a constant value. For example, in the context of AES, this renders some countermeasures against side-channel or fault attacks based on randomization techniques completely inefficient. When one considers ECDSA signatures, disabling the source of randomness yields multiple uses of the same nonce and, thus, easy recovery of the private key, as seen in Sect. 3.1. The solution is to compute k as a function of the only source of randomness available, the input message: k = f (m). In order to maintain the security of the signature scheme, this mapping must be computationally indistinguishable from what a randomly and uniformly chosen function would return. We will see in the next section that many challenges of the WhibOx competition did not fulfill this requirement.
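As an illustration of a nonce derivation $k = f(e)$ whose outputs look uniformly random, the sketch below uses HMAC-SHA256 with rejection sampling. This is an assumption on our side and not a construction taken from the contest submissions: the function name `derive_nonce` and the parameter `prf_key` are hypothetical, and protecting `prf_key` inside a white-box implementation is a separate problem that this sketch does not address.

```python
import hashlib
import hmac

# Order n of the NIST P-256 base point (public curve parameter).
N = 0xFFFFFFFF00000000FFFFFFFFFFFFFFFFBCE6FAADA7179E84F3B9CAC2FC632551

def derive_nonce(e: bytes, prf_key: bytes) -> int:
    """Derive k in [1, n-1] from the digest e, using HMAC-SHA256 as a PRF stand-in."""
    counter = 0
    while True:
        msg = e + counter.to_bytes(4, "big")
        mac = hmac.new(prf_key, msg, hashlib.sha256).digest()
        k = int.from_bytes(mac, "big")
        if 1 <= k < N:  # rejection sampling avoids a modular bias
            return k
        counter += 1
```

The same digest always yields the same nonce, which keeps the signature deterministic, while distinct digests yield unrelated nonces as long as the PRF key stays secret.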

Breaking the Challenges
White-box implementations usually rely on encodings and other theoretically sound approaches to protect the secret values and their manipulations. It is also very often the case that code obfuscation techniques are used to make understanding the design a time-consuming and challenging task. The extensive use of such obfuscation techniques in the submitted source files makes reverse-engineering each challenge independently prohibitively time-consuming. We thus focused on designing attack methods that could be efficient and easily automated.
This section looks at the different attacks that can be automated in a white-box context and gives the rationale for using and discarding them. We then present the results of applying the selected methods to the whole set of submissions.

Hooking Shared Libraries
The contest rules were a clear incentive for developers to use the GMP library for big number arithmetic operations. A first attempt to break the submitted challenges was then to check whether sensitive values were manipulated in clear by the GMP library. In order to perform this automatically, our approach has been to hook the calls to GMP functions thanks to the so-called LD_PRELOAD trick.
Pre-loading is a feature of the dynamic linker on UNIX systems that allows loading a specific shared library before all other libraries linked to a given executable binary. In our specific case, we built a shared library defining the same functions as the GMP library (e.g. mpz_mul, mpz_mod or mpz_invert). Each of these functions simply updates a log of the given parameters before calling the real GMP function, explicitly using the dynamic linker (thanks to the <dlfcn.h> module) to ensure the correct execution of the white-box implementation. It is then only necessary to add our shared library to the LD_PRELOAD environment variable of the dynamic linker on our system before calling the ECDSA binary to have our custom functions called in place of the genuine GMP ones. The corresponding log is analysed in a second step and reveals the secret key if $d$, $k$ or related values such as $r \cdot d$ or $e + r \cdot d$ are found in it. Such an approach allowed us to break 32% of the challenges.
As a side note, this technique also jeopardizes implementations relying on system-dependent random generators such as srand or mpz_XrandomX functions, or on other sources such as time.

Biased Nonces
As explained in Sect. 3.2, white-box designers usually generate the nonce $k$ from the input. In the case of the WhibOx contest, the nonce is thus computed as a function of the hash, i.e. $k = f(e)$. However, if the function $f$ is not carefully selected, it could happen that the $k_i$'s generated from different $e_i$'s are not uniformly random.
In the worst case, we have collisions such that different hash values $e_0$ and $e_1$ produce the same nonce ($k_0 = k_1$). If such a collision occurs, one can recover the private key $d$ as explained in Sect. 3.1. Furthermore, collisions can be efficiently detected by looking at the $r$ part of the signature. To efficiently browse a subset of hash values in search of such collisions, we limited ourselves to hash values with a Hamming weight equal to 1 or 2. We thus considered 32,896 hash values and were able to break 60% of the challenges with this technique.
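The collision search can be sketched as follows. The enumeration reproduces the count of 32,896 digests ($256 + \binom{256}{2}$); the `sign` argument is a hypothetical callable that runs a challenge on a digest and returns $(r, s)$, which is an assumption of this sketch rather than part of the contest tooling.

```python
from collections import defaultdict
from itertools import combinations

def low_weight_digests(bits: int = 256):
    """Yield every `bits`-bit integer with Hamming weight 1 or 2."""
    for i in range(bits):
        yield 1 << i
    for i, j in combinations(range(bits), 2):
        yield (1 << i) | (1 << j)

def find_r_collisions(sign, digests):
    """Group digests by r; a group of two or more digests reveals a reused nonce."""
    by_r = defaultdict(list)
    for e in digests:
        r, _s = sign(e)
        by_r[r].append(e)
    return [group for group in by_r.values() if len(group) > 1]
```

Any returned group provides two signatures sharing the same nonce, from which the key follows as explained in Sect. 3.1.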
In those cases where we did not find any collision, we looked for biases in the nonce generation. We used well-known lattice attacks derived from [NS03] and [FGR13] to exploit such a potential weakness. Such attacks can recover an ECDSA private key only with the knowledge of a few bits of the ephemeral keys of several signatures.
A concrete example showing why such techniques can succeed in our context consists in considering $f = \mathrm{Id}$. Then $k_i = e_i$, and by providing $e_i$ ranging from 0 to 99 we obtained 100 signatures for which the 249 most-significant bits of the nonces are 0. This bias is more than enough for a lattice attack to recover the private key $d$.
Lattice-based attacks can also be applied when the ephemeral key is the product of a small random $\kappa$ by another (large) constant scalar $t$. Such a design allows the scalar multiplication to be performed efficiently as $R = [\kappa]T = [k]G$, with $T = [t]G$ a precomputed value. The point is that the small size of $\kappa$ reduces the cost of the scalar multiplication.
To sum up, the relations we used for our lattice attacks are the following (with $e_i$ ranging from 0 to 999):
• assuming $l = 6$ known most- or least-significant bits of the ephemeral key, with $L = 256 - l$ for the MSB case and $L = l$ for the LSB case (we considered both cases where the known value is 0 or $63 = 2^6 - 1$),
• assuming the ephemeral keys are $k_i = t\kappa_i$, with $\kappa_i < 2^{248}$ and $t$ an unknown constant scalar.
Such an approach allowed us to break 72% of the challenges.

DCA
In 2016, Bos et al. showed that although firstly described for the grey-box context, the well-known side-channel attacks could be very well adapted to the white-box model. The resulting attack [BHMT16] is called Differential Computational Analysis (DCA). The principle is very similar to classical side-channel attacks: secret values are extracted from leakage traces obtained during several executions of a cryptographic algorithm with the help of statistical tools. The only difference relies upon the nature of the traces. Whereas in the grey-box context, one can record the power consumption of the device in which the algorithm is implemented, a white-box attacker can simply use software execution traces. By instrumenting the binary, he can record completely noiseless traces of all accessed addresses and data over time, leading to much more efficient attacks.
In theory, this attack is particularly devastating since it can be fully automated and does not require any earlier reverse engineering step. In practice, it is quite difficult to apply because of the size of the traces, in particular for cryptosystems such as ECDSA that have a relatively long execution time. Indeed, if the whole white-box execution were to be recorded, each trace would easily reach several gigabytes. For instance, tracing $n$ 64-bit registers on a 3 GHz machine during 3 seconds would lead to a single trace of $9 \times 8 \times n = 72n$ GB ($9 \times 10^9$ clock cycles, 8 bytes per register and per cycle). Therefore, iterating over dozens of traces for a CPA would be overwhelming in time and memory. A time-consuming reverse engineering step is thus required to select a smaller window of the implementation before the attack, which is why we did not use this technique to break the challenges of the WhibOx contest.
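The order of magnitude quoted above can be reproduced with a one-line computation. The sketch assumes one sample per register and per clock cycle, which is our reading of the estimate; the function and parameter names are ours.

```python
def trace_size_bytes(n_registers: int, clock_hz: float = 3e9,
                     seconds: float = 3.0, bytes_per_register: int = 8) -> float:
    """Size of one full trace, assuming one sample per register per clock cycle."""
    return n_registers * bytes_per_register * clock_hz * seconds

# A single 64-bit register traced for 3 s at 3 GHz already costs 72e9 bytes (72 GB).
ONE_REGISTER_GB = trace_size_bytes(1) / 1e9
```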

Fault Injections
Another attack method is to disturb the algorithm execution and exploit the resulting faulty output. In the white-box context, faults can be easily induced since the attacker can modify the binary or use debugging tools to stop the execution and, for example, skip an instruction or modify the value of a particular register. Again, this attack can be automated and does not require an earlier reverse engineering step.
All the fault attacks that can be performed in the grey-box context are obviously also a potential threat in the white-box context. In the case of ECDSA, different faults can be induced on different variables to give an exploitable result. The most obvious attack is to force the use of a weak elliptic curve during the scalar multiplication by disturbing the curve parameters [BMM00] in order to solve the discrete logarithm problem easily. The attacker can also force the use of biased nonces, for instance, by sticking a 32-bit word of k at zero during several executions. The corresponding signatures can then be used to obtain information on the key using lattice-based algorithms. Finally, modifying one byte of d during the computation of rd may allow one to recover information on the key, as shown in [GK04].
In addition, the white-box model offers new possibilities [PSS+18, ABF+18, DGH21]. They arise from the fact that deterministic versions of the scheme have to be implemented due to the impossibility of relying on a source of randomness in this context. When the algorithm is used twice on the same message, the same nonce $k$ is derived. The attacker may thus obtain a correct signature for a given digest $e$, and an erroneous one by modifying a second execution of the same signature. To break the challenges of the WhibOx contest, we mainly disturbed the computation of the first part of the signature $r$, obtaining faulty results $\tilde{r}$ and $\tilde{s} = k^{-1}(e + \tilde{r}d) \bmod n$. Some secret information can be deduced from the correct and faulty signatures:
$$(r - \tilde{r})\,(s - \tilde{s})^{-1} = k\,d^{-1} \bmod n.$$
Let $\alpha = kd^{-1} \bmod n$. The adversary can then compute the private key:
$$d = e\,(s\alpha - r)^{-1} \bmod n.$$
It is also possible to disturb other variables, but still, the faulty value must be known to exploit the result. Interestingly, when one modifies the first part of the signature, if no countermeasure is implemented, the faulty value is just given to the attacker as part of the output. Furthermore, the attack surface is huge: the fault may happen anywhere during the scalar multiplication. This is why we considered only this perturbation in the context of this competition. This approach is the most successful one, allowing us to break 75% of the challenges.
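The key recovery from one correct and one faulty signature on the same digest can be sketched as below. It relies on the two relations $\alpha = kd^{-1} = (r - \tilde{r})(s - \tilde{s})^{-1} \bmod n$ and $d = e(s\alpha - r)^{-1} \bmod n$, which follow from $s = k^{-1}(e + rd)$ and $\tilde{s} = k^{-1}(e + \tilde{r}d)$. The `simulate` helper is a hypothetical stand-in for the faulted challenge and takes $r$ and $\tilde{r}$ as plain inputs instead of computing them on the curve.

```python
# Order n of the NIST P-256 base point (public curve parameter).
N = 0xFFFFFFFF00000000FFFFFFFFFFFFFFFFBCE6FAADA7179E84F3B9CAC2FC632551

def simulate(d: int, k: int, e: int, r: int, r_f: int):
    """Stand-in for the challenge: return the correct s and the faulty s_f."""
    k_inv = pow(k, -1, N)
    return k_inv * (e + r * d) % N, k_inv * (e + r_f * d) % N

def recover_key_from_fault(e: int, r: int, s: int, r_f: int, s_f: int) -> int:
    """Recover d from a correct (r, s) and a faulty (r_f, s_f) signature on digest e."""
    # s - s_f = k^{-1} d (r - r_f)  =>  alpha = k d^{-1}
    alpha = (r - r_f) * pow((s - s_f) % N, -1, N) % N
    # s k = e + r d and k = alpha d  =>  d (s alpha - r) = e
    return e * pow((s * alpha - r) % N, -1, N) % N
```

The recovery requires $e \neq 0$ and $\tilde{r} \neq r$, so that both inverses exist.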

Attacks Results
When applying the various attack methods described above, we obtain the results presented in Table 1. We observe that lattice and fault attacks are very efficient. Collision attacks also give good results. We give in Appendix A the specific vulnerabilities of each of the 97 submitted challenges as well as the corresponding private key.
However, we noticed that many challenges had a low level of security, some of which were even plain implementations. We thus excluded 30 challenges where the nonce and/or the private key were manipulated in plain. Table 2 illustrates the efficiency of the attacks presented in Sect. 4.1 on the remaining 67 challenges. We observe that hooking gives no significant result anymore, collision and lattice attacks become less efficient, and fault injection seems the most powerful attack. Among the 67 strongest challenges, Challenges 226 and 227 are the winning ones. In the next section, we present the design of these two white-box implementations.

Design of the Winning Challenges
In this section, we describe the designs of the two winning challenges of the WhibOx contest: Challenges 226 and 227. The designs of both challenges were inspired from the white-box implicit framework [RVP22], which allows encoding the whole state with large affine permutations efficiently. We implemented both challenges with the same methodology; they only differ in some additional countermeasures used.
As mentioned in Sect. 2, in the WhibOx contest, a challenge gains strawberries quadratically with time before being broken. The rule is that challenges that are either smaller, faster, or less memory-consuming gain strawberries faster. As a result, we strategically posted two challenges with different trade-offs between security level and implementation cost. Challenge 227, our lightweight variant, was the winning implementation of the contest, obtaining the highest number of strawberries (20.39). On the other hand, Challenge 226, our hardened but heavier variant, achieved second place in the contest with the second-highest number of strawberries (11.19). However, it stood unbroken for the longest time (35 hours).
Note that these challenges were specifically built for the WhibOx contest, where attackers did not know the design. Against an attacker who knows the design details, these challenges are easy to break once reverse-engineered.
This section first introduces the implicit framework, then describes the shared design approach of both challenges, and finally explains the additional countermeasures used in each challenge. For access to the underlying software used to build these challenges, please contact the authors from the team zerokey.

Implicit White-box Implementations
The implicit framework is a method to obtain a white-box implementation of a block cipher. Its main idea is to represent the round functions of the cipher by implicit functions of low degree and to protect these implicit functions with large affine encodings. Before introducing implicit white-box implementations, we need to introduce the notions of encoding, encoded implementation, and quasilinear implicit functions. While these notions are originally defined in [RVP22] for vectorial functions over the binary field, we extend these notions for an arbitrary finite field.
Let $\mathbb{F}_q$ be the finite field with $q$ elements. A vectorial function $F$ from the vector space $(\mathbb{F}_q)^l$ to $(\mathbb{F}_q)^{l'}$ is called an $(l, l')$ function over $\mathbb{F}_q$, and its $l'$ component functions are denoted by $(F_1, F_2, \ldots, F_{l'})$. The degree of an $(l, l')$ function $F$ denotes the maximum polynomial degree of the $l'$ multivariate polynomials uniquely representing the component functions of $F$.
Definition 1. Let $F$ be an $(l, l')$ function over $\mathbb{F}_q$, $A$ be an $(l, l)$ permutation over $\mathbb{F}_q$ and $B$ be an $(l', l')$ permutation over $\mathbb{F}_q$. The function $\overline{F} = B \circ F \circ A$ is called an encoded function of $F$, and $A$ and $B$ are called the input and output encodings respectively.
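Definition 2. Let $F = F^{(t)} \circ \cdots \circ F^{(1)}$ be an iterated function over $\mathbb{F}_q$. An encoded implementation of $F$ is a composition of encoded round functions
$$\overline{F} = \overline{F}^{(t)} \circ \cdots \circ \overline{F}^{(1)}, \qquad \overline{F}^{(i)} = B^{(i)} \circ F^{(i)} \circ A^{(i)},$$
where the input and output encodings $(A^{(i)}, B^{(i)})$ are permutations over $\mathbb{F}_q$ such that $A^{(i+1)} = (B^{(i)})^{-1}$ for $1 \le i < t$. The first and last encodings $(A^{(1)}, B^{(t)})$ are called the external encodings.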
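Definition 3. Let $F$ be an $(l, l')$ function over $\mathbb{F}_q$. An $(l + l', l'')$ function $T$ is called an implicit function of $F$ if it satisfies
$$T(u_1, \ldots, u_l, v_1, \ldots, v_{l'}) = 0 \iff F(u_1, \ldots, u_l) = (v_1, \ldots, v_{l'}).$$
In this case, $T$ is said to be quasilinear if for any $(u_1, u_2, \ldots, u_l) \in (\mathbb{F}_q)^l$, the function
$$(v_1, \ldots, v_{l'}) \mapsto T(u_1, \ldots, u_l, v_1, \ldots, v_{l'})$$
is affine.
The following lemma from [RVP22] describes how the composition of affine permutations translates to implicit functions.
Lemma 1. Let $F$ be an $(l, l')$ function over $\mathbb{F}_q$ and $T$ be a quasilinear implicit $(l + l', l'')$ function of $F$. Let $A$ be an affine $(l, l)$ permutation over $\mathbb{F}_q$, $B$ be an affine $(l', l')$ permutation over $\mathbb{F}_q$, and $M$ be a linear $(l'', l'')$ permutation over $\mathbb{F}_q$. Then, $T' = M \circ T \circ (A, B^{-1})$ is a quasilinear implicit function of $\overline{F} = B \circ F \circ A$.
The quasilinear property allows the implicit evaluation of $F$ in a point $(u_1, u_2, \ldots, u_l)$ by solving the affine system $T(u_1, u_2, \ldots, u_l, v_1, v_2, \ldots, v_{l'}) = 0$ for the variables $v_1, v_2, \ldots, v_{l'}$. We are ready to present the definition of an implicit implementation.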
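A one-dimensional toy example may help to see implicit evaluation at work. The sketch below takes $F(u) = u^{-1}$ over a small prime field, whose implicit function $T(u, v) = uv - 1$ is affine in $v$, and composes it with affine encodings as $M \circ T \circ (A, B^{-1})$, which yields an implicit function of $B \circ F \circ A$. The field size and all coefficients are arbitrary assumptions chosen for illustration.

```python
Q = 101  # toy prime field size (assumption)

def inv(x: int) -> int:
    return pow(x % Q, -1, Q)

def T(u: int, v: int) -> int:
    """Implicit function of F(u) = u^{-1}: T(u, v) = 0 iff v = u^{-1}."""
    return (u * v - 1) % Q

# Toy encodings (assumptions): A(u) = 7u + 3, B(v) = 5v + 11, M(w) = 9w.
def A(u: int) -> int:
    return (7 * u + 3) % Q

def B(v: int) -> int:
    return (5 * v + 11) % Q

def B_inv(v: int) -> int:
    return (v - 11) * inv(5) % Q

def T_enc(u: int, v: int) -> int:
    """Encoded implicit function M o T o (A, B^{-1})."""
    return 9 * T(A(u), B_inv(v)) % Q

def implicit_eval(u: int) -> int:
    """Solve the affine equation T_enc(u, v) = alpha * v + beta = 0 for v."""
    beta = T_enc(u, 0)
    alpha = (T_enc(u, 1) - beta) % Q
    return -beta * inv(alpha) % Q  # fails when A(u) = 0, where F is undefined
```

The evaluator only ever sees `T_enc`; the maps `A`, `B` and the function `F` are never applied separately.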

White-boxing ECDSA Signature Algorithm Using the Implicit Framework
In the WhibOx contest, designers submitted white-box implementations of the ECDSA signature algorithm on the NIST P-256 curve. As opposed to the standard ECDSA algorithm (cf. Algorithm 1), the algorithm for the WhibOx contest (hereafter denoted by $E$) takes as input the 256-bit message digest. The private key is not an input of the algorithm; it is freely chosen by the designer, but it is fixed (hard-coded) in the implementation. Algorithm 2 depicts a high-level overview of this deterministic variant of ECDSA, where the deterministic nonce derivation mechanism is chosen freely by the designer.

[Algorithm 2: ECDSA signature with a deterministic nonce derived from the digest $e$; the computation goes back to step 2 if $r = 0$ or $s = 0$ (step 7) and otherwise returns $(r, s)$ (step 9).]

The main steps of $E$ can be represented by the functions $E^{(1)}$ and $E^{(2)}$. The $\mathbb{F}_p$-function $E^{(1)}$ is given by
$$E^{(1)}(e) = (R_x, k, e),$$
which takes as input $e \in \mathbb{F}_p$ and computes the scalar multiplication $R = [k]G$ over $\mathbb{F}_p$. On the other hand, the $\mathbb{F}_n$-function $E^{(2)}$ can be written as
$$E^{(2)}(R'_x, k', e') = (r, s) = \big(R'_x,\; k'^{-1}(e' + rd)\big),$$
which takes as input $(R'_x, k', e') = (R_x \bmod n, k \bmod n, e \bmod n)$ and computes $(r, s)$ over $\mathbb{F}_n$. Inspired by the implicit framework, we built the white-box implementations of Challenges 226 and 227 by encoding $E^{(1)}$ and $E^{(2)}$ with affine permutations and obtaining low-degree implicit round functions of $\overline{E}^{(1)}$ and $\overline{E}^{(2)}$, the encoded functions of $E^{(1)}$ and $E^{(2)}$. We will first describe the implicit implementation of $E^{(1)}$ and then that of $E^{(2)}$.

White-boxing the Scalar Multiplication
To build an implicit implementation of $E^{(1)}$, we first need to decompose $E^{(1)}$ as the composition of $\mathbb{F}_p$-functions that we call round functions. Then we explain how to encode these round functions and how to obtain low-degree quasilinear implicit functions of the encoded round functions.
Decomposing $E^{(1)}$ into round functions. The function $E^{(1)}(e) = (R_x, k, e)$ mainly consists of the scalar multiplication $R = [k]G$ of the nonce $k$ and the point $G$. For the scalar multiplication, we perform the following subroutine. First, we precompute and store a list of $t$ random point pairs on the curve, i.e., $(G_{i,0}, G_{i,1})$ for $1 \le i \le t$, together with their logarithms $k_{i,j}$ such that $G_{i,j} = [k_{i,j}]G$. Then, for each pair we select one of the two points together with its logarithm, denoted as $(G_{i,b_i}, k_{i,b_i})$, where $b_i \in \{0, 1\}$ and $1 \le i \le t$. We add the selected points and the selected logarithms, obtaining the scalar multiplication
$$R = G_{1,b_1} + \cdots + G_{t,b_t} = [k]G,$$
where $k = k_{1,b_1} + \cdots + k_{t,b_t}$. This selection is done in a deterministic way depending on the bits $(e_1, e_2, \ldots, e_{256})$ of the hash $e$, the only source of entropy in the algorithm. Moreover, the selection is done with $\mathbb{F}_p$-arithmetic operations rather than with conditional instructions, so that each iteration only performs $\mathbb{F}_p$ operations. The subroutine is given in Algorithm 3.
It is worth pointing out that the values $k_{i,j}$ are chosen such that the sum of $\max(k_{i,0}, k_{i,1})$ for all $i$ is always smaller than $n$. That is, we have $k < n$. Hence, $r$ and $s$ are never 0. In this way, we avoid the trivial case, i.e., avoid going to Step 7 in Algorithm 2.
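The branch-free selection of the logarithms can be sketched on integers as follows. Writing the choice as $k_{i,0} + b_i(k_{i,1} - k_{i,0})$ is one natural way to select with arithmetic only and is an assumption of this sketch, as are the helper names and the bound $n / t$ used to guarantee $k < n$; the actual Algorithm 3 operates on point coordinates over $\mathbb{F}_p$ as well.

```python
import random

# Order n of the NIST P-256 base point (public curve parameter).
N = 0xFFFFFFFF00000000FFFFFFFFFFFFFFFFBCE6FAADA7179E84F3B9CAC2FC632551

def make_pairs(t: int, rng: random.Random):
    """Draw t pairs (k_{i,0}, k_{i,1}) below n // t, so any selection sums to k < n."""
    bound = N // t
    return [(rng.randrange(1, bound), rng.randrange(1, bound)) for _ in range(t)]

def select_and_add(bits, pairs) -> int:
    """k = sum_i k_{i,b_i}, with the choice written as arithmetic instead of a branch."""
    k = 0
    for b, (k0, k1) in zip(bits, pairs):
        k += k0 + b * (k1 - k0)  # equals k0 when b = 0 and k1 when b = 1
    return k
```

Because every $k_{i,j}$ is below $n / t$, the resulting nonce never wraps around modulo $n$.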
By considering the precomputed points $G_{i,j}$ and their logarithms $k_{i,j}$ as fixed values and by representing the elliptic curve additions by operations over $\mathbb{F}_p$, we can represent $E^{(1)}$ given by Algorithm 3 as an iterated function over $\mathbb{F}_p$, that is,
$$E^{(1)} = F^{(t)} \circ F^{(t-1)} \circ \cdots \circ F^{(1)},$$
where each $F^{(i)}$ is a $(4, 4)$ round function over $\mathbb{F}_p$. Note that the input value of $F^{(1)}$ is $(0, 0, 0, 0)$, and each round function $F^{(i)}$ takes the hash bit $e_i$ as an additional input value.
Encoding the round functions. To protect the round functions, we encode each round with random $\mathbb{F}_p$-affine permutations $A^{(i)}$, obtaining the encoded round functions
$$\overline{F}^{(i)} = A^{(i)} \circ F^{(i)} \circ (A^{(i-1)})^{-1}.$$
In other words, the input and output encodings of $F^{(i)}$ are $\big((A^{(i-1)})^{-1}, A^{(i)}\big)$, and the composition of the round functions cancels all intermediate encodings except $(A^{(0)})^{-1}$ and $A^{(t)}$, that is,
$$\overline{E}^{(1)} = \overline{F}^{(t)} \circ \cdots \circ \overline{F}^{(1)} = A^{(t)} \circ F^{(t)} \circ \cdots \circ F^{(1)} \circ (A^{(0)})^{-1},$$
where $t$ is the number of rounds. The input encoding $(A^{(0)})^{-1}$ of $F^{(1)}$ is set as the identity mapping to preserve the input-output behaviour of $E$.
Obtaining the implicit round functions. Now we proceed to obtain an implicit round function $T^{(i)}$ of each encoded round function $\overline{F}^{(i)}$. To this end, we first show how to derive an implicit function of the elliptic curve addition. Let $\mathrm{ADD}(P_x, P_y, Q_x, Q_y) = (R_x, R_y)$ be the vectorial $\mathbb{F}_p$-function denoting the elliptic curve addition $P + Q = R$ where $P$ and $Q$ are not the point at infinity and where $P$ and $Q$ have different x-coordinates. In this case, $R$ can be written as [KL14]
$$R_x = \lambda^2 - P_x - Q_x, \qquad R_y = \lambda (P_x - R_x) - P_y, \qquad \lambda = \frac{Q_y - P_y}{Q_x - P_x}. \tag{14}$$
From Eq. (14) it is easy to see that $P + Q = R$ holds if and only if the relations
$$(R_x + P_x + Q_x)(Q_x - P_x)^2 - (Q_y - P_y)^2 = 0, \qquad (R_y + P_y)(Q_x - P_x) - (Q_y - P_y)(P_x - R_x) = 0$$
hold. Note that these relations have degree 3 (degree 1 over the variables $R_x$ and $R_y$), while Eq. (14) has a high degree due to the inversion over $\mathbb{F}_p$. Thus, the function $\mathrm{IMP}(P_x, P_y, Q_x, Q_y, R_x, R_y) = (\mathrm{IMP}_0, \mathrm{IMP}_1)$ defined by
$$\mathrm{IMP}_0 = (R_x + P_x + Q_x)(Q_x - P_x)^2 - (Q_y - P_y)^2, \qquad \mathrm{IMP}_1 = (R_y + P_y)(Q_x - P_x) - (Q_y - P_y)(P_x - R_x)$$
is a quasilinear implicit round function of ADD with degree 3, assuming none of the points is the point at infinity and assuming the x-coordinates of the points are different.
From the above implicit function of the elliptic curve addition, it is easy to derive a quasilinear implicit function $\widehat{T}^{(i)}$ of each round function $F^{(i)}$. Then, we sample a linear permutation $M^{(i)}$ for each round $i$, and by Lemma 1 the function
$$T^{(i)} = M^{(i)} \circ \widehat{T}^{(i)} \circ \big((A^{(i-1)})^{-1}, (A^{(i)})^{-1}\big) \tag{18}$$
is a quasilinear implicit function of $\overline{F}^{(i)}$ for $1 \le i \le t$. The white-box implementations of Challenges 226 and 227 contain this implicit implementation of $E^{(1)}$, with underlying encoded implementation $\overline{E}^{(1)}$, given by the $t$ implicit round functions $\{T^{(1)}, \ldots, T^{(t)}\}$ in Eq. (18). Moreover, $\overline{E}^{(1)}$ is evaluated in our white-box implementations by implicitly evaluating the encoded round functions $\overline{F}^{(i)}$. In other words, given the output $u$ of the round $i - 1$, the output $v$ of the $i$th round is computed by finding the solution of the affine system $T^{(i)}(u; v) = 0$ for $v$.
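The implicit description of the elliptic curve addition can be checked on a toy curve. The sketch below uses one possible pair of relations that are affine in $(R_x, R_y)$, namely $(R_x + P_x + Q_x)(Q_x - P_x)^2 - (Q_y - P_y)^2 = 0$ and $(R_y + P_y)(Q_x - P_x) - (Q_y - P_y)(P_x - R_x) = 0$, and compares solving them with the explicit chord formula. The curve $y^2 = x^3 + 2x + 3$ over $\mathbb{F}_{97}$ and all function names are assumptions made for illustration, and no encoding is applied.

```python
P_MOD, A_COEF, B_COEF = 97, 2, 3  # toy curve y^2 = x^3 + 2x + 3 over F_97 (assumption)

def inv(x: int) -> int:
    return pow(x % P_MOD, -1, P_MOD)

def add(P, Q):
    """Explicit affine addition for points with different x-coordinates."""
    (px, py), (qx, qy) = P, Q
    lam = (qy - py) * inv(qx - px) % P_MOD
    rx = (lam * lam - px - qx) % P_MOD
    ry = (lam * (px - rx) - py) % P_MOD
    return rx, ry

def imp(P, Q, R):
    """Degree-3 relations, affine in (R_x, R_y), vanishing iff R = P + Q."""
    (px, py), (qx, qy), (rx, ry) = P, Q, R
    dx, dy = qx - px, qy - py
    imp0 = ((rx + px + qx) * dx * dx - dy * dy) % P_MOD
    imp1 = ((ry + py) * dx - dy * (px - rx)) % P_MOD
    return imp0, imp1

def implicit_add(P, Q):
    """Evaluate the addition implicitly: solve the affine system imp(P, Q, R) = 0 for R."""
    (px, py), (qx, qy) = P, Q
    dx, dy = qx - px, qy - py
    rx = (dy * dy - (px + qx) * dx * dx) * inv(dx * dx) % P_MOD
    ry = (dy * (px - rx) - py * dx) * inv(dx) % P_MOD
    return rx, ry

# All affine points of the toy curve, found by exhaustive search.
POINTS = [(x, y) for x in range(P_MOD) for y in range(P_MOD)
          if (y * y - (x ** 3 + A_COEF * x + B_COEF)) % P_MOD == 0]
```

In the encoded setting, the coefficients of the affine system are hidden behind the input and output encodings, so the solver no longer sees the curve points themselves.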

White-boxing the Computation of s
Now we turn our attention to $E^{(2)}$, the second step of the signing algorithm, where we compute $r = R_x \bmod n$ and $s = k^{-1}(e + dR_x) \bmod n$, and output the signature $(r, s)$. As opposed to $E^{(1)}$, we do not decompose $E^{(2)}$ but build a single (vectorial) quasilinear implicit function of $\overline{E}^{(2)} = E^{(2)} \circ (A^{(t)})^{-1}$, the encoded version of $E^{(2)}$.
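The vectorial $\mathbb{F}_n$-function $T^{(t+1)}$ defined as

$$T^{(t+1)}(R_x, R_y, k, e; s, r) = \big(ks - e - dR_x,\ r - R_x\big)$$

is a quasilinear implicit function of $E^{(2)}$. In other words, the polynomial system

$$r = R_x, \qquad s = k^{-1}(e + dR_x)$$

holds over $\mathbb{F}_n$ if and only if $T^{(t+1)}(R_x, R_y, k, e; s, r) = 0$. Moreover, the system is affine in $r$ and $s$, so after plugging in values for $R_x$, $R_y$, $k$ and $e$, the system can be solved for $r$ and $s$ efficiently.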
The encoded version $\overline{E}^{(2)}$ gets as input $u = A^{(t)}(R_x, R_y, k, e)$, where $A^{(t)}$ is the affine function that protects the last round of $E^{(1)}$. By Lemma 1, we build the implicit round function of $\overline{E}^{(2)}$ as

$$\overline{T}^{(t+1)}(u; s, r) = M \circ T^{(t+1)}\big((A^{(t)})^{-1}(u);\ s, r\big),$$

where $(A^{(t)})^{-1}$ is the inverse of $A^{(t)}$ mod $n$, and where $M$ is a random invertible 2-by-2 matrix mod $n$. The function $\overline{T}^{(t+1)}$ is quasilinear, and we can implicitly evaluate $\overline{E}^{(2)}$ on input $u = A^{(t)}(R_x, R_y, k, e)$ by plugging $u$ in the first slot of $\overline{T}^{(t+1)}$ and solving the remaining system (which is affine) for $r$ and $s$ over $\mathbb{F}_n$.

However, the fact that $E^{(1)}$ works in $\mathbb{F}_p$ while $E^{(2)}$ works in $\mathbb{F}_n$ causes a problem. The input to $\overline{E}^{(2)}$ is $u = A^{(t)}(R_x, R_y, k, e)$ reduced mod $p$, so $(A^{(t)})^{-1}(u)$ is in general not equal to $(R_x, R_y, k, e)$ mod $n$ if there are overflows in the computation of $u$. Let $o$ be the vector of overflows mod $p$, such that

$$u + p\,o = L_t(R_x, R_y, k, e) + c \quad \text{over the integers},$$

where $L_t$ is the linear part of the affine map $A^{(t)}$ (i.e., $A^{(t)}(x) = L_t(x) + c$ for some constant term $c$).
To deal with this problem, we correct for the overflow mod $p$ by guessing the overflow vector $o$ and replacing $u$ with $u + p\,o$ before plugging it into $\overline{T}^{(t+1)}(u; s, r)$ to solve for $(r, s)$. If the guess is correct, then the corrected $u$ is equal to $A^{(t)}(R_x, R_y, k, e)$ over the integers, so the correct $(r, s)$ will be recovered. Therefore, we repeatedly run the last step with different guesses of $o$, each of which yields a candidate signature $(r, s)$. We run the verification algorithm on each candidate and output the first $(r, s)$ for which verification succeeds. Note that we do not need to protect the verification algorithm because it does not use secret information.
If $A^{(t)}$ were a random affine map with entries of size up to $p$, then guessing $o$ correctly would be very unlikely. Therefore, we choose the affine map $A^{(t)}$ with small entries. For example, we could use With this choice, the weight of each row is four, so there are at most four overflows mod $p$ in each entry of $u$, which means $o$ can be guessed more easily. Not all guesses are equally likely (e.g., $o = [4, 4, 4, 4]$ only occurs if $R_x$, $R_y$, $k$, $e$ are all quite big, which is unlikely). Rather than inefficiently guessing $o \in [0, 4]^4$ at random, we precompute a list of guesses $L$ ordered from more likely to be correct to less likely, and we iterate through the list of guesses in that order.
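The overflow correction can be sketched as follows. Everything here is an assumption made for illustration: the toy moduli, the upper unitriangular linear part `L_T` (chosen so that it can be inverted by back-substitution), the constant `C`, and the ordering of the guesses by total overflow. The `accept` callback stands in for the ECDSA verification of the candidate signature; the sketch decodes the state directly instead of solving the implicit system.

```python
from itertools import product

p, n = 10007, 9973  # toy moduli standing in for the field prime and the group order

# Hypothetical small-entry linear part (rows of weight at most four) and constant term.
L_T = [[1, 1, 1, 1],
       [0, 1, 1, 1],
       [0, 0, 1, 1],
       [0, 0, 0, 1]]
C = [5, 7, 11, 13]


def encode_mod_p(x):
    """u = A(x) reduced mod p, as produced by the last round of the first step."""
    return [(sum(a * b for a, b in zip(row, x)) + c) % p for row, c in zip(L_T, C)]


def decode_mod_n(u, o):
    """Lift u to the integers under overflow guess o, then invert A modulo n."""
    w = [ui + p * oi - ci for ui, oi, ci in zip(u, o, C)]
    x = [0] * 4
    for i in reversed(range(4)):  # back-substitution for the unitriangular L_T
        x[i] = (w[i] - sum(L_T[i][j] * x[j] for j in range(i + 1, 4))) % n
    return x


# Guesses with fewer overflows first (an assumed likelihood ordering).
GUESSES = sorted(product(range(5), repeat=4), key=sum)


def search(u, accept):
    """Return the first (o, x) whose decoded state passes the acceptance check."""
    for o in GUESSES:
        x = decode_mod_n(u, o)
        if accept(x):
            return o, x
    return None


x_true = [1234, 9000, 8000, 777]
u = encode_mod_p(x_true)
found = search(u, lambda x: x == x_true)
```

With these toy values the first two coordinates of `u` wrap around once, so the search stops at the guess `(1, 1, 0, 0)`.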
The white-box implementations of Challenges 226 and 227 contain the implicit function $\overline{T}^{(t+1)}$, which allows the implicit evaluation of $\overline{E}^{(2)}$, together with the correction for the overflow mod $p$ described above and summarized in Algorithm 4.
Note that the severe restriction on the size of the entries of $A^{(t)}$ makes the conversion from $\mathbb{F}_p$ to $\mathbb{F}_n$ one of the most vulnerable points in the white-box implementation. In particular, an attacker knowing the specifications of the design can easily recover $A^{(t)}$ by exhaustive search if no additional countermeasures are used.

Additional Countermeasures
The representation of the implicit round functions as systems of multivariate polynomials allows applying countermeasures from multivariate public-key cryptosystems. In fact, Challenges 227 and 226 only differ in the additional countermeasures used. In particular, we considered two techniques. First, we obfuscated the components (seen as polynomials) of the implicit round functions T (i) by multiplying them with random polynomials in the input variables. Note that the multiplication of input variables preserves the quasilinear property. Moreover, the image of a random polynomial is non-zero with high probability, and multiplying an equation with a non-zero value does not change its solution set. In the unlikely case that one of the added polynomials vanishes, the output of the corresponding implicit function will be invalid, and no valid signature will be obtained. To prevent this extreme case, we made the first implicit round function dependent on an initial value; if no valid signature is found, we simply repeated the whole process with a different initial value. This first technique increases the degree of the implicit round functions, significantly increasing the implementation size. Thus, for the lightweight Challenge 227 we only applied this technique to raise the degree of the components to the total degree of the functions, but for Challenge 226 we multiplied with polynomials of higher degree to increase the total degrees of the implicit round functions. The final degrees are listed in Tables 3 and 4.
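The effect of the first technique on a single quasilinear equation can be sketched as follows. The polynomials `a`, `b` and the multiplier `q` are made up for the example, as is the toy prime; the point is only that multiplying an equation $a(x)\,y + b(x) = 0$ by a polynomial $q(x)$ in the input variables keeps it affine in $y$ and keeps its solution whenever $q(x) \neq 0$.

```python
p = 10007  # toy prime (assumption)


def a(x):
    """Coefficient of the output variable y (hypothetical polynomial in the inputs)."""
    return (3 * x[0] * x[1] + 5) % p


def b(x):
    """Constant term of the equation (hypothetical polynomial in the inputs)."""
    return (x[0] ** 2 + 7 * x[1]) % p


def q(x):
    """Hypothetical random multiplier; it depends on the input variables only."""
    return (x[0] + 2 * x[1] + 9) % p


def solve_affine(coef, const):
    """Solve coef * y + const = 0 over F_p; fails if coef is zero."""
    return -const * pow(coef, -1, p) % p


x = (12, 34)
y_plain = solve_affine(a(x), b(x))
y_obfuscated = solve_affine(q(x) * a(x) % p, q(x) * b(x) % p)
```

If `q(x)` happened to vanish, the obfuscated equation would become `0 = 0` and `solve_affine` would raise an error, which corresponds to the unlikely failure case handled in the text by restarting with a different initial value.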
The second technique we used was adding additional variables and components to the implicit round functions but preserving the input-output behaviour of the underlying encoded round functions.
In particular, to avoid the bias in the most significant part of the nonce $k$ due to the constraint $k < n$ (see Section 5.2.1), we duplicated the nonce variable and its equations so that $E^{(1)}$ outputs an additional nonce variable $k'$ computed similarly to $k$, and $E^{(2)}$ uses the sum of the nonce variables $k + k'$ as the final nonce. On top of that, instead of $e$, the input $L(e)$ is given to $E^{(1)}$ for some hard-coded low-degree encoding $L$, and its inverse $L^{-1}$ is composed with $E^{(2)}$ to recover $e$. Note that this is a minor trick since the encoding $L$ is not merged or composed with other functions (as opposed to the other encodings $A^{(i)}$), and the computation of $L(e)$ is done in the clear.
Since adding additional variables and equations also introduces significant overhead in the implementation size, we only applied the second technique to Challenge 226. In particular, we added two variables and two equations in the implicit round functions of E (1) , and two variables and one equation in those of E (2) .
We also used Tigress [Col] for both challenges to obfuscate the C source code. Tigress is an obfuscator for the C language that protects programs against dynamic and static reverse engineering attacks. We used the transformations 5 Flatten (flattens the code to remove structured flow), AntiTaintAnalysis (disrupts tools that make use of dynamic taint analysis), AddOpaque (adds opaque predicates), EncodeLiterals (replaces integers and strings with run-time expressions) and CleanUp (renames variables and functions).

Description
Following the method described in Section 5.2, we built Challenge 227 (keen_ptolemy) as a lightweight white-box implementation. The only additional countermeasures from Section 5.3 included in Challenge 227 are the degree increase of each component to the total degree of the corresponding vectorial function and the code obfuscation by Tigress. Challenge 227 was the winning implementation of the WhibOx contest; it achieved the highest number of strawberries (20.39) and, having stood for 33 hours, was the second-longest-surviving challenge. Table 3 describes the memory complexity of Challenge 227 (after applying the additional countermeasures) by describing $\{T^{(1)}, \ldots, T^{(t)}\}$ and $T^{(t+1)}$, the implicit round functions of $E^{(1)}$ and $E^{(2)}$ respectively. The number of coefficients in Table 3 denotes the maximum number of non-zero coefficients of a quasilinear vectorial function with a given number of input variables, components, and degrees. If each coefficient is represented with 256 bits, $\{T^{(1)}, \ldots, T^{(t)}\}$ and $T^{(t+1)}$ require in total roughly 4 MB.
After obfuscating the code with Tigress, the size of the final C source code of Challenge 227 is 4.4 MB. On a modern personal laptop with the environment 6 provided by the competition, the size of the compiled binary is 4.42 MB, and the average running time and RAM consumed are 0.04 seconds and 6.14 MB respectively. The code obfuscation did not impact the running time but increased the binary size by 8% and the average RAM by 3%.

Security Analysis
Challenge 227 can be broken in several ways. Here, we explain how the attacks of Sect. 4 allow one to recover the secret key of Challenge 227 or why they do not work.

Hooking shared libraries.
During the implicit evaluation of $\overline{E}^{(2)}$ for the valid input $u = A^{(t)}(R_x, R_y, k, e)$, the affine system $\overline{T}^{(t+1)}(u; s, r)$ is solved for $r$ and $s$. By denoting the entries of $M$ as $M = \begin{pmatrix} m_0 & m_1 \\ m_2 & m_3 \end{pmatrix}$, it is easy to see that this affine system is given by the equations

where the coefficients $c_i$ are given by

We stress that $k$, $e$, and $R_x$ do not appear in the clear. They are expressed as linear combinations of the input $u = A^{(t)}(R_x, R_y, k, e)$ of $\overline{E}^{(2)}$. Nevertheless, the coefficients $c_i$ are operated on in the clear during the Gaussian elimination, and the adversary can obtain their values.
Some of these coefficients are sensitive. In particular, if the attacker manages to find $c_1$ during the computation of two different signatures $(r, s)$ and $(r', s')$, with respective hashes $e$, $e'$ and coefficient values $c_1$, $c_1'$, he may solve the following system of two equations with two unknowns ($m_0$ and $d$) in $\mathbb{F}_n$:

$$c_1 s = m_0 (e + dr), \qquad c_1' s' = m_0 (e' + dr').$$

Therefore, recovering the value $c_1 = m_0 k \bmod n$ for two different signatures allows an attacker to compute the private key. Finding the interesting values inside the white-box may seem difficult without a reverse engineering step. However, the attack turns out to be easily automated on challenges that use the GMP library, such as Challenge 227. Indeed, one of the coefficients of $s$, say $c_1 = m_0 k$, will be inverted modulo $n$ during the resolution of the system, and finding this inversion is easy when one can simply trace the calls to the function mpz_invert(). The team TheRealIdefix was thus able to efficiently apply this attack on Challenge 227 during the contest without any reverse engineering step.
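The key recovery from two leaked values of $c_1 = m_0 k$ reduces to eliminating $m_0$ from the two relations $c_1 s = m_0 (e + dr)$, which follow from $s = k^{-1}(e + dr)$. The sketch below checks this algebra on toy data. The modulus is assumed to be the NIST P-256 group order, and the values of `r1` and `r2` are arbitrary stand-ins rather than x-coordinates of actual curve points, since only the modular relations matter here. The function name and all test values are hypothetical.

```python
# Group order of NIST P-256 (assumed here; any prime modulus works for the algebra).
N = 0xFFFFFFFF00000000FFFFFFFFFFFFFFFFBCE6FAADA7179E84F3B9CAC2FC632551


def recover_key(c1, sig1, e1, c1_prime, sig2, e2):
    """Solve c1*s1 = m0*(e1 + d*r1) and c1'*s2 = m0*(e2 + d*r2) for d (mod N)."""
    r1, s1 = sig1
    r2, s2 = sig2
    # Cross-multiplying removes m0: c1*s1*(e2 + d*r2) = c1'*s2*(e1 + d*r1).
    num = (c1_prime * s2 * e1 - c1 * s1 * e2) % N
    den = (c1 * s1 * r2 - c1_prime * s2 * r1) % N
    return num * pow(den, -1, N) % N


# Toy instance: secret values the attacker does not know.
d_toy, m0_toy = 0x1234567890ABCDEF, 0xCAFEBABE
k1, k2 = 0x1111111111, 0x2222222223
# Public values; r1 and r2 are arbitrary stand-ins for the x-coordinates of k*G.
r1, e1 = 12345, 1001
r2, e2 = 67890, 2002
s1 = pow(k1, -1, N) * (e1 + d_toy * r1) % N
s2 = pow(k2, -1, N) * (e2 + d_toy * r2) % N
# Values leaked through the traced inversion calls.
c1, c1_prime = m0_toy * k1 % N, m0_toy * k2 % N

recovered = recover_key(c1, (r1, s1), e1, c1_prime, (r2, s2), e2)
```

Once $d$ is known, $m_0$ follows from either relation, although the attacker has no further use for it.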
This attack may seem very specific, but multiplying the nonce with a constant may appear as an easy way to protect the inversion step, and could very well be used by designers. This attack shows it is not a robust countermeasure.

Biased nonces.
There exists a more generic way of breaking Challenge 227. Indeed, the way the ephemeral key is constructed (see Sect. 5.2.1) opens the way for an attack using lattice reduction techniques.
Given that the ephemeral key $k$ is obtained by summing 256 scalars $k_{i,j}$ according to each bit of the input, one can obtain the following signatures by selecting a couple of hashes $(e_0, e_i)$, with $e_0 = 0$ and $e_i = 2^i$:

which allows us to construct 256 equations involving only one of the $k_{i,j}$:

Now, the additional constraint $k < n$ lets us estimate that each $k_{i,j}$ is sampled from $[0, n/256]$. Consequently,

with $|y|_n := \min_{a \in \mathbb{Z}} |y - an|$ denoting the distance of $y \in \mathbb{R}$ to the closest integer multiple of $n$.
We recognize in Eq. (28) an instance of the Hidden Number Problem (HNP) [BV96]. Indeed, we are given many HNP inequalities of the form:

with $t_i = s_i^{-1} r_i - s_0^{-1} r_0$, $u_i = s_0^{-1} e_0 - s_i^{-1} e_i$, and where the hidden number $\alpha$ is the private key $d$. Solving HNP instances in the context of ECDSA given inequalities such as Eq. (29) has been described numerous times in the literature. We refer the reader to [JSSS20] for a more detailed description 7 . In particular, the authors detail the reduction of the HNP instance to a Closest Vector Problem instance in a specific lattice as well as the construction of this lattice.
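The role of $t_i$ and $u_i$ can be made explicit with a short derivation from the signing equation. The notation $k_0$ and $k_i$ for the nonces used when signing $e_0$ and $e_i$ is introduced here for illustration only.

```latex
\begin{align*}
s_i \equiv k_i^{-1}(e_i + d\,r_i) \pmod n
  &\;\Longrightarrow\; k_i \equiv s_i^{-1} e_i + d\,s_i^{-1} r_i \pmod n, \\
k_i - k_0
  &\equiv d\,\bigl(s_i^{-1} r_i - s_0^{-1} r_0\bigr) - \bigl(s_0^{-1} e_0 - s_i^{-1} e_i\bigr) \\
  &\equiv d\,t_i - u_i \pmod n .
\end{align*}
```

Hence any bound on $|k_i - k_0|_n$, such as the one induced by the small range of the scalars $k_{i,j}$, translates directly into an HNP inequality on $|\alpha t_i - u_i|_n$ with $\alpha = d$.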
Finally, we use 75 relations such as Eq. (28) (out of the 255 we can establish) to build a lattice whose reduction allows us to recover the private key d.

DCA.
As explained in Sect. 4, we did not mount this side-channel attack during the contest. With the design of Challenge 227 in hand, we can see that it would have been unsuccessful, at least at the first order, thanks to the linear masking scheme used to protect the whole implementation.

Fault injections.
None of the faults injected on Challenge 227 during the contest were exploitable. This can be explained by the presence of the signature verification that is used to check if the guess for the overflow between $E^{(1)}$ and $E^{(2)}$ is correct. If a fault is induced, the signature is rejected and recomputed.
Of course, a reverse engineering step could be performed to get rid of this verification, but this would be quite time-consuming. Furthermore, even without this verification, the fault attack is still not trivial to perform because of the linear masking scheme. In particular, $R_x$ is not manipulated directly in $E^{(2)}$. It is expressed as a linear combination of the input $A^{(t)}(R_x, R_y, k, e)$, so modifying one of the shares would probably also fault $e$, $k$ or $R_y$, making the resulting faulty signature unexploitable.

Description
Challenge 226 (clever_kare) was the second white-box implementation that we built following the method described in Section 5.2 and including all the additional countermeasures from Section 5.3. While this challenge stood for the longest (35 hours), Challenge 226 achieved the second-highest number of strawberries (11.19) due to its higher time and memory complexity than Challenge 227. Table 4 describes the memory complexity of Challenge 226 in the same way as Table 3 does for Challenge 227.
input variables           2+6       7+6       7+5       5+2
number of components      6         6         5         2
degree                    3         3         4         5
number of coefficients    37 × 6    322 × 6   854 × 5   504 × 2

The size of the final C source code of Challenge 226 is 17.54 MB, the size of the compiled binary is 15.44 MB, and the average running time and RAM consumed are 0.15 seconds and 17.27 MB, respectively. The code obfuscation did not significantly impact the performance of Challenge 226; the running time, the binary size, and the average RAM increased by less than 1%.

Security Analysis
During the WhibOx contest, the team TheRealIdefix did not manage to break Challenge 226 with any of the automated attacks presented in Sect. 4.1. DCA and fault injection, which fail to break Challenge 227, are also not applicable to Challenge 226 since it is designed to be more secure. Moreover, the two attacks presented in Sect. 5.4.2 also fail to recover any secret information.
Hooking shared libraries. As mentioned in Sect. 5.3, Challenge 226 implements an additional countermeasure which consists in multiplying the components of the implicit round functions $T^{(i)}$ with random polynomials in the input variables. Hence, the coefficients of $s$ in the system $\overline{T}^{(t+1)}(u; s, r)$ for the valid input $u$ are no longer fixed multiples of $k$, and the attack cannot be mounted anymore.
Biased nonces. Likewise, the additional countermeasures implemented in Challenge 226 make the lattice attack described in Sect. 5.4.2 fail. The additional variable $k'$ alone would only reduce by 1 bit the bias observed in Eq. (29), and the attack would still be practical. However, without the knowledge of the encoding $L$ introduced in this challenge, it is impossible to exhibit such a bias leading to key recovery.
As explained, none of the automated attacks that the team TheRealIdefix used during the contest were successful on Challenge 226. Nevertheless, with the design in hand, one could easily break this challenge. Indeed, knowing that the matrix $M$ of the last affine encoding $A^{(t)}$ contains small entries, the attacker could, for example:
• Compute two signatures $(r_1, s_1)$ and $(r_2, s_2)$ for two messages $e_1$ and $e_2$ and extract the two valid $\overline{E}^{(2)}$ inputs $u_1 = A^{(t)}(v_1)$ and $u_2 = A^{(t)}(v_2)$ from the execution. Note that $v_1 - v_2$ contains the nonce difference $\kappa = k_1 - k_2$.
• Find $\kappa$ by exhaustive search over $M$; for each guess $M'$, obtain a candidate $v_1 - v_2 = (M')^{-1}(u_1 - u_2)$ and check if one of its entries $\kappa'$ satisfies $(\kappa' G)_x = r_1 - r_2$.
• Solve the equation $\kappa = s_1^{-1}(e_1 + d r_1) - s_2^{-1}(e_2 + d r_2) \bmod n$ in $d$ in order to obtain the secret key.
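The last step of the list above is a single linear equation in $d$, obtained by writing each nonce as $k_j = s_j^{-1}(e_j + d r_j) \bmod n$. The following sketch checks it on toy data; the modulus is assumed to be the NIST P-256 group order, `r1` and `r2` are arbitrary stand-ins for the x-coordinates of the nonce points, and the function name is hypothetical.

```python
# Group order of NIST P-256 (assumed; the algebra only needs a prime modulus).
N = 0xFFFFFFFF00000000FFFFFFFFFFFFFFFFBCE6FAADA7179E84F3B9CAC2FC632551


def key_from_nonce_difference(kappa, sig1, e1, sig2, e2):
    """Solve kappa = s1^-1*(e1 + d*r1) - s2^-1*(e2 + d*r2) (mod N) for d."""
    r1, s1 = sig1
    r2, s2 = sig2
    i1, i2 = pow(s1, -1, N), pow(s2, -1, N)
    num = (kappa - i1 * e1 + i2 * e2) % N
    den = (i1 * r1 - i2 * r2) % N
    return num * pow(den, -1, N) % N


# Toy instance: d_toy, k1 and k2 are the secrets.
d_toy = 0x0123456789ABCDEF0123
k1, k2 = 0x5555555555555, 0x3333333333337
r1, e1 = 424242, 31337   # r1, r2: arbitrary stand-ins for (k*G)_x mod n
r2, e2 = 171717, 27182
s1 = pow(k1, -1, N) * (e1 + d_toy * r1) % N
s2 = pow(k2, -1, N) * (e2 + d_toy * r2) % N

kappa = (k1 - k2) % N  # nonce difference recovered by the exhaustive search
recovered = key_from_nonce_difference(kappa, (r1, s1), e1, (r2, s2), e2)
```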
Therefore, this challenge can be easily broken once reverse-engineered. Nevertheless, such an attack is quite time-consuming, and resisting TheRealIdefix's automated attacks on ECDSA in the white-box contest is already an achievement considering that only 5 challenges resisted these attacks during the contest.

Conclusion
This work describes several attack techniques and designs used in the WhibOx 2021 contest. We explained the attack methods used by the team TheRealIdefix, which broke the most challenges, and we showed the success of each method against all the implementations in the contest. Fault attacks were the most efficient and effective ones; collision and lattice attacks were slightly less efficient, and hooking succeeded against weak implementations only. Among the white-box implementations that resisted these attacks, the one with the highest score was Challenge 226 (clever_kare). This challenge, together with Challenge 227 (keen_ptolemy), was submitted by the team zerokey, and they obtained the second-highest and the highest score in the contest, respectively. In this work, we described the design methodology of these two challenges, which was inspired by the implicit white-box framework.
The large number of implementations broken by our automated attacks and the fact that no challenge survived more than two days show that securing ECDSA in the white-box model is a challenging problem. White-box attacks benefit from the huge progress in side-channel and fault attacks against ECDSA implementations, but not much research has been done on the design part. To this end, our designs provide insightful examples for future works, and our attacks highlight the weak points future research should address.
One of the main challenges specific to white-boxing ECDSA is the conversion from $\mathbb{F}_p$ to $\mathbb{F}_n$. While grey-box countermeasures can protect this step (e.g., with Arithmetic-to-Boolean and Boolean-to-Arithmetic mask conversions), these techniques rely on randomness, which is ineffective in white-box implementations. In particular, the conversion from $\mathbb{F}_p$ to $\mathbb{F}_n$ is one of the weakest points in our designs, and further research in white-boxing the field conversion is needed.

Table 5 presents for each challenge submitted to WhibOx 2021 the successful attacks and the value of the corresponding key. During the contest, we broke 92 out of 97 challenges. The 5 remaining challenges have been broken a posteriori.

B Some Remarks on the Challenges
Among the various submissions, we notice the following facts:
• Challenges 15 and 16 have a very small code size, only 194 bytes! To obtain such tiny implementations, the designers use a fixed nonce $k = 1$ (i.e., $r = G_x$) and a private key $d$ such that $dr \equiv 2^i \bmod n$. In such a case, the signature of a hash $e$ is equal to $(G_x, e + 2^i)$.
• Challenge 114 uses a very small private key, indeed $d_{114} = 5$.
• Some designer teams modify only a few bits of the private key in several challenges (cf. Challenges 174, 185 and 187 for instance). In such a case, if one implementation is broken, then the private keys of the other challenges of the same team could be recovered by brute-force search.
• Despite what is indicated in the rules (cf. Sect. 2), some challenges are not deterministic 9 , i.e., two signatures of the same message could be different. All these challenges use the time() function to obtain some randomness. However, it is easy to hook such calls and return a constant value.
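The first remark above can be checked with a few lines of modular arithmetic. The sketch assumes the NIST P-256 parameters for $n$ and $G_x$ and an arbitrary exponent `i = 7`; the exponents actually used by Challenges 15 and 16 are not reproduced here, and the function name is hypothetical.

```python
# NIST P-256 group order and base-point x-coordinate (assumed curve parameters).
N = 0xFFFFFFFF00000000FFFFFFFFFFFFFFFFBCE6FAADA7179E84F3B9CAC2FC632551
GX = 0x6B17D1F2E12C4247F8BCE6E563A440F277037D812DEB33A0F4A13945D898C296

i = 7                                 # hypothetical exponent
r = GX % N                            # nonce k = 1, so R = G and r = G_x mod n
d = (1 << i) * pow(r, -1, N) % N      # private key chosen so that d*r = 2^i mod n


def sign_with_fixed_nonce(e):
    """Textbook ECDSA signing equation with k = 1: s = e + d*r mod n."""
    return r, (e + d * r) % N


def tiny_sign(e):
    """The shortcut used by such tiny implementations: no secret-dependent arithmetic."""
    return GX % N, (e + (1 << i)) % N


e = 0xDEADBEEF
signature = tiny_sign(e)
```

Both functions return the same pair, which explains how a valid signer fits in a couple of hundred bytes, and why such a design leaks the key immediately: $d = 2^i r^{-1} \bmod n$ can be read off any single signature.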