On Protecting SPHINCS+ Against Fault Attacks

Abstract. SPHINCS+ is a hash-based digital signature scheme that was selected by NIST in their post-quantum cryptography standardization process. A universal forgery on the seminal scheme SPHINCS was shown to be feasible in practice by injecting a fault when the signing device constructs any non-top subtree. Since the attack was made public, little effort has been spent to protect the SPHINCS family against fault attacks. This paper works in this direction in the context of SPHINCS+ and analyzes the current algorithms that aim to prevent fault-based forgeries. First, the paper adapts the original attack to SPHINCS+ reinforced with randomized signing and extends the applicability of the attack to any combination of faulty and valid signatures. Considering the adaptation, the paper then presents a thorough analysis of the attack. In particular, the analysis shows that, with high probability, the security guarantees of SPHINCS+ significantly drop when a single random bit flip occurs anywhere in the signing procedure and that the resulting faulty signature cannot be detected with the verification procedure. The paper shows both in theory and experimentally that the countermeasures based on caching the intermediate W-OTS+ signatures offer only marginally greater protection against unintentional faults, and that such countermeasures are circumvented with a tolerable number of queries in an active attack. Based on these results, the paper recommends that real-world deployments of SPHINCS+ implement redundancy checks.


Introduction
In 2016, the National Institute of Standards and Technology (NIST) started a post-quantum project that aimed to standardize one or more public-key cryptographic schemes in order to complement current cryptosystems (i.e., RSA and ECDSA) with quantum-resistant alternatives. Six years later, after three rounds of meticulous examination, NIST finally delivered a verdict and selected one key encapsulation mechanism along with three digital signature schemes to be standardized while four other schemes advanced to a fourth round of evaluation.
Among the digital signature schemes that were selected for standardization by NIST, SPHINCS+, a stateless hash-based digital signature scheme, was chosen thanks to its unique security assumption [GDD+22], as the entire security of the scheme relies solely on the cryptographic properties of the hash function adopted. Such a characteristic is achieved by committing with said hash function to secret values that are revealed as part of a signature depending on the bit values of the message to be signed.
As SPHINCS+ is going to be standardized, the scheme will need to operate in a real-world environment in which implementations are exposed to common abuses. In particular, faulty behaviors (e.g., data corruption) are of notable interest, since software or hardware failures may accidentally cause a cryptographic scheme to reveal information that, once disclosed, allows an adversary to compromise the security guarantees of the scheme. A classic example of a fault vulnerability is presented in [BDL97], in which Boneh, DeMillo, and Lipton show that a faulty RSA signature along with its message enables the recovery of the signing key. As such faults may occur naturally or be deliberately injected (see [BBB+12] for techniques), studying cryptosystems in the presence of errors is vital to their real-world deployment.
In 2018, Castelnovi, Martinelli, and Prest showed in [CMP18] that SPHINCS, the seminal scheme that led to SPHINCS+, was subject to a critical fault attack, which was experimentally verified by Genêt et al. in [GKPM18]. The attack enables the forgery of a valid signature for any chosen message once both a valid and a faulty signature for the same message have been collected. This is because the scheme involves hash trees which are normally invariant from one execution to another and which are thus signed with one-time signatures. However, the introduction of a fault during their recomputation violates the invariability requirement that is necessary for the security of one-time signatures, as the fault forces the one-time signature to be used a second time, which goes against its intended purpose. This enables the existential forgery of a counterfeit tree that is validly signed by the compromised one-time signature and that can then be used to forge a valid signature for any desired message. Since the attack exploits the design of the scheme rather than a flaw in its implementation, all existing implementations of SPHINCS+ are impacted, as well as all other variations of SPHINCS. For example, Amiet et al. mounted the same attack on a custom hardware implementation of SPHINCS+ in [ALCZ20].
Even though the attack critically impacts the security of the scheme, an effective countermeasure has not been discovered yet. In the work of [CMP18] by Castelnovi, Martinelli, and Prest, the authors failed to find a specific countermeasure and recommend classical redundancy instead. The work by Mozaffari Kermani, Azarderakhsh, and Aghaie in [KAA17] proposes specific error-detection mechanisms in hash function implementations which therefore do not entirely cover the SPHINCS + signing procedure, as well as a generic countermeasure based on recomputing hash trees with swapped nodes (i.e., also redundancy). In [GKPM18], Genêt et al. show that caching the one-time signatures of the hash trees in stateful hash-based signature schemes effectively protects against similar fault-based forgeries. This countermeasure prevents the recomputation of one-time signatures by storing the signatures of the hash trees that can still provide new signatures. However, the authors assert that the same countermeasure applied to stateless schemes is ineffective but do not provide any evidence for the claim. As a result, the extent to which SPHINCS + can be protected against fault attacks with this technique is unclear.

Contributions
The official specifications of SPHINCS+ [HBD+20] present two mechanisms that address fault attacks: randomizing the signing procedure, and including the public key in the signing procedure so the resulting signatures can be verified. The current paper therefore analyzes fault injections against SPHINCS+ in the presence of these two mechanisms. Specifically, the contributions are the following:
• First, the paper expands the universal forgery with fault injections of Castelnovi, Martinelli, and Prest from [CMP18] to SPHINCS+ when any type of faulty signature is obtained. While the core of the attack is identical, the paper shows in particular that the attack is still applicable even if the adversary has collected two non-verifiable faulty signatures of the same W-OTS+ key pair.
• Considering the extension, the paper presents an in-depth analysis of the universal forgery with particular attention to the faulty signature collection. The analysis shows that, for all parameter sets, the number of queries required to circumvent the first mechanism is on average within the limit of $2^{64}$ signatures established by NIST, and that the probability is very high that a random faulty signature is still verifiable, thus defeating the second mechanism.
• The paper then revisits the countermeasures based on caching the W-OTS+ signatures in between the intermediate subtrees as suggested in previous work (see [AE17, GKPM18]) and shows that such countermeasures are ineffective, as an active adversary can always work around the caching system with a tolerable query complexity, and as a random fault still leads to an exploitable faulty signature with a marginally lower probability than without the countermeasure. This analysis is then experimentally verified on the SPHINCS+ reference implementation using the ChipWhisperer framework.
As a consequence of the above points, the paper concludes that SPHINCS+ is extremely sensitive to any kind of fault, that no current solution apart from redundancy effectively protects SPHINCS+ against fault attacks, and therefore that all real-world deployments of SPHINCS+ should implement redundancy checks to mitigate the risk. Lastly, all source code used to derive each result in the paper is made available at https://github.com/AymericGenet/SPHINCSplus-FA. The repository notably features a SPHINCS+ implementation entirely developed in Python, as well as tools to mount the fault attack in practice.

Structure
The current paper is structured as follows: Section 2 gives a high-level overview of the principles of SPHINCS+. Section 3 describes the fault attack on SPHINCS+, which is analyzed in Section 4. Countermeasures are discussed and analyzed in Section 5. Finally, the paper reports experimental results of the countermeasure analyses in Section 6, and concludes with a discussion in Section 7.

Background
Hash-based digital signatures are cryptographic primitives that provide authentication, integrity, and non-repudiation with the sole use of cryptographic hash functions. Three categories of hash-based digital signatures are usually identified: Because classical hash-based digital signatures require to keep track of the number of signatures used, hash-based digital signatures were considered stateful for a long time. This limitation was overcome by the scheme SPHINCS [BHH + 15], the first stateless hash-based digital signature scheme, which achieves practicality along with strong security levels by combining a large tree of MTS on top of a wide layer of FTS. Recently, SPHINCS + [BHK + 19] has been proposed to NIST's post-quantum standardization process as an improved version of SPHINCS.
This section presents a comprehensive summary of SPHINCS + and of all its components. For a thorough description of the SPHINCS + signature scheme, the reader is advised to read the full submission to the NIST standardization process in [HBD + 20].

Definitions and notations
In the original submission of SPHINCS+, different¹ hash functions are used depending on the operation. Table 1 summarizes all the hash functions and pseudorandom functions involved in SPHINCS+. Given a SPHINCS+ signing key $(sk_1, sk_2)$ and public key $(pk_1, pk_2)$, the parameters column includes public and secret seeds (resp., $pk_1$, $pk_2$, and $sk_2$) that make hash function calls unique per key pair, and contextual information (i.e., ADRS, R, opt) that makes hash function calls unique per use. These are sometimes considered implicitly given in the notations. $\mathbb{B}$ denotes the set of bytes (i.e., $\mathbb{B} \cong \{0, 1, \ldots, 255\}$).

Table 1: Hash functions and pseudorandom functions involved in SPHINCS+ (columns: Function, Parameters, Input, Output).

In SPHINCS+, a (binary hash) tree refers to a structure of (hash) nodes in which an initial number of $2^x$ leaf values ($x > 0$) are compressed two by two with H until a single value, referred to as the (tree) root, is reached. The process of hashing nodes two by two until reaching a single node is referred to as treehash (sometimes called as a subroutine, by abuse of notation). An authentication path refers to the nodes that are adjacent to the ones in the path from a leaf to the root and which allow a recomputation of the root (see Figure 1 for an illustration). Finally, an address is represented by a unique bytestring (of size α = 32) which is composed of different fields, including notably the tree index that is used to uniquely address every tree involved in the scheme, and the leaf index to uniquely address all the leaves. When such fields are updated, the resulting addresses are differentiated using subscripts and superscripts (the corresponding field is understood in context).
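The treehash procedure and the role of an authentication path can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: SHA-256 stands in for the tweakable hash H (the actual function also takes $pk_2$ and ADRS), leaf indices start at 0, and the helper names `treehash`, `auth_path`, and `root_from_path` are hypothetical.

```python
import hashlib

def H(left: bytes, right: bytes) -> bytes:
    # Stand-in for H; the real function is also keyed with pk_2 and ADRS.
    return hashlib.sha256(left + right).digest()

def treehash(leaves):
    """Return every level of the tree, from the leaves up to the root."""
    levels = [list(leaves)]
    while len(levels[-1]) > 1:
        cur = levels[-1]
        levels.append([H(cur[i], cur[i + 1]) for i in range(0, len(cur), 2)])
    return levels

def auth_path(levels, index):
    """Collect the siblings of the nodes on the path from a leaf to the root."""
    path = []
    for level in levels[:-1]:
        path.append(level[index ^ 1])  # sibling at the current height
        index >>= 1                    # move to the parent
    return path

def root_from_path(leaf, index, path):
    """Recompute the root from a leaf and its authentication path."""
    node = leaf
    for sibling in path:
        node = H(node, sibling) if index % 2 == 0 else H(sibling, node)
        index >>= 1
    return node

leaves = [hashlib.sha256(bytes([i])).digest() for i in range(8)]  # 2^3 leaves
levels = treehash(leaves)
root = levels[-1][0]
```

Any leaf together with its authentication path recomputes the same root, which is the property on which the public key extractions of the following sections rely.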

FORS
FORS (Forest Of Random Subsets) is a few-time signature scheme which, in SPHINCS + , is used to produce the actual signature of a message digest.
A FORS instance requires the following parameters:
• n : number of bytes of security.
• η : number of bits in a message digest.
• k : number of trees.
• $t = 2^a$ : number of leaves in a tree (of height a).
Note that η = ka.
Key generation. Given a SPHINCS+ signing key $sk_1$ and an address ADRS, the key pair $(sk^F_i, pk^F_i)$ of the $i$-th FORS tree is computed as follows ($1 \le i \le k$):

The overall key pair $(sk^F, pk^F)$ of a FORS is a collection of the keys of $k$ trees:

Signing procedure. Given a FORS signing key $sk^F$, the scheme signs an η-bit digest md with the following procedure:

Public key extraction. Given $\sigma^F = ((s^{(1)}, auth^{(1)}), \ldots, (s^{(k)}, auth^{(k)}))$ bound to a known message md, the public key of the corresponding FORS can be extracted with the following procedure:
1. Split md into chunks $(m_1, \ldots, m_k)$ of $a$ bits.
2. For each chunk $i \in \{1, \ldots, k\}$, recompute the public key $pk^F_i$ of the $i$-th FORS tree from $F(s^{(i)})$ and $auth^{(i)}$.
3. Return $pk^F = T_k(pk_2, ADRS)(pk^F_1, \ldots, pk^F_k)$.
The public key extraction requires a total of $k(a + 1) + 1$ hash calls.
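The first step of the public key extraction, splitting md into $k$ chunks of $a$ bits, can be sketched as follows. The bit ordering (most significant bits first) and the name `split_digest` are assumptions made for illustration; the specification fixes its own convention.

```python
def split_digest(md: bytes, k: int, a: int) -> list:
    """Split the first k*a bits of md into k chunks of a bits each."""
    total_bits = 8 * len(md)
    if k * a > total_bits:
        raise ValueError("digest too short for k chunks of a bits")
    value = int.from_bytes(md, "big") >> (total_bits - k * a)
    mask = (1 << a) - 1
    # Chunk i selects the leaf revealed in the i-th FORS tree.
    return [(value >> (a * (k - 1 - i))) & mask for i in range(k)]

chunks = split_digest(bytes([0b10110100]), k=2, a=4)
```

Each chunk is interpreted as a leaf index in its own tree, so a digest of η = ka bits reveals exactly one secret value per tree.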

W-OTS+
W-OTS + (Winternitz One-Time Signature) is a one-time signature scheme which, in SPHINCS + , is used to authenticate subtrees by signing their roots (each time with a unique key pair).
W-OTS+ uses a chaining pseudorandom function $C^i(pk_2, ADRS)(x)$ which consists of consecutively applying F a specified number of times $i \ge 0$ on an initial input, as illustrated in Figure 2. The position of an element $y$ in the chaining function is referred to as the number $i$ such that $C^i(pk_2, ADRS)(x) = y$. Such a position is updated in ADRS at each hashing step (denoted by $ADRS^{(i)}$ where $0 \le i < W - 1$).
A W-OTS+ instance is parameterized with:
• n : number of bytes of security.
• ω : a (short) window of bits signed at a time.
• $W = 2^\omega$ : the length of the hash chains.
Figure 2: The chaining pseudorandom function.
Signing procedure. Given an XMSS signing key $sk^X$, the scheme signs an n-byte digest msg at leaf index $1 \le \lambda \le 2^{h'}$ with the following procedure:
1. Use $sk^W_\lambda$ to produce $\sigma^W_\lambda$, the W-OTS+ signature of msg.
2. Compute the authentication path $auth_\lambda$ from the leaf index λ.
3. Return $\sigma^X = (\sigma^W_\lambda, auth_\lambda)$.
Note that each leaf index can be used to sign at most one message.
Public key extraction. Given $\sigma^X = (\sigma^W_\lambda, auth_\lambda)$ bound to a known digest msg at leaf index $1 \le \lambda \le 2^{h'}$, the public key of the corresponding XMSS can be extracted with the following procedure:
1. Extract the W-OTS+ public key $pk^W_\lambda$ from $\sigma^W_\lambda$ using the message msg.
2. Recompute the XMSS public key $pk^X$ from $T_\ell(pk_2, ADRS_\lambda)(pk^W_\lambda)$ and $auth_\lambda$.
3. Return $pk^X$.
The public key extraction requires $\ell(W - 1)/2 + 1 + h'$ hash function calls on average, since the procedure depends on the extraction of a W-OTS+ public key.

Hypertree
In the context of SPHINCS + , a hypertree consists of a tree of XMSS key pairs in which the XMSSs above sign the XMSSs below (with respect to a tree with the root at the top).
A hypertree is composed of XMSSs of height $h' = h/d$, where:
• h : total height of the hypertree,
• d : number of layers in the hypertree.
Key generation. Given a SPHINCS+ signing key $sk_1$, the public key of the hypertree consists of the public key of the top-most XMSS $pk^X_{d-1}$ at layer $d - 1$ (addressed at $\tau_{d-1} = 0$) generated with $sk_1$:

Addresses derivation. Due to their structure, the addresses of the subtrees above can be entirely derived from the address of the subtree below. Let $\tau_i$ be the tree index of an XMSS at layer $0 \le i < d - 1$; such an XMSS is signed with the XMSS at tree index $\tau_{i+1}$ and leaf index $\lambda_{i+1}$ derived as follows:

All addresses involved in the above XMSSs are reconstructed from these indices.
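The derivation of the indices can be sketched in Python. The sketch assumes the usual convention in which the $h'$ least significant bits of a tree index select the leaf of the XMSS above and the remaining bits give its tree index; it also assumes 0-based indices, and the function names are hypothetical.

```python
def parent_indices(tree_index: int, h_prime: int):
    """Return (tree index, leaf index) of the XMSS signing the given tree."""
    leaf_index = tree_index & ((1 << h_prime) - 1)  # h' low bits
    return tree_index >> h_prime, leaf_index        # remaining high bits

def path_indices(hyperleaf: int, h_prime: int, d: int):
    """List (tree index, leaf index) for every layer, bottom to top."""
    tree, leaf = parent_indices(hyperleaf, h_prime)
    layers = [(tree, leaf)]
    for _ in range(d - 1):
        tree, leaf = parent_indices(tree, h_prime)
        layers.append((tree, leaf))
    return layers

# Toy hypertree with h = 9 and d = 3, hence h' = 3.
layers = path_indices(0b101110011, h_prime=3, d=3)
```

With these toy parameters, the top-most XMSS ends up at tree index 0, in line with $\tau_{d-1} = 0$.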
Signing procedure. Given a hypertree signing key $sk^{HT}$, the scheme signs an n-byte digest $r$ at hyperleaf index $1 \le \lambda \le 2^h$ with the following procedure:
(b) Generate the XMSS key pair $(sk^X_i, pk^X_i)$ at the address corresponding to $\tau_i$.
(c) Sign $r$ with $sk^X_i$ using $\lambda_i$ as leaf index to produce $\sigma_i$ and update $r$ with $pk^X_i$.
2. Return $\sigma^{HT} = (\sigma_0, \ldots, \sigma_{d-1})$.
Note that each hyperleaf index should be used to sign at most one message.
Public key extraction. Given a hypertree signature $\sigma^{HT} = (\sigma^X_0, \ldots, \sigma^X_{d-1})$ which corresponds to a known digest $r$ at hyperleaf index $1 \le \lambda \le 2^h$, the public key can be extracted with the following procedure:
Extract the XMSS public key $pk^X_i$ from $\sigma^X_i$ using $r$ at the address corresponding to $\tau_i$ and using $\lambda_i$ as leaf index.
(c) Update $r$ with $pk^X_i$.
2. Return the last $r$ computed, i.e., $pk^X_{d-1}$.
The public key extraction requires $d(\ell(W - 1)/2 + 1 + h')$ hash function calls on average.
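As a worked example of the extraction costs stated so far, the following Python sketch evaluates the three formulas. The numerical values (k = 14, a = 12, W = 16, ℓ = 35, h = 63, d = 7) are assumed here to correspond to a 128-bit "small" parameter set and serve only as an illustration; the function names are hypothetical.

```python
def fors_extract_cost(k: int, a: int) -> int:
    # k(a + 1) + 1 hash calls: one F and a authentication steps per tree, plus T_k.
    return k * (a + 1) + 1

def xmss_extract_cost(ell: int, W: int, h_prime: int) -> float:
    # l(W - 1)/2 + 1 + h' hash calls on average.
    return ell * (W - 1) / 2 + 1 + h_prime

def hypertree_extract_cost(d: int, ell: int, W: int, h_prime: int) -> float:
    # d XMSS public key extractions, one per layer.
    return d * xmss_extract_cost(ell, W, h_prime)

# Assumed illustrative parameters; h' = h / d = 63 / 7 = 9.
k, a, W, ell, h, d = 14, 12, 16, 35, 63, 7
h_prime = h // d

fors_cost = fors_extract_cost(k, a)
xmss_cost = xmss_extract_cost(ell, W, h_prime)
hypertree_cost = hypertree_extract_cost(d, ell, W, h_prime)
```

Under these assumptions, the hypertree accounts for the bulk of the hash calls, which is consistent with the fault model considered later in which a fault most likely hits a hash function call.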

SPHINCS+
SPHINCS + is a stateless signature scheme which combines FORSs with a hypertree, as illustrated in Figure 3.
Key generation. The SPHINCS+ signing key consists of two secret n-byte seeds $sk_1$ and $sk_2$ picked uniformly at random:
• $sk_1$ is used to derive the key pairs of all the hash-based instances involved in the scheme.
• $sk_2$ is used to choose a starting FORS in an unpredictable way.
The SPHINCS+ public key $pk_1$ consists of the public key of the hypertree as well as a public seed $pk_2$ that makes hash function calls unique per user.
Verification procedure. Given a SPHINCS+ public key $pk_1$, the scheme verifies that a SPHINCS+ signature $\Sigma = (R, \sigma^F, \sigma^{HT})$ corresponds to the message msg with the following procedure:
1. Compute $(md, ADRS) = H_{msg}(pk_1, R)(msg)$.
2. Extract $pk^F$, the public key of the FORS at ADRS, from md and $\sigma^F$.
3. Extract $pk^{HT}$, the public key of the hypertree at the leaf index given by ADRS, from $pk^F$ and $\sigma^{HT} = (\sigma^X_0, \ldots, \sigma^X_{d-1})$.
4. Return true if $pk^{HT} = pk_1$, false otherwise.

Fault attack
In their original attack in [CMP18], Castelnovi, Martinelli, and Prest present a fault attack that forces a W-OTS+ key pair to sign a corrupted message by injecting a fault during the construction of any non-top subtree. Along with the valid (i.e., non-faulted) signature of the subtree, the resulting W-OTS+ faulty signature is used to compromise the corresponding W-OTS+ key pair under a two-message attack and provide a valid signature for another subtree for which the secrets are known. This process, similar to tree grafting, enables the forgery of an overall signature for any message.
This section expands the fault attack from Castelnovi, Martinelli, and Prest to any combination of valid and faulty signatures obtained.

Attack preliminaries
Target. In the following, we consider a target device which runs any instance of SPHINCS+ with a fixed and unknown signing key, but a known public key. Furthermore, such an instance is assumed to be hardened with randomized signing (as described in Section 2.6) using a source of true randomness.
Adversarial model. The threat model considers an adversary who has access to a number of valid and faulty signatures (along with their messages) produced by the target device. The goal of the adversary is to forge a SPHINCS + signature that verifies any chosen message under the target device's public key.
Fault characteristics. The faulty signatures consist of outputs from the target device when a single unconstrained corruption of one-to-many bits occurs in any value involved in the entire SPHINCS + signing procedure. Such an outcome can happen due to the accidental or intentional effect of, e.g., the target device overheating [BBB + 12], voltage disturbances [BBB + 12], or row-hammer [KDK + 14]. The typical use cases where the fault model is relevant include all scenarios in which a large number of signatures may be queried, such as with embedded devices, or TLS.
Because hash function calls are significantly more costly than other instructions, the fault is further assumed to occur in a hash function call. Moreover, such a fault is assumed to cause the output of the hash function to completely deviate from its intended value and be uniformly drawn at random in the codomain of the hash function. This is aligned with the avalanche property of cryptographic hash functions, in which a single bit flip early in the procedure causes an extremely different output. Besides, even if a bit flip occurs in the output of a hash function, such a bit flip will propagate in subsequent hash function calls and eventually cause uniform outputs (unless, of course, the fault hits the output of the very last hash function call of the hash structure).

Signatures collection
In a first phase, the adversary needs to collect both valid and faulty SPHINCS+ signatures from the target device.
Verifiability. Distinguishing between valid and faulty signatures is not straightforward, as faulty signatures can still verify their message under the right public key. Instead, we differentiate two types of signatures:
1. Verifiable signatures: signatures that still verify their associated message under the public key of the device.
These signatures generally correspond to valid signatures, but can also correspond to faulty signatures for which a fault occurred during the derivation of any node in an authentication path. This property enables the correct rederivation of all the subtree roots that were involved in the signature, as well as a necessarily valid top part.
2. Non-verifiable signatures: signatures that do not verify their associated message under the public key of the device.
While these signatures are necessarily faulty, two further types of non-verifiable signatures can be distinguished:
• Non-verifiable but correct: all W-OTS+ signatures still correspond to actual W-OTS+ values at correct addresses.
This type of signature is obtained when a fault occurs on the path from the leaf to the root of a subtree. No subtree root can be recovered for sure from this kind of signature (unless the layer index at which the fault occurred is known).
• Non-verifiable and incorrect: the W-OTS+ signatures do not correspond to W-OTS+ values.
This type of signature is typically obtained when the entire output is corrupted. These signatures do not divulge any information and need to be discarded.
Fault exploitability. In addition to the above nomenclature, a faulty signature is said to be exploitable when the resulting signature contains a faulty W-OTS + signature which discloses unintentional secret values of the associated W-OTS + key pair. Such an outcome occurs only when a fault hits any non-top subtree (including the ones in FORS).
An exploitable signature alone is not sufficient to compromise a W-OTS + key pair. At least one more signature of the same W-OTS + (such as the valid one) is needed as well. As a result, the next step of the attack aims to determine the compromised W-OTS + s by identifying the different signatures that correspond to a same key pair.
Compromised W-OTS + identification. Once valid and faulty SPHINCS + signatures {Σ i : 0 ≤ i < N } have been collected, the W-OTS + signatures in the SPHINCS + signatures need to be arranged by layer and address: Map all the W-OTS + signatures in Σ i to their respective layer and ADRS.
If two or more different W-OTS+ signatures are mapped to the same ADRS at the end of the arrangement, then the corresponding W-OTS+ key pair is said to be compromised. In this case, such a collection of W-OTS+ signatures is referred to as the faulty W-OTS+ signatures and is denoted by $(\hat{\sigma}^{(i)})_{i=0}^{M}$, while their respective full SPHINCS+ signatures are denoted by $(\hat{\Sigma}^{(i)} : \hat{\sigma}^{(i)} \in \hat{\Sigma}^{(i)})_{i=0}^{M}$. We denote their layer index² by $l^* \in \{0, \ldots, d\}$, and denote their address by ADRS*. Finally, we refer to all layers below (resp. above) the faulted layer as the bottom part (resp. as the top part) of the hypertree, as illustrated in Figure 4.
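The arrangement by layer and address can be sketched with a dictionary. The sketch assumes that each collected SPHINCS+ signature has already been parsed into a list of `(layer, adrs, wots_signature)` tuples, which is a hypothetical representation chosen for illustration.

```python
from collections import defaultdict

def find_compromised(signatures):
    """Group W-OTS+ signatures by (layer, ADRS) and keep the compromised ones.

    `signatures` is an iterable of parsed SPHINCS+ signatures, each given as
    a list of (layer, adrs, wots_signature) tuples with hashable entries.
    """
    seen = defaultdict(set)
    for signature in signatures:
        for layer, adrs, wots in signature:
            seen[(layer, adrs)].add(wots)
    # A key pair is compromised once it has produced two different signatures.
    return {key: sorted(values) for key, values in seen.items() if len(values) > 1}

valid = [(1, b"A", b"sig-root"), (2, b"B", b"sig-top")]
faulty = [(1, b"A", b"sig-root"), (2, b"B", b"sig-top-faulted")]
compromised = find_compromised([valid, faulty])
```

In this toy run, only the W-OTS+ key pair at layer 2 has signed two different values, so it is the only one reported as compromised.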

Faulty signatures processing
The next step processes the faulty SPHINCS + signatures identified in the previous section to extract the information that enables the universal forgery.

Secret values identification.
As the elements in a W-OTS + signature correspond to secret values associated to chunks of ω bits (see Section 2.3), the following process aims to identify the value of the chunks that are associated to each element.
Such a process depends on the types of signatures obtained: • Case 1: At least one verifiable signature is available.
Given a verifiable signature, the correct public key of the compromised W-OTS + can be extracted from the SPHINCS + signature (see Section 2.3). Note that the integrity of the extracted public key must be preserved even when its corresponding subtree was faulted, as the signature verifies the extracted key under the correct SPHINCS + public key.
The extracted W-OTS+ public key can then be used to identify all the secret values in the other signatures, including the non-verifiable (but correct) ones. Strictly speaking, given the W-OTS+ public key $pk^W = (p_1, \ldots, p_\ell)$ and any type of W-OTS+ signature $\hat{\sigma}^W = (\hat{\sigma}_1, \ldots, \hat{\sigma}_\ell)$, the secret values are identified with the following exhaustive search:

If no value leads to the W-OTS+ public key element, then $\hat{\sigma}^W$ is incorrect.

Complexity. Extracting the public key of the compromised W-OTS+ is equivalent to running a truncated SPHINCS+ verification procedure with $l^*$ layers (see Section 2.6), which therefore amounts to an average number of hash function calls of:

Identifying the position that corresponds to $\hat{\sigma}_j$ requires $W - 1 - x$ applications of F for a hypothesized initial position $0 \le x \le W - 1$ until the resulting value equals $p_j$. As each value occurs with probability $1/W$, the average number of hash function calls is:

$$\sum_{x=0}^{W-1} \frac{W - 1 - x}{W} = \frac{W - 1}{2}.$$

As there are ℓ blocks in each $\hat{\sigma}^W$, the overall number of hash function calls for this case is $\ell(W - 1)/2$.
• Case 2: Only non-verifiable signatures are available.
Since none of the subtree roots can be recovered for sure, the adversary cannot extract the compromised W-OTS + public key from the non-verifiable signatures. However, the adversary can determine the positions of each W-OTS + value by using one value as a reference for the other.
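In other words, given a pair of different W-OTS+ signatures, i.e., $(\hat{\sigma}^{(0)}, \hat{\sigma}^{(1)})$ where $\hat{\sigma}^{(0)} = (\hat{\sigma}^{(0)}_1, \ldots, \hat{\sigma}^{(0)}_\ell)$ and $\hat{\sigma}^{(1)} = (\hat{\sigma}^{(1)}_1, \ldots, \hat{\sigma}^{(1)}_\ell)$, there are two possibilities for each index $1 \le i \le \ell$:

• $\hat{\sigma}^{(0)}_i \neq \hat{\sigma}^{(1)}_i$: in this case, if both signatures are correct, then one value can be recomputed from the other by applying the chaining pseudorandom function from a position $u$ up to a position $v$.
The above property enables confirming guesses on $u$ and $v$, which directly leads to the $i$-th ω-bit chunk of both roots, since the hash applications use different addresses at each step of the chaining pseudorandom function. As a result, the values are extracted as follows:
1. Create the next ω-bit chunks $u$, $v$ respectively corresponding to $\hat{\sigma}^{(0)}_i$ and $\hat{\sigma}^{(1)}_i$ [...]
If no such $u$ and $v$ exist, then at least one of the signatures is incorrect.
Complexity. Supposing that all chunks are uniformly distributed, the probability that [...]. Since, for a fixed $u$, the exhaustive search on $v$ can apply F on the previous hash result, the average number of hash calls is:

• $\hat{\sigma}^{(0)}_i = \hat{\sigma}^{(1)}_i$: in this case, if both signatures are correct, then the two values must correspond to the same ω-bit chunk, but of unknown position in the chaining pseudorandom function. Another signature with a different value at index $i$ is required to identify the value of the chunks.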
If there are still chunks of unknown positions at the end of the secret values identification, such positions can be retrieved while extracting the top part of the SPHINCS + signature (see below).
An illustration for the two possibilities forσ in a chaining pseudorandom function is shown in Figure 5. The above process is applied to all faulty W-OTS + signatures in order to retrieve as many secret values as possible to forge a signature for a variety of different ω-bit chunks.
Note that since the values at lower positions in the chaining function enable the recomputation of values at higher positions, the identification of W-OTS+ secret values can keep track of the values at the lowest positions only. As a result, in the following, we refer to the lowest positions learnt by the secret extraction as the most secret elements, which are denoted by $(\hat{\theta}_1, \ldots, \hat{\theta}_\ell)$ and are respectively associated to the ω-bit chunks $(\hat{b}_1, \ldots, \hat{b}_\ell)$.
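Both cases of the secret values identification can be sketched in Python on a single hash chain. This is a minimal illustration under stated assumptions: a truncated SHA-256 stands in for F, the chain position is appended as one byte in place of the ADRS update, W = 16, and the names `position_from_public` and `positions_from_pair` are hypothetical.

```python
import hashlib

W = 16  # chain length for a window of 4 bits (illustrative)

def F(seed: bytes, adrs: bytes, position: int, x: bytes) -> bytes:
    # Toy stand-in for F, bound to the seed, the address, and the chain position.
    return hashlib.sha256(seed + adrs + bytes([position]) + x).digest()[:16]

def chain(x: bytes, start: int, steps: int, seed: bytes, adrs: bytes) -> bytes:
    # Move x from chain position `start` to position `start + steps`.
    for position in range(start, start + steps):
        x = F(seed, adrs, position, x)
    return x

def position_from_public(sig_elem, pk_elem, seed, adrs):
    """Case 1: find x such that W-1-x applications of F map sig_elem to pk_elem."""
    for x in range(W):
        if chain(sig_elem, x, W - 1 - x, seed, adrs) == pk_elem:
            return x
    return None  # no position fits: the element is incorrect

def positions_from_pair(s0, s1, seed, adrs):
    """Case 2: find the positions of two different elements of the same chain."""
    for u in range(W):
        x0, x1 = s0, s1
        for v in range(u + 1, W):
            # Reuse the previous hash result: one more step from v-1 to v.
            x0 = F(seed, adrs, v - 1, x0)
            x1 = F(seed, adrs, v - 1, x1)
            if x0 == s1:
                return u, v  # s0 sits at position u, s1 at position v
            if x1 == s0:
                return v, u  # s1 sits at position u, s0 at position v
    return None  # no consistent pair of positions

seed, adrs = b"public-seed", b"address"
secret = b"\x00" * 16
pk_elem = chain(secret, 0, W - 1, seed, adrs)
low = chain(secret, 0, 3, seed, adrs)    # element for chunk value 3
high = chain(secret, 0, 7, seed, adrs)   # element for chunk value 7
```

In the second function, equal elements yield no result, which matches the situation in which another signature with a different value at the same index is required.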
Top part extraction. Extracting a valid top part is required so that the verification of the forged SPHINCS + signature leads to the target device's public key.
The extraction considers the case in which multiple top parts are available due to our fault model. Under these circumstances, all the top parts available need to be tried out starting from the compromised W-OTS + public key until one leads to the target device's public key.
In case there were still ω-bit chunks of unknown values at the end of the secrets extraction, such chunks may be identified during this part by guessing all of the unknown chunks at once, deriving the corresponding W-OTS + public key, and trying this public key with the above steps. Such a process both confirms the values of the unknown chunks, and extracts a valid top part of the signature.
Complexity. Verifying that a selected top part is valid requires a truncated SPHINCS+ verification procedure starting from the compromised W-OTS+. Along with all the top parts available, there may be chunks that need to be exhaustively searched in case no verifiable signature was available (see above). Supposing that the blocks are uniformly distributed, the probability that, for a fixed index $1 \le j \le \ell$, all W-OTS+ elements $\hat{\sigma}^{(i)}_j$ are the same is $1/W^{M-1}$. Therefore, on average, the number of chunks of unknown value is:

Given $pk^W = (p_1, \ldots, p_\ell)$, each trial requires one application of $T_\ell$ and the recomputation of the root of the XMSS right above the faulted layer, as well as the full public key extraction of $(d - 1) - l^*$ XMSS public keys. Therefore, the average number of hash function calls is:

Tree grafting
Once the most secret values of a compromised W-OTS+ key pair have been successfully extracted from the faulty signatures, the adversary aims to graft a subtree (or a forest) to the extracted top part, i.e., find another XMSS (or FORS) for which a valid W-OTS+ signature can be forged in order to spoof the compromised instance at its own address. During this step, the adversary attempts to sign the root of a forged FORS or XMSS with the W-OTS+ secret values at disposal. Let $(\hat{\theta}_1, \ldots, \hat{\theta}_\ell)$ be the most secret W-OTS+ values extracted, which correspond to the ω-bit chunks $(\hat{b}_1, \ldots, \hat{b}_\ell)$; the grafting procedure repeats the following until successful:
1. Draw $sk' \in \mathbb{B}^n$ uniformly at random.
2. If $l^* = 0$, create a FORS of public key $r'$ with $sk'$ at ADRS* (see Section 2.2); otherwise, create an XMSS of public key $r'$ with $sk'$ at ADRS* (see Section 2.4).
3. Split $r'$ and its checksum into [...]

Once found, the secret key of the grafted subtree is $sk'$ and its signature is:

Complexity. The tree grafting depends on the layer hit:
• In case a FORS needs to be forged ($l^* = 0$), the public key derivation amounts to $k$ generations of FORS trees, each of them requiring a treehash procedure of height $\log_2(t) = a$, in addition to a final application of $T_k$ with all the FORS tree roots:
• In case an XMSS needs to be forged ($1 \le l^* \le d - 1$), the public key derivation amounts to $2^{h'}$ W-OTS+ public key generations and a treehash procedure of height $h'$:

The probability that one attempt is successful is given by an extension of the work of Bruinderink and Hülsing in [BH17]. Given $M > 1$ different W-OTS+ signatures, supposing that the chunks $(b_1, \ldots, b_\ell)$ are uniformly³ distributed, each chunk $x$ occurs with probability $1/W$ and enables the forgery of all chunks from $x$ to $W - 1$. Thus, the overall probability that the root of a forged XMSS can be signed is:
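The success condition of a grafting attempt and its probability can be sketched in Python. The sketch is a reconstruction under the uniformity hypothesis stated above, with all ℓ chunks treated as independent, and is not necessarily the exact expression used in the paper; the function names are hypothetical.

```python
from fractions import Fraction

def can_forge(target_chunks, lowest_known_chunks):
    """True if every target chunk lies at or above the lowest known position."""
    return all(t >= b for t, b in zip(target_chunks, lowest_known_chunks))

def chunk_success_probability(W: int, M: int) -> Fraction:
    """P(min of M uniform chunks <= a fresh uniform chunk), values in 0..W-1."""
    # For a fresh chunk x, all M known chunks exceed x with probability
    # ((W - 1 - x) / W)^M, in which case the chunk cannot be forged.
    return sum(1 - Fraction(W - 1 - x, W) ** M for x in range(W)) / W

def attempt_success_probability(W: int, M: int, ell: int) -> Fraction:
    # Heuristic: the ell chunks (checksum included) are taken as independent.
    return chunk_success_probability(W, M) ** ell

p_one = chunk_success_probability(16, 1)
p_two = chunk_success_probability(16, 2)
```

The per-chunk probability grows with the number M of different signatures collected, which is why additional faulty signatures of the same W-OTS+ key pair reduce the expected number of grafting attempts.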

Path seeking
The SPHINCS+ signing procedure follows a path in the hypertree depending on the message and a value R. As a result, the adversary needs to find an adequate value R that makes the forged signature visit the compromised subtree. Given the message msg′ to be maliciously signed, the value R is brute-forced until the corresponding tree index at layer $l^*$ is the same as that of the grafted subtree:
1. Draw $R' \in \mathbb{B}^n$ uniformly at random.
2. Check that the hypertree leaf index in $(\_, ADRS) = H_{msg}(pk_1, R')(msg')$ leads to the tree index of the grafted subtree (see Section 2.5).
While a single grafted subtree allows the adversary to forge valid SPHINCS + signatures for as many messages as desired, note that path seeking depends on the message and therefore needs to be repeated for each new message.
Complexity. Finding $R'$ is equivalent to an exhaustive search over $n$ bytes such that the $h - h' l^*$ most significant bits of the tree index give the index of the grafted subtree (see Section 2.5). Each trial requires only a single hash function application, and its probability of success is $2^{-(h - h' l^*)}$. Consequently, the adversary requires $2^{h - h' l^*}$ hash function calls on average to find an appropriate value for $R'$.
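The brute-force search can be illustrated with a toy Python sketch. Here `toy_tree_index` is a hypothetical stand-in for the tree-index part of the message hash: it uses SHAKE256 from `hashlib` with an arbitrary input ordering and keeps only the top `bits` bits, so it does not reproduce the actual SPHINCS+ encoding. The seed, message and target index are made-up values.

```python
import hashlib
import random

def toy_tree_index(pk_seed: bytes, r: bytes, msg: bytes, bits: int) -> int:
    """Stand-in for the tree index derived from the message hash.

    Assumption: the index is the top `bits` bits (1 <= bits <= 64) of a
    SHAKE256 digest of R || pk_seed || msg.
    """
    digest = hashlib.shake_256(r + pk_seed + msg).digest(8)
    return int.from_bytes(digest, "big") >> (64 - bits)

def seek_path(pk_seed: bytes, msg: bytes, target_index: int, bits: int,
              n: int = 16, rng: random.Random = None):
    """Draw random n-byte values R' until the path hits `target_index`."""
    rng = rng or random.Random()
    trials = 0
    while True:
        trials += 1
        r = bytes(rng.getrandbits(8) for _ in range(n))
        if toy_tree_index(pk_seed, r, msg, bits) == target_index:
            return r, trials

if __name__ == "__main__":
    r, trials = seek_path(b"public-seed", b"malicious message",
                          target_index=0x5A, bits=8, rng=random.Random(1))
    print(f"found R' after {trials} hash calls (expected about 2**8)")
```

Each loop iteration costs one hash call and succeeds with probability 2^−bits, so the number of trials is geometric with mean 2^bits. In the attack, `bits` would play the role of h − h′l*.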

Universal forgery
Piecing everything together, the adversary uses the grafted subtree and the value $R'$ to forge the bottom part of the signature, then plugs the extracted top part onto the forged part to craft a valid signature for the malicious message (selected in Section 3.4). The procedure goes as follows:
1. Generate arbitrary key pairs to forge $(\sigma^{\prime F}, \sigma^{\prime X}_{0}, \ldots, \sigma^{\prime X}_{l^*-1})$, i.e., all the signatures in the layers below the grafted subtree (see Section 2.2 and Section 2.5).
2. Sign $\sigma^{\prime X}_{l^*-1}$ with the grafted XMSS at address $\mathrm{ADRS}^*$ using $sk'$ (see Section 2.4).
3. Copy the top part for the rest of the signatures.
The final signature that verifies $\mathit{msg}'$ under the device's public key is therefore:
Footnote 3: We call attention to the fact that the uniformity hypothesis on the blocks $(b_{\ell_1+1}, \ldots, b_{\ell_1+\ell_2})$ is not rigorous, as these blocks are actually sums of uniform random variables. However, simulations in [CMP18, GKPM18] show that such a discrepancy is tolerable for our use cases.
The average computational complexity of each step in the universal forgery is shown in Table 2 for all SPHINCS+ parameter sets. These numbers suggest that the fault attack is feasible in all scenarios, although the number of required hashes varies significantly depending on the specific layer targeted by the attack. Even though the reported numbers seem high, the overall number of required hashes remains attainable in practice (see footnote 4). This result is especially important as it means that the fault attack can be successful even if the fault is uncontrolled. The latter case is analyzed in the next section.

Attack analysis
This section analyzes the fault attack described in Section 3.

Fault analysis
Since our fault model considers that faults only affect the results of hash functions, the following counts the number of hash function calls in the entire SPHINCS + signing procedure to determine the proportion of calls that, when faulted, lead to an exploitable or a verifiable faulty signature.
- A non-verifiable but correct signature is obtained when a fault hits any value on the path from a leaf to the root of a FORS tree. The values in a path consist of the secret leaf derivation, in addition to a single node in all levels of a tree, and the computation of the FORS public key. As there are $k$ trees of $t = 2^a$ leaves, this amounts to a total number of non-verifiable but correct signatures of: $k\,(1 + (a + 1)) + 1 = k(a + 2) + 1$.
• Signature verifiability: The verifiability of the resulting signature depends on the location of the fault in the subtree:
- A verifiable signature is obtained when a fault hits any value involved in the authentication path of a non-top XMSS. Such an authentication path starts with the derivation of $2^{h'} - 1$ W-OTS+ public keys, as well as the computation of $2^{h'-i} - 1$ nodes at each level $1 \le i \le h'$ of the subtree. Every W-OTS+ public key requires the derivation of ℓ secret values, each of them chained $W - 1$ times with the chaining pseudorandom function, so that all the results can be compressed with $T_\ell$. This amounts to a total number of verifiable signatures of:
- A non-verifiable but correct signature is obtained when a fault hits any value on the path from a leaf to the root of a non-top XMSS. The values in a path consist of a single W-OTS+ public key, in addition to a single node in all levels of a tree. As above, the W-OTS+ public key requires the derivation of ℓ secret values, each of them chained $W - 1$ times with the chaining pseudorandom function, so that all the results can be compressed with $T_\ell$. Since there are $h'$ levels, this amounts to a total number of non-verifiable but correct signatures of:
5. XMSS signature at the top layer (i.e., $l^* = d$).
• Signature verifiability: The resulting signature is non-verifiable but correct, as the reconstruction of this XMSS does not lead to the SPHINCS+ public key. All the W-OTS+ signatures involved are valid; however, no valid top part can be extracted.
Summing up the hash function calls of all the components above, the grand total of hash function calls in a single SPHINCS+ signature is therefore given by:
Suppose that a fault can hit any hash function call uniformly at random. The above enumerations lead to the following probabilities:
Fault exploitability. The probability that the faulty signature is exploitable is given by the proportion of faulty signature outcomes that lead to an exploitable signature:

Fault verifiability.
Similarly, the probability that the faulty signature is verifiable is given by the proportion of faulty signature outcomes that lead to a verifiable signature:
Layer hit. Let $L = l^*$ denote the event that a fault has affected $\sigma^{X}_{l^*}$ (i.e., that a hash function call in layer $l^* - 1$ is hit by a fault). The probability that $L = l^*$ is therefore given by the total number of hash function calls at layer $l^* - 1$ divided by the grand total:
Table 4 computes the above probabilities for all SPHINCS+ parameter sets. This table shows that the probability that a random fault leads to a faulty signature that is both exploitable and verifiable is high.

Universal forgery analysis: one-fault model
This section analyzes the use case where the adversary has access to many valid signatures (i.e., $M_v > 1$) but only a single faulty one (i.e., $M_f = 1$), which is assumed to be exploitable and to correspond to layer $0 \le l^* < d$. Let $N = 2^{h - h' l^*}$ be the total number of W-OTS+ key pairs on layer $l^*$.
Collecting the corresponding valid signature. The probability that the valid signature corresponding to the same key pair as the faulty signature is included in the $M_v$ collected signatures is simply given by: $1 - (1 - 1/N)^{M_v}$.
Alternatively, the expected number of valid queries needed to obtain the corresponding valid signature is the mean of a geometric random variable with probability $1/N$, that is, $N$.
Table 5 computes the expected number of valid queries required to mount the universal forgery for each SPHINCS+ parameter set. In most cases, this average is lower than the limit of $2^{64}$ signature queries in NIST's security definition for digital signatures (see [NIS16]).
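As a small worked example of the one-fault model, the following Python sketch evaluates the probability that the matching valid signature is among $M_v$ uniformly distributed valid signatures, together with the mean of the corresponding geometric variable. The function names are illustrative, and `log1p`/`expm1` are used only to keep the computation accurate when N is as large as 2^64.

```python
import math

def num_keypairs(h: int, h_prime: int, layer: int) -> int:
    """N = 2^(h - h' * l*), the number of W-OTS+ key pairs on layer l*."""
    return 2 ** (h - h_prime * layer)

def prob_valid_collected(n_keypairs: int, m_valid: int) -> float:
    """1 - (1 - 1/N)^M_v, evaluated in a numerically stable way."""
    if n_keypairs == 1:
        return 1.0 if m_valid > 0 else 0.0
    return -math.expm1(m_valid * math.log1p(-1.0 / n_keypairs))

def expected_valid_queries(n_keypairs: int) -> float:
    """Mean of a geometric random variable with success probability 1/N."""
    return float(n_keypairs)

if __name__ == "__main__":
    # Example: h = 64, h' = 8 (a 256s-like hypertree), fault at layer l* = 7.
    n = num_keypairs(64, 8, 7)
    print(n, prob_valid_collected(n, 512), expected_valid_queries(n))
```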

Universal forgery analysis: multiple-fault model
This section analyzes the use case where the adversary has access to multiple valid and faulty signatures (i.e., $M_v > 1$, $M_f > 1$), all assumed to be exploitable and distinct, and all corresponding to the same layer $0 \le l^* < d$. Let $N = 2^{h - h' l^*}$ be the total number of W-OTS+ key pairs on layer $l^*$. Also, let $\left\{ {a \atop b} \right\}$ denote the Stirling number of the second kind, which counts the number of ways to partition $a$ objects into $b$ non-empty subsets.

Faulty signatures collision.
As the universal forgery can be mounted with only faulty signatures, the probability that at least two of the $M_f$ faulty signatures correspond to the same W-OTS+ key pair is an instance of the birthday paradox [FGT92]: $1 - \prod_{i=1}^{M_f - 1} (1 - i/N)$.
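The birthday-type collision probability can be evaluated numerically with the short Python sketch below. It assumes that each faulty signature hits one of the N key pairs uniformly and independently; the function name is illustrative.

```python
import math

def prob_faulty_collision(n_keypairs: int, m_faulty: int) -> float:
    """Probability that at least two of `m_faulty` signatures share a key pair.

    Computes 1 - prod_{i=1}^{M_f - 1} (1 - i/N) in the log domain.
    """
    if m_faulty > n_keypairs:
        return 1.0  # pigeonhole: a collision is certain
    log_no_collision = sum(math.log1p(-i / n_keypairs)
                           for i in range(1, m_faulty))
    return -math.expm1(log_no_collision)

if __name__ == "__main__":
    # N = 2^16 key pairs and about one third of 1,024 queries being faulty.
    print(f"{prob_faulty_collision(2 ** 16, 341):.4f}")
```

For N = 2^16 and $M_f$ = 341, the sketch gives a probability close to 0.588, so a few hundred faulty signatures already make a collision more likely than not.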

Pair of valid and faulty signatures.
Combining the faulty signatures with the valid ones, the probability that a faulty signature corresponds to the same W-OTS+ key pair as a valid signature is an instance of the occupancy problem with two types of balls [NS88]:
Table 6 computes the above probabilities with $N = 256$ (i.e., when $l^* = d - 1$ for the 128s, 192s, and 256s parameter sets of SPHINCS+, or $l^* = d - 2$ for SPHINCS+-256f). This table shows that randomness plays in the adversary's favor, as only very few faulty queries are required to break a W-OTS+. This number drops even lower when combined with very few valid queries.
Increasing the number of faulty signatures. While a single pair of different W-OTS+ signatures corresponding to the same key pair is enough to mount the universal forgery, the grafting step becomes easier as more faulty W-OTS+ signatures are obtained for the same key pair (see Section 3.3). To study this, notice that collecting $M_f$ faulty signatures from $N$ key pairs can be modeled as a multinomial distribution with uniform probabilities (i.e., $p_k = 1/N$ for $1 \le k \le N$).
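The probability of pairing a faulty signature with a valid one, discussed earlier in this subsection, can also be computed without Stirling numbers. The Python sketch below is one possible formulation under the assumption that all signatures land on key pairs uniformly and independently: it first builds the distribution of the number of distinct key pairs hit by the faulty signatures, then averages the probability that none of the valid signatures falls on one of them. It is an equivalent reading of the event, not necessarily the closed form used in the paper.

```python
def distinct_count_distribution(n: int, m: int) -> list:
    """dist[j] = P(m uniform draws over n key pairs hit exactly j of them)."""
    dist = [1.0] + [0.0] * min(n, m)
    for _ in range(m):
        new = [0.0] * len(dist)
        for j, p in enumerate(dist):
            if p == 0.0:
                continue
            new[j] += p * j / n                 # draw lands on a known key pair
            if j + 1 < len(dist):
                new[j + 1] += p * (n - j) / n   # draw lands on a fresh key pair
        dist = new
    return dist

def prob_valid_faulty_pair(n: int, m_faulty: int, m_valid: int) -> float:
    """P(at least one valid signature shares a key pair with a faulty one)."""
    dist = distinct_count_distribution(n, m_faulty)
    no_pair = sum(p * (1.0 - j / n) ** m_valid for j, p in enumerate(dist))
    return 1.0 - no_pair

if __name__ == "__main__":
    print(f"{prob_valid_faulty_pair(2 ** 16, 341, 683):.4f}")
```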
The probability that at least one W-OTS+ key pair has been reused $c$ times is an instance of the maximal frequency in a multinomial distribution. Let $s_k$ denote the accumulated number of W-OTS+ signatures counting from the first W-OTS+ key pair to the $k$-th key pair (so $s_0 = 0$ and $s_N = M_f$). Then, from the analysis by Corrado in [Cor11], the transition probability from $s_{k-1}$ to $s_k$ is given by:
Given the above probabilities, the stochastic matrix that determines the transitions from $s_{k-1}$ to $s_k$ is defined as follows:
Let $\tilde{Q}_k$ be the result of culling the transition probabilities that assign more than $c$ signatures to a key pair (i.e., by setting $P(s_k - s_{k-1} > c) = 0$ for the relevant $s_k, s_{k-1}$) and let $\tilde{Q}_1^{(1)}$ be the first row of $\tilde{Q}_1$. The probability that the maximum load is no more than $c$ is given by the transition from $s_1$ to $s_N$, which is determined by the following product of stochastic matrices:
Alternatively, the expected maximum load given $M_f$ faulty signatures is:
Combining this result with the valid signatures, notice that the maximum load is increased by one by collecting the valid signature of the W-OTS+ for which the maximum load is reached. As such an event can be modeled as a Bernoulli random variable with probability $1 - (1 - 1/N)^{M_v}$, we ultimately have:
Table 7 computes the average maximum load with $M_f$ signatures for the values of $N$ that correspond to the first few top layers of the SPHINCS+ parameter sets, as increasing the number of signatures is especially relevant when targeting such layers.
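The maximum-load computation can be sketched in Python as a dynamic program over the key pairs. The sketch assumes the usual sequential decomposition of a uniform multinomial, in which the load of key pair k, given the loads of the previous ones, is binomial over the remaining signatures with probability 1/(N − k + 1); loads above c are discarded at every step, which mirrors the culling described above. The function names are illustrative, and the code is only practical for small N (such as N = 256).

```python
import math

def prob_max_load_at_most(n: int, m: int, c: int) -> float:
    """P(no key pair receives more than c of the m signatures)."""
    dist = {0: 1.0}  # dist[s] = P(first k key pairs hold s signatures, loads <= c)
    for k in range(1, n + 1):
        p = 1.0 / (n - k + 1)  # share of key pair k among the remaining ones
        new = {}
        for s, prob in dist.items():
            remaining = m - s
            for j in range(min(c, remaining) + 1):
                step = (math.comb(remaining, j)
                        * p ** j * (1.0 - p) ** (remaining - j))
                if step > 0.0:
                    new[s + j] = new.get(s + j, 0.0) + prob * step
        dist = new
    return dist.get(m, 0.0)

def expected_max_load(n: int, m: int) -> float:
    """E[max load] = sum over c >= 0 of P(max load > c)."""
    total = 0.0
    for c in range(m):
        tail = 1.0 - prob_max_load_at_most(n, m, c)
        if tail < 1e-15:  # remaining terms are negligible
            break
        total += tail
    return total

if __name__ == "__main__":
    print(f"{expected_max_load(256, 16):.3f}")
```

A useful sanity check is that 1 − `prob_max_load_at_most(n, m, 1)` coincides with the birthday-type collision probability for the same n and m.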
Layer coverage. The probability that the collected valid signatures cover the entire layer (in which case all valid signatures are known) is an instance of the coupon collector's problem [FGT92]: $N! \left\{ {M_v \atop N} \right\} / N^{M_v}$.
Alternatively, the expected number of valid queries to cover the entire layer is given by: $N \sum_{k=1}^{N} 1/k$.
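Both coupon-collector quantities are easy to evaluate exactly for small layers. The Python sketch below uses the standard inclusion-exclusion form of the coverage probability and the harmonic-number form of the expectation, with exact rational arithmetic to avoid cancellation; the function names are illustrative and uniform, independent queries are assumed.

```python
from fractions import Fraction
from math import comb

def prob_layer_covered(n: int, m_valid: int) -> float:
    """P(m_valid uniform queries hit all n key pairs), by inclusion-exclusion."""
    total = sum((-1) ** i * comb(n, i) * Fraction(n - i, n) ** m_valid
                for i in range(n + 1))
    return float(total)

def expected_queries_to_cover(n: int) -> float:
    """Coupon collector expectation N * H_N."""
    return float(n * sum(Fraction(1, k) for k in range(1, n + 1)))

if __name__ == "__main__":
    print(f"{expected_queries_to_cover(256):.2f}")
    print(f"{prob_layer_covered(16, 64):.4f}")
```

For a layer of N = 256 key pairs, the expectation is about 1,568 valid queries.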

Caching countermeasures analysis
In order to prevent faulty signatures from being collected, the W-OTS+ signatures computed throughout a SPHINCS+ signing procedure can be cached (i.e., stored in memory, sometimes temporarily, and then retransmitted without recomputation when requested). Such a process not only prevents accidental faulty recomputations of a W-OTS+ signature, but also improves the performance of the SPHINCS+ signature generation. Notice also that valid W-OTS+ signatures are leakage-agnostic, so the cache can be shared with verifiers (in a read-only fashion).
In this section, we consider two different strategies of caching W-OTS + s: caching layers and caching branches.

Caching layers
This strategy, originally proposed in Gravity-SPHINCS [AE17], consists of caching all the W-OTS+s within one or more layers (starting from the top layer). Since the cache is not updated with new signature requests, it is static and can therefore be added to the public key. This strategy increases the complexity of the key generation by a factor of $\sum_{i=0}^{c} 2^{h' i} = (2^{c h' + h'} - 1)/(2^{h'} - 1)$.
• The new signing procedure derives the XMSS signatures for the cached layers by using the W-OTS+ signatures and public keys from the cache. An $n$-byte digest msg at hyperleaf index $1 \le \lambda \le 2^{h}$ is therefore signed as follows:
(b) Generate the XMSS key pair $(sk^{X}_{i}, pk^{X}_{i})$ at the address corresponding to $\tau_i$.
(c) Sign $r$ with $sk^{X}_{i}$ using $\lambda_i$ as leaf index to produce $\sigma_i$ and update $r$ with $pk^{X}_{i}$.
This strategy decreases the signing procedure complexity by $c \times 2^{h'}(\ell W + 1)$ hash function calls.

Analysis.
While the algorithm prevents faulting c W-OTS + signatures, the new algorithm features a reduced total number of hash function calls which therefore impacts the proportion of vulnerable hash function calls, hence the chance that a random fault produces an exploitable faulty signature.
In a cached XMSS, the total number of hash function calls is: #TotalX = 2 h ′ −1 − 1. This leads to a new grand total of hash function calls in the SPHINCS + signing procedure: As a result, since a fault in an XMSS below a cached layer is not exploitable anymore, the proportion of hash function calls that lead to an exploitable faulty signature is: where 0 < c < d (P(Expl.) = 0 if c = d). Table 9 shows how the probability that a single random fault is exploitable decreases with c for all SPHINCS + parameter sets. This table shows that the probability that a random fault gives an exploitable faulty signature stays fairly high, especially for the fast variants of SPHINCS + .
In terms of memory, let $C$ denote the total number of W-OTS+ signatures cached. We therefore obtain:
As a W-OTS+ signature consists of ℓ elements of $n$ bytes and since a W-OTS+ public key consists of a single element of $n$ bytes, caching $c$ layers requires $C(\ell + 1)n$ bytes in total. Table 10 shows how the cost of caching layers evolves with $c$ for all SPHINCS+ parameter sets. This table demonstrates that the memory requirements for this countermeasure blow up very early and that only the first few top layers can be cached in practice.
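The growth of the layer cache can be illustrated with a short Python sketch. It assumes that the c top layers contain $\sum_{i=1}^{c} 2^{h' i}$ W-OTS+ instances in total (one XMSS at the top, then $2^{h'}$ times more trees per additional layer), and it applies the $C(\ell + 1)n$ byte count stated above; both the counting assumption and the function names are illustrative.

```python
def keygen_factor(h_prime: int, c: int) -> int:
    """sum_{i=0}^{c} 2^(h' i) = (2^((c+1) h') - 1) / (2^h' - 1)."""
    return (2 ** ((c + 1) * h_prime) - 1) // (2 ** h_prime - 1)

def cached_wots_count(h_prime: int, c: int) -> int:
    """Assumed number of W-OTS+ instances in the c top layers."""
    return sum(2 ** (h_prime * i) for i in range(1, c + 1))

def layer_cache_bytes(h_prime: int, c: int, ell: int, n: int) -> int:
    """C * (ell + 1) * n bytes: one signature plus one public key per entry."""
    return cached_wots_count(h_prime, c) * (ell + 1) * n

if __name__ == "__main__":
    # 256s-like parameters: h' = 8, ell = 67, n = 32.
    for c in range(1, 5):
        print(c, keygen_factor(8, c), layer_cache_bytes(8, c, 67, 32))
```

Each additional cached layer multiplies the memory footprint by roughly $2^{h'}$, which is why only the first few top layers are realistic candidates.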

Caching branches
This strategy consists of caching all the W-OTS+ signatures and public keys in a path during a signing procedure. The cache is dynamic and may need to be updated for each new signature. As reported in [GKPM18], this strategy completely prevents similar fault-based universal forgeries in stateful hash-based signature schemes (such as $\mathrm{XMSS}^{MT}$ [HRB13]). This is because the subtrees involved in stateful schemes provide only a limited number of signatures whose availability is remembered by the signer. Thus, once computed, the signature of a subtree can be retained as long as the subtree is involved in new signatures, at which point it is replaced by the next subtree in line. This prevents faulty recomputations of the signatures by caching only one W-OTS+ per layer. This section shows that applying the same idea to SPHINCS+ is ineffective, even when multiple W-OTS+s per layer are cached.
Algorithms. The countermeasure consists of adding a cache of size $C_l$ to each layer $0 \le l < d$ of XMSSs, where $C_l \le 2^{h' l}$ denotes the number of W-OTS+ signatures and public keys that can be stored in the cache at layer $l$. The new XMSS signing procedure therefore signs an $n$-byte digest msg at leaf index $1 \le \lambda \le 2^{h'}$ as follows:
1. Check if the W-OTS+ signature at leaf index $\lambda$ is in the cache:
• On cache hit, read the signature $\sigma^{W}_{\lambda}$ and $pk^{W}_{\lambda}$ from the cache.
• On cache miss:
(a) If the cache is full, evict the least recent signature.
(b) Use $sk^{X}_{\lambda}$ to produce $pk^{W}_{\lambda}$ and $\sigma^{W}_{\lambda}$, the W-OTS+ signature of msg.
(c) Put $\sigma^{W}_{\lambda}$ and $pk^{W}_{\lambda}$ in the cache.
2. Compute the authentication path $\mathrm{auth}_{\lambda}$ starting from $pk^{W}_{\lambda}$ (using cached W-OTS+ public keys when accessible).
3. Return $\sigma^{X} = (\sigma^{W}_{\lambda}, \mathrm{auth}_{\lambda})$.
When all caches are filled, this strategy reduces the XMSS signing procedure complexity by an average of $\sum_{l=0}^{d-1} 2^{h'}(\ell W + 1)(C_l / 2^{h - h' l})$ hash function calls. We suppose that all caches are empty at device startup.
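The cache logic of step 1 can be sketched in Python as follows. The sketch assumes a least-recently-used eviction policy, and `compute_wots` is a hypothetical callable standing for the actual W-OTS+ key generation and signing at a given leaf. The class additionally records which leaves had to be computed more than once, since these recomputations are exactly what the fault attack needs.

```python
from collections import OrderedDict

class WotsBranchCache:
    """Per-layer cache of (signature, public key) pairs with LRU eviction."""

    def __init__(self, capacity: int, compute_wots):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.compute_wots = compute_wots  # hypothetical: leaf -> (sig, pk)
        self.entries = OrderedDict()
        self.seen = set()        # leaves computed at least once
        self.recomputed = set()  # leaves computed more than once

    def sign_leaf(self, leaf_index: int):
        if leaf_index in self.entries:            # cache hit: no hashing at all
            self.entries.move_to_end(leaf_index)
            return self.entries[leaf_index]
        if leaf_index in self.seen:               # cache miss on an evicted leaf
            self.recomputed.add(leaf_index)
        self.seen.add(leaf_index)
        if len(self.entries) >= self.capacity:    # (a) evict the oldest entry
            self.entries.popitem(last=False)
        value = self.compute_wots(leaf_index)     # (b) compute sig and pk
        self.entries[leaf_index] = value          # (c) store them
        return value
```

For example, with a capacity of 2, the query sequence 1, 2, 1, 3, 2 evicts leaf 2 when leaf 3 arrives, so the final query recomputes the W-OTS+ at leaf 2 and exposes it to a second, possibly faulty, computation.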

Analysis.
As not all branches of the hypertree can realistically be cached, and in order for the countermeasure to be effective, we suppose that a significant fraction of a single layer is cached. We furthermore focus on this significantly cached layer, as the layers above are necessarily fully cached while the layers below are only marginally covered.
A faulty signature is exploitable if the fault hits a layer for which the corresponding W-OTS + signature is uncached. As the cache is dynamically filled, the probability of a cache miss depends on the number of distinct signatures visited after M queries to the signing procedure.
Let $D_l$ denote the number of distinct visited W-OTS+ signatures in layer $0 \le l < d$ after $M$ queries, and $N = 2^{h - h' l}$ the total number of W-OTS+ signatures in that layer. Then, the distribution of $D_l$ is an instance of the occupancy problem [Fel67]:
Now, suppose that a total of $D_l \le 2^{h - h' l}$ signatures are cached at each layer $0 \le l < d$ (after a certain number $M$ of queries). Then, as before, the probability that a fault leads to an exploitable faulty W-OTS+ signature is derived by counting the number of vulnerable hash function calls in the procedure. However, in this case, the totals of hash function calls at all layers behave as random variables which depend on the cache status of every leaf in each XMSS. So, instead of deriving the exact totals, we evaluate the following heuristic:
where E(#Expl.) denotes the average number of hash function calls that lead to an exploitable faulty signature when faulted, and E(#Total) the average total number of hash function calls in a SPHINCS+ signing procedure.
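For small layers, the occupancy distribution of $D_l$ can be evaluated exactly. The Python sketch below uses the classical inclusion-exclusion form of the occupancy problem together with the closed-form mean $N(1 - (1 - 1/N)^M)$, under the assumption of uniform and independent queries; the function names are illustrative.

```python
from fractions import Fraction
from math import comb

def prob_distinct(n: int, m: int, j: int) -> Fraction:
    """P(exactly j distinct W-OTS+ signatures are visited after m queries)."""
    inner = sum((-1) ** i * comb(j, i) * Fraction(j - i, n) ** m
                for i in range(j + 1))
    return comb(n, j) * inner

def expected_distinct(n: int, m: int) -> Fraction:
    """E[D] = N * (1 - (1 - 1/N)^M)."""
    return n * (1 - Fraction(n - 1, n) ** m)

def expected_miss_probability(n: int, m: int) -> float:
    """Average probability that the next uniform query was never visited."""
    return float(1 - expected_distinct(n, m) / n)

if __name__ == "__main__":
    print(float(expected_distinct(256, 171)), expected_miss_probability(256, 171))
```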
Starting with the average total of hash function calls, notice that only the XMSS signing procedure was changed. Supposing that the cache is uniformly filled, we obtain: P(Cache miss at layer l)(ℓW + 1)
The average total of vulnerable hash function calls is determined by the average total number of hash function calls in each layer of the SPHINCS+ structure. At each layer, such a number now depends on the number of W-OTS+ cached on the layer, as well as the number of W-OTS+ cached on the layer above. Again, supposing that the cache is uniformly distributed, we obtain:
Table 11 shows how the probability that a random fault is exploitable decreases with $b$, the number of cached branches, for all SPHINCS+ parameter sets, supposing that all the caches are filled to capacity (i.e., $D_l = \min(b, 2^{h - h' l})$, so after a sufficiently large number of queries $M$ has been made). As with caching layers, since the total number of hash function calls in the entire signing procedure decreases with the number of vulnerable hash function calls, the proportion of exploitable faulty signatures stays fairly high, especially for the fast variants of SPHINCS+. Note however that such a countermeasure still leaves fewer hash function calls vulnerable than an unprotected SPHINCS+.
Since the universal forgery requires at least two recomputations of the same W-OTS+ signature, we study the number of queries before a W-OTS+ signature needs to be recomputed (i.e., two cache misses for the same W-OTS+). We solve this problem with a Markov chain (see, e.g., [GS97] for a reference on the methodology) as shown in Figure 6. The corresponding transition matrix $P = (p_{i,j})$ is defined as follows (for $0 \le i, j \le N + 2$):
The fundamental matrix that counts the average number of discrete steps spent in each state is computed as follows:
where $I$ is the $(N + 1) \times (N + 1)$ identity matrix, and $Q$ is the $(N + 1) \times (N + 1)$ submatrix of $P$ without the last column and row.
As we start with the cache being empty, the expected number of queries $M$ before a W-OTS+ is recomputed is given by summing the first row of the fundamental matrix:
Figure 6: Markov chain representing the transitions from the cache being empty to any W-OTS+ being recomputed. The states (other than "Recomp.") count the number of cache misses without recomputation.
In terms of memory, let $C$ denote the total number of W-OTS+ signatures cached when $b$ branches are fully cached. As $C_l \le 2^{h' l}$, we have that:
As with caching layers, a W-OTS+ signature consists of ℓ elements of $n$ bytes and a W-OTS+ public key consists of a single element of $n$ bytes, so caching $b$ branches requires $C(\ell + 1)n$ bytes in total. Table 13 shows the cost of caching various numbers of branches for all SPHINCS+ parameter sets. The memory requirements for this countermeasure blow up very early, so only the first few top layers are expected to be covered in practice.
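The expected number of queries before a recomputation can be sketched in Python without building the fundamental matrix explicitly. The sketch assumes the following structure for the chain: state i counts the distinct W-OTS+ computed so far, queries are uniform over the N addresses, the cache holds min(i, C) of them, a query to a cached address leaves the state unchanged, a query to a fresh address moves to state i + 1, and a query to one of the i − C evicted addresses reaches the absorbing "Recomp." state. Solving the resulting first-step equations backwards is equivalent to summing the first row of $(I - Q)^{-1}$.

```python
def expected_queries_before_recomputation(n: int, cache_size: int) -> float:
    """Expected number of uniform queries until some W-OTS+ is recomputed.

    Assumed transitions from state i (distinct W-OTS+ computed so far):
      stay    with probability min(i, C) / n   (cache hit)
      i + 1   with probability (n - i) / n     (first computation)
      absorb  with probability max(i - C, 0) / n (recomputation)
    """
    if not 0 <= cache_size < n:
        raise ValueError("a cache covering the whole layer never recomputes")
    expected = 0.0  # placeholder beyond state n, unused since P(fresh) = 0 there
    for i in range(n, -1, -1):
        stay = min(i, cache_size) / n
        fresh = (n - i) / n
        # E[i] = 1 + stay * E[i] + fresh * E[i + 1]
        expected = (1.0 + fresh * expected) / (1.0 - stay)
    return expected

if __name__ == "__main__":
    print(f"{expected_queries_before_recomputation(256, 171):.2f}")
```

With N = 256 and a cache of 171 entries, the sketch returns approximately 318.09 queries.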

Practical experiments
The following section aims to experimentally verify the fault attack as described in Section 3, the analysis of the fault attack from Section 4, and the analysis of the caching countermeasures from Section 5.

Setup
Hardware. As the fault attack does not require sophisticated glitching technology, our proof of concept uses the ChipWhisperer framework to perform experiments, which includes:
• The ChipWhisperer-Lite Level 2 starter kit.
The DUT is configured to run at its maximal clock frequency (i.e., 180 MHz).
Software. We attack the reference implementation of SPHINCS+ from [FKN + 22], which was slightly adapted to run on the Cortex-M4 of the DUT. The instance attacked is sphincs-shake-256s-robust, which is claimed to achieve the maximal theoretical security guarantees. The hash function SHAKE was instantiated with a portable software implementation. For practical purposes, the software was further modified to limit the signing procedure to the computation of a single layer. As a result, the software uses the W-OTS+ key pair of an XMSS at a fixed layer $0 < l^* < d$ to sign the XMSS root at layer $l^* - 1$ addressed by a given index. The output signature consists of the W-OTS+ signature along with the authentication path in the XMSS of layer $l^* - 1$.
The laptop communicates with the DUT through UART and the protocol is implemented using ChipWhisperer's simpleserial library. The DUT can be commanded to: • Program the SPHINCS + secret and public seeds sk 1 and pk 2 . • Given an address, compute the W-OTS + signature and the authentication path of the XMSS at layer l * − 1. • Retrieve the bytes of the last W-OTS + signature and authentication path computed.

Fault injection.
To collect faulty signatures, the ChipWhisperer is used to inject a glitch in the system clock of the DUT. We do not synchronize the glitch injection with a trigger signal, as we do not need to hit a precise instruction to collect exploitable signatures. Instead, the glitch is manually injected after a (progressive) delay that follows the communication with the DUT.
The glitch characteristics were explored experimentally to favor faulty signatures. Using a width of 20 samples and a clock offset of −4 samples, we report ≈ 1/3 of output signatures to be faulty (so ≈ 2/3 of valid outputs).

Experiment 1: randomized + cached layer
In the first experiment, we simulate the layer caching countermeasure (see Section 5) by pretending that all the W-OTS + signatures on the last layer are cached. In practice, such a cache would amount to 0.55 MB of ROM. The experiment therefore aims to show the feasibility of an attack on the second last layer (i.e., l * = d − 2 = 6).
The experiment protocol to query a signature goes as follows: 1. The laptop sends to the DUT three bytes that correspond to the XMSS address at layer l * − 1, i.e.: • τ l * −1 = the first two bytes sent.
2. The DUT computes: (a) The authentication path of the XMSS at layer l * − 1 and tree index τ l * −1 , starting from the leaf index λ l * −1 . (b) The root r of the XMSS at layer l * − 1 and tree index τ l * −1 . (c) The W-OTS + signature of r, using the W-OTS + key pair from the XMSS at layer l * , tree index τ l * , and leaf index λ l * , where: • τ l * = the first byte of τ l * −1 .
3. The laptop then retrieves the W-OTS + signature and authentication path.
The DUT takes around 79 seconds to compute a single XMSS authentication path and W-OTS + signature, during which the clock glitch is blindly injected. We conduct N = 5 trials where a single trial consists of repeating the above with a fixed SPHINCS + secret seed to collect 1,024 potentially faulty signatures.
Results. The faulty signature collection is successful across all trials, as a W-OTS+ is always found to be compromised at the end of the collection. Table 14 reports the types of signatures collected during the trials, which were identified by recomputing the correct W-OTS+ signature and authentication path from the programmed secret seed.
Table 15 reports the results related to the universal forgery. Given the analysis from Section 4 and using $M_f \approx 1024/3$ and $M_v \approx 2 \cdot 1024/3$, we have that a W-OTS+ signature is compromised with a probability of 0.5877 using only faulty signatures, and of 0.9714 using both valid and faulty signatures. The maximum load is expected to be 1.59 + 0.01. On average, the probability that the grafting step is successful with two different W-OTS+ signatures is $2^{-34.85}$. All these numbers agree with the ones obtained in practice.

Conclusion.
The experiment has demonstrated that despite the fact that the layer presents $2^{16}$ signatures, as few as $2^{10}$ signature queries with a fault probability of ≈ 1/3 are enough to compromise at least one W-OTS+ and, therefore, mount a SPHINCS+ universal forgery.

Experiment 2: randomized + cached branches
In the second experiment, we simulate the branch caching countermeasure (see Section 5) by implementing an internal cache of C addresses for which we pretend that the corresponding W-OTS + are transmitted without recomputation. When requesting a W-OTS + at a certain address, the computation is triggered only if the given address was not previously cached.
The experiment aims to show that an attack is possible even when a significant portion of the layer is cached. For practical purposes, we target the last layer (i.e., $l^* = d - 1 = 7$) and use a cache of size $C = 171$ to cover two thirds of the $2^{h'} = 256$ possible addresses. In theory, such a cache would amount to 2.93 MB of RAM.
At the beginning of the experiment, the DUT's cache is empty. The experiment protocol to query a signature goes as follows: 1. The laptop sends to the DUT two bytes that correspond to the XMSS address at layer l * − 1, i.e.: • τ l * −1 = the first byte sent.
2. If τ l * −1 is in the DUT's cache, then the DUT computes nothing and the protocol stops here.
3. Else, if τ l * −1 is not cached, then the DUT saves τ l * −1 in the cache (after evicting the least recent address cached if the cache is full), and computes: (a) The authentication path of the XMSS at layer l * − 1 at tree index τ l * −1 , starting from the leaf index λ l * −1 . (b) The root r of the XMSS at layer l * − 1 and tree index τ l * −1 . (c) The W-OTS + signature of r, using the W-OTS + key pair from the XMSS at layer l * , tree index τ l * , and leaf index λ l * , where: 4. The laptop then retrieves the W-OTS + signature and authentication path.
On a cache miss, the DUT takes around 79 seconds to compute a single XMSS authentication path and W-OTS + signature, during which the clock glitch is blindly injected. The glitch is not injected on a cache hit. We conduct a total of N = 10 trials where a single trial consists of repeating the above with a fixed SPHINCS + secret seed to collect 512 potentially faulty signatures.
Results. The faulty signature collection is successful across all trials, as a W-OTS+ is always found to be compromised at the end of the collection. Table 16 reports the types of signatures collected during the trials, which were identified by recomputing the correct W-OTS+ signature and authentication path using the programmed secret seed.
Table 17 reports the results related to the universal forgery. Given the analysis from Section 5, the expected number of queries before a W-OTS+ is recomputed is 318.09. Using a probability of successful fault injection of 1/3, a W-OTS+ is successfully compromised upon recomputation with a probability of $1 - (1 - 1/3)^2 = 5/9 \approx 0.5556$. This number agrees with the ones obtained in practice.
Conclusion. The experiment has demonstrated that despite the fact that two thirds of the attacked layer are cached, as few as $2^{9}$ signature queries with a fault probability of ≈ 1/3 are enough to compromise at least one W-OTS+ and, thus, mount a SPHINCS+ universal forgery.

Discussion & Conclusion
In this paper, a refined fault attack against SPHINCS+ that is less restrictive than the original attack from Castelnovi, Martinelli, and Prest in [CMP18] has been presented. The complexity of the attack in terms of required queries, hashes, and success probability has also been scrupulously analyzed. Finally, the effectiveness of countermeasures based on caching both layers and branches has been shown to be underwhelming, a result which was experimentally verified.
The main takeaway of the current analysis is that SPHINCS+ is extremely fragile against faults. As Section 4 shows, a single unconstrained corruption of almost any computation has a catastrophic impact on the security guarantees of all SPHINCS+ parameter sets. This amounts to millions of hash function calls that need to be carried out faultlessly in order to sign a single message, a number that does not account for other subroutines (such as, e.g., the checksum in W-OTS+) which are at least equally vulnerable.
While the other post-quantum signature algorithms selected by NIST in 2022 are also susceptible to fault attacks, this vulnerability makes SPHINCS+ the most sensitive candidate to faults. For example, Bruinderink and Pessl have demonstrated in [BP18] that the lattice-based signature scheme CRYSTALS-Dilithium is also vulnerable to a universal forgery using an equivalent fault model. However, the attack on CRYSTALS-Dilithium can only be mounted when an adversary obtains the valid and faulty signatures of the same message, while SPHINCS+ is vulnerable even when the device signs different and uncontrolled messages. Additionally, while the authors of [BP18] suggest that verifying signatures or randomization can serve as effective countermeasures against differential fault attacks on CRYSTALS-Dilithium, both of these approaches have been shown to be ineffective when applied to SPHINCS+. The current attacks against Falcon (another lattice-based signature scheme chosen by NIST) only work when the faults result in an early abort and zeroing of values, which requires a higher precision and more capabilities than the fault model considered in this paper (see [MHS + 19]).
Such a fragility needs to be taken seriously, as faults are reported to naturally happen in conventional hardware such as, e.g., in DRAM. For instance, Schroeder, Pinheiro, and Weber have reported 25,000 to 70,000 errors in DRAM per billion device hours per MBit in Google's 2009 fleet [SPW11]. As a result, with long enough deployments of SPHINCS + on standard computers, the fault attack is eventually going to affect real-world users.
As ordinary hardware cannot be fully trusted to protect against faults, and since faults can also be maliciously injected, a proper countermeasure that entirely prevents the fault attack is preferable. However, the problem is not obvious to solve, as the universal forgery exploits the fact that the signing procedure recomputes one-time signatures; a core feature of the SPHINCS family that makes the scheme practical and stateless. Yet, as long as one-time signatures are being recomputed on the fly, the risk of reusing a one-time key pair to sign an unexpected message will always be present (which, in practice, is accomplished with a fault injection). While this problem is solved in stateful schemes such as XMSS MT by caching the relevant one-time signatures, Section 5 shows that the same countermeasure fails to properly protect SPHINCS + .
Since the threat of a fault can never be completely eliminated, the current best solution to protect the signature scheme against accidental and intentional faults is redundancy, an observation that is shared by others (see [CMP18, ALCZ20]). Redundancy consists of recomputing the same signature multiple times (ideally, with different implementations) and aborting the procedure in case a mismatch is detected. Even though parallelizable, this solution at least doubles the signing time, which strikes a huge blow to the performance of a scheme that was already lacking in this respect in the original submission. Specially protected implementations on the hardware level, as recommended in the SPHINCS+ specifications [HBD + 20], may also offer adequate protection against faults but would require fault-protection mechanisms not only in the hash function implementation, but also in the other subroutines of the scheme, as well as in the device memory.
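A redundancy check of this kind can be sketched in a few lines of Python. In the sketch, `sign_fn` is a hypothetical callable wrapping the signing routine; it is assumed to be deterministic for a given message, which for randomized signing means that the same randomness must be reused across the recomputations. The toy signer in the usage example is an HMAC and only stands in for a real SPHINCS+ implementation.

```python
import hashlib
import hmac

class FaultDetected(Exception):
    """Raised when redundant signature computations disagree."""

def redundant_sign(sign_fn, message: bytes, rounds: int = 2) -> bytes:
    """Compute the signature `rounds` times and release it only if all match."""
    if rounds < 2:
        raise ValueError("redundancy needs at least two computations")
    signatures = [sign_fn(message) for _ in range(rounds)]
    first = signatures[0]
    if not all(hmac.compare_digest(first, other) for other in signatures[1:]):
        # Never output any of the candidate signatures on a mismatch.
        raise FaultDetected("signature mismatch, aborting")
    return first

if __name__ == "__main__":
    def toy_sign(msg: bytes) -> bytes:  # stand-in for a deterministic signer
        return hmac.new(b"toy-secret-key", msg, hashlib.sha256).digest()

    print(redundant_sign(toy_sign, b"firmware image").hex())
```

Note that two identical faults would still pass the comparison, which is one reason why running the recomputations on different implementations is preferable.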
In conclusion, the results of this paper urge all real-world deployments of SPHINCS+ to come with redundancy checks, even if the use case is not prone to faults (e.g., firmware updates). Unless an adversary can query the signature for any message, randomized signing may be disabled, as such a measure is not a reliable way to prevent the fault attack. Verification, on the other hand, is still recommended, as non-verifiability (even though unlikely) implies the occurrence of a fault.
Future work. The results of this paper call for novel countermeasures that make SPHINCS+ inherently resistant to fault attacks. As argued above, such a solution should avoid the accidental or intentional recomputation of one-time signatures, which will likely necessitate a new way of performing hash-based signatures. For instance, an ambitious reader might come up with a solution that replaces the one-time signatures in SPHINCS+ with one-message signatures which, if such a primitive makes sense, might even lead to an entirely new scheme. Other solutions that, for instance, make faulty signatures always non-verifiable would also be a desirable step forward, since a signing device could then at least block bad signatures by running the verification procedure on the signatures it produces.
Aside from researching countermeasures that make the scheme resistant to faults, investigating countermeasures that make the scheme resilient to faults could be of equal interest. A fault-resilient countermeasure does not prevent faulty signatures from being collected, but prevents them from being exploited by hindering at least one step of the universal forgery. While preventing secret extraction or tree grafting would be difficult to achieve without significantly impacting the performance of the signing procedure (e.g., by replacing the one-time signatures with few-time signatures), a countermeasure that makes path seeking hard may prove effective. Such a direction is left as an open problem.
Lastly, regarding the offensive side of the attack, as the current work is limited to faulting the hash functions, deriving similar attacks by faulting other subroutines of the scheme may lead to equally critical forgeries. Tampering with the control flow of a SPHINCS+ software implementation to force one-time signatures to sign unexpected messages would also be an interesting direction to consider. Finally, differential fault attacks that recover secret values are yet another avenue to explore.