Post-Quantum Authenticated Encryption against Chosen-Ciphertext Side-Channel Attacks

Over the last years, the side-channel analysis of Post-Quantum Cryptography (PQC) candidates in the NIST standardization initiative has received increased attention. In particular, it has been shown that some post-quantum Key Encapsulation Mechanisms (KEMs) are vulnerable to Chosen-Ciphertext Side-Channel Attacks (CC-SCA). These powerful attacks target the re-encryption step in the Fujisaki-Okamoto (FO) transform, which is commonly used to achieve CCA security in such schemes. To sufficiently protect PQC KEMs on embedded devices against such a powerful CC-SCA, masking at increasingly higher order is required, which induces a considerable overhead. In this work, we propose to use a conceptually simple construction, the EtS KEM, that alleviates the impact of CC-SCA. It uses the Encrypt-then-Sign (EtS) paradigm introduced by Zheng at ISW '97 and further analyzed by An, Dodis and Rabin at EUROCRYPT '02, and instantiates a post-quantum authenticated KEM in the outsider-security model. While the construction is generic, we apply it to the CRYSTALS-Kyber KEM, relying on the CRYSTALS-Dilithium and Falcon signature schemes. We show that a CC-SCA-protected EtS KEM version of CRYSTALS-Kyber requires less than 10% of the cycles required for the CC-SCA-protected FO-based KEM, at the cost of additional data/communication overhead. We additionally show that the cost of protecting the EtS KEM against fault injection attacks, necessary due to the added signature verification, remains negligible compared to the large cost of masking the FO transform at higher orders. Lastly, we discuss relevant embedded use cases for our EtS KEM construction.


Introduction
Over the years, a range of efficient and secure instantiations of cryptographic primitives have been established. In particular for asymmetric cryptography, RSA and ECC are the dominating schemes in practice. However, with the advent of quantum computers the established solutions will no longer provide the desired security. Shor [Sho97] showed that their underlying hardness assumptions can be efficiently broken using a sufficiently powerful quantum computer. To prepare for this threat, the National Institute of Standards and Technology (NIST) has launched a standardization effort for cryptography resistant against quantum computers [Nat]. The goal is to select cryptographic algorithms that perform well in the considered performance metrics, while withstanding any known quantum attack threat. These Post-Quantum Cryptography (PQC) schemes and their implementations have become an active area of research in recent years.
One of the use cases where post-quantum cryptography is of interest for embedded devices is secure (firmware) update. If the secure update functionality of a device is not post-quantum secure, all functionality updated with it, including updates to post-quantum cryptography, cannot be trusted since the update might have been compromised by a quantum adversary. One way a secure update can be performed classically is by performing an ECC- or RSA-based key exchange to agree on a symmetric key, and then sending the update in a second, symmetric phase. The update can be made post-quantum secure by switching the key exchange out for a PQC KEM, and ensuring the second phase utilizes a symmetric cipher of sufficient post-quantum security. While the latter is straightforward, the former poses a great challenge for implementations on constrained devices: the keys are larger, or the performance is lower, especially when physical attacks are in scope.
Indistinguishability against chosen-ciphertext attacks (IND-CCA1) and against adaptive chosen-ciphertext attacks (IND-CCA2) are common security notions for (post-quantum) cryptographic schemes [RS91]. They ensure that the ciphertext does not leak information on the encrypted message or the secret key when an attacker has access to a decryption oracle for chosen ciphertexts. A slightly weaker notion is that of indistinguishability against chosen-plaintext attacks (IND-CPA), where the adversary instead has control over the plaintext and access to an encryption oracle.
In the embedded context, although a post-quantum cryptographic scheme can be IND-CPA/CCA secure, this alone does not provide sufficient security. The implementation on a constrained device is an attractive target for physical adversaries that can either passively measure side-channel information or actively disturb the computation to extract sensitive information. Several post-quantum constructions are particularly vulnerable to side-channel attacks that exploit specifically chosen ciphertexts to amplify the observed leakage. This approach, denoted in the following as Chosen-Ciphertext Side-Channel Analysis (CC-SCA), has been shown to be a severe threat even to schemes that have countermeasures added to thwart side-channel adversaries [UXT+22].
The core issue for these schemes is the use of the so-called Fujisaki-Okamoto (FO) transform [FO99]. It allows the creation of an IND-CCA2-secure scheme from its CPA-secure counterpart. The transform adds adequate resistance against a black-box adversary, but does not account for leakage during its computation. In fact, its computation consists of multiple steps processing sensitive values, which gives a side-channel adversary numerous attack avenues. Countermeasures against side-channel attacks on the FO transform, like masking, therefore require many shares and are very costly with regard to performance [ABH+22].
There are alternatives to the FO transform, like the zero-knowledge proof techniques presented in [BMV17]. In their solution, each message is encrypted using two independent cryptosystems, and both ciphertexts are sent along with a non-interactive zero-knowledge proof that they correspond to the encryption of the same message under different keys. No such solution is used for known PQC schemes since its instantiation is more challenging, less generic and presumably more expensive than the FO transform. D'Anvers, Orsini and Vercauteren developed alternative ciphertext transformations to the FO transform for lattice-based encryption in [DOV21]. These alternatives are based on error term checking and do not apply to schemes such as NewHope, Kyber and Saber. In the symmetric setting, one way to achieve CCA security with protection against leakage is to use a Message Authentication Code (MAC). The MAC can be computed after encryption with, e.g., AES under a pre-shared key, and can be used by the receiver to verify the validity of the ciphertext before decryption. In addition, the MAC computation with the shared symmetric key requires side-channel protection. However, in the asymmetric setting of post-quantum cryptography, for many use cases, there is no pre-shared symmetric key available to perform this authentication.
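The symmetric verify-before-decrypt idea mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions: the text mentions AES, but the Python standard library has no AES, so a SHAKE-256 keystream stands in for the cipher; the function names, the nonce handling and the use of HMAC-SHA-256 are all hypothetical choices made for this sketch.

```python
import hashlib
import hmac

def stream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Stand-in stream cipher (assumption): XOR with a SHAKE-256 keystream."""
    keystream = hashlib.shake_256(key + nonce).digest(len(data))
    return bytes(a ^ b for a, b in zip(data, keystream))

def encrypt_then_mac(k_enc: bytes, k_mac: bytes, nonce: bytes, msg: bytes):
    """Sender: encrypt first, then authenticate the nonce and ciphertext."""
    ct = stream_xor(k_enc, nonce, msg)
    tag = hmac.new(k_mac, nonce + ct, hashlib.sha256).digest()
    return ct, tag

def verify_then_decrypt(k_enc: bytes, k_mac: bytes, nonce: bytes,
                        ct: bytes, tag: bytes):
    """Receiver: check the tag before any decryption takes place."""
    expected = hmac.new(k_mac, nonce + ct, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return None  # reject: the decryption key is never used
    return stream_xor(k_enc, nonce, ct)
```

Note that the tag computation itself handles the secret key `k_mac` on attacker-supplied input, which is why the text points out that the MAC still needs side-channel protection.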
In this work, we propose an alternative approach based on signcryption, more precisely on the Encrypt-then-Sign (EtS) paradigm. Our contributions are as follows:
• We show that the resulting scheme, which we call EtS KEM, has improved resistance against side-channel attacks. This is achieved by replacing the FO transform, which manipulates a large number of sensitive variables, by a signature verification that only uses public data. This improvement comes with a data overhead of one PQC signature.
• We discuss the relevance of the EtS-based scheme for embedded use cases. The EtS construction makes security assumptions that do not work for all use cases. We discuss these and show that, most notably, EtS KEM can be applied to secure (over-the-air) update and consider other potential applications.
• We apply the scheme to the CRYSTALS-Kyber PKE and KEM [ABD+19] to illustrate and analyze our proposal. We show that in the EtS KEM fewer components require the application of costly side-channel countermeasures such as masking compared to the standard FO-transform-based KEM (FO KEM). This decreases the cost of CC-SCA protection.
• Finally, we give performance estimates for the EtS KEM implementation compared to the FO KEM when combining CRYSTALS-Kyber with either CRYSTALS-Dilithium or Falcon. We show that when 3 or more masking shares are required (which is likely the case for standard microcontrollers), the cost of the EtS KEM is less than 10% of that of the FO KEM. This includes the impact of signature recomputation, an ad hoc countermeasure against fault injection attacks, and added SPA countermeasures.

Background
In this section, we introduce the notation used and the relevant definitions of security notions. We describe the Kyber KEM [ABD+19] since we use it in the following descriptions, illustrations and discussions, but our proposal can be adapted to other PQC KEMs using the FO transform to achieve CCA security, such as Saber [DKRV18].

Notation
We denote the ring of integers modulo $q$ by $\mathbb{Z}_q$ and the corresponding ring of polynomials $\mathbb{Z}_q[X]/(X^n + 1)$ by $R_q$. We use lowercase letters (e.g., $x$) to denote elements in $R_q$; bold lowercase letters (e.g., $\mathbf{b}$) represent vectors and bold uppercase letters (e.g., $\mathbf{A}$) represent matrices with coefficients in $R_q$. Sampling $x$ according to a distribution $\chi$ is denoted by $x \leftarrow \chi$. Sampling of matrices of polynomials is represented by $\mathbf{X} \leftarrow \chi(R_q^{l_1 \times l_2})$, where all the coefficients of $\mathbf{X}$ are sampled independently from the distribution $\chi$. The uniform distribution is denoted by $\mathcal{U}$. We denote the centered binomial distribution as $\beta_\eta$, for a positive integer $\eta$.
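As an illustration of the sampling notation, the centered binomial distribution can be sketched as the difference of two sums of uniform bits. The function names are hypothetical, and the defaults n = 256 and q = 3329 are the Kyber parameters, assumed here because the text above does not fix them.

```python
import secrets

def sample_cbd(eta: int) -> int:
    """Draw one sample from the centered binomial distribution beta_eta.

    Uses 2*eta uniform bits and returns sum(a_i) - sum(b_i), a value in
    [-eta, eta] with mean 0.
    """
    a = sum(secrets.randbits(1) for _ in range(eta))
    b = sum(secrets.randbits(1) for _ in range(eta))
    return a - b

def sample_poly_cbd(eta: int, n: int = 256, q: int = 3329) -> list[int]:
    """Sample a polynomial of R_q whose n coefficients follow beta_eta.

    Coefficients are stored as representatives in [0, q).
    """
    return [sample_cbd(eta) % q for _ in range(n)]
```

A secret vector would then be a list of such polynomials, one per vector entry.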

Indistinguishability under Chosen-Plaintext Attacks (IND-CPA)
The security of Public Key Encryption (PKE) is defined in the sense of indistinguishability under chosen-plaintext attacks. Formally, security in terms of indistinguishability is presented as a cryptographic game [Sho04, BR06], where a cryptosystem is considered secure if no adversary can win the game with probability significantly greater than that of random guessing. Let $\mathcal{A}$ be a probabilistic polynomial-time adversary that runs in two stages and aims to win the IND-CPA game against the PKE. In the first stage, $\mathcal{A}$ is given access to an encryption oracle Enc() to encrypt an arbitrary (polynomially bounded) number of messages of its choice. In the second stage, $\mathcal{A}$ submits two distinct fresh messages $m_0, m_1$ and gets an encryption $c_b$ of one of them. The adversary's goal is to decide which message $m_b$ is encrypted in the given ciphertext. A PKE is considered IND-CPA-secure if, for all efficient adversaries $\mathcal{A}$, there exists some negligible function negl($n$) of the security parameter $n$ such that the advantage of $\mathcal{A}$ in winning the IND-CPA game is at most negl($n$).
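In the usual game-based notation, the IND-CPA condition can be written as follows. The symbol $\mathrm{Adv}$ and the guess bit $b'$ output by $\mathcal{A}$ are notational assumptions taken from the standard formulation, with $b$ the challenge bit chosen uniformly at random:

$$
\mathrm{Adv}^{\text{IND-CPA}}_{\mathsf{PKE}}(\mathcal{A})
  = \left| \Pr\left[\, b' = b \,\right] - \frac{1}{2} \right|
  \le \mathrm{negl}(n).
$$

An adversary that guesses at random has $\Pr[b' = b] = 1/2$ and hence advantage 0.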

Indistinguishability under (adaptive) Chosen-Ciphertext Attacks (IND-CCA)
The standard security notion for KEMs is indistinguishability under (adaptive) chosen-ciphertext attacks [RS91]. Similarly to IND-CPA, an adversary is given access to an encapsulation oracle Encaps() throughout the attack, such that it can encapsulate an arbitrary number of keys of its choice. In addition, the attacker is given access to a decapsulation oracle Decaps(). IND-CCA security provides stronger security guarantees compared to IND-CPA. A KEM is considered IND-CCA-secure if, for all efficient adversaries $\mathcal{A}$, the advantage of $\mathcal{A}$ in winning the IND-CCA game against the KEM is negligible, i.e., bounded by some negligible function negl($n$) of the security parameter $n$.
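Written out in the standard formulation (the notation $\mathrm{Adv}$, $b$ and $b'$ is assumed here), the adversary receives a challenge $(c^*, K_b)$, where $K_0$ is the key encapsulated in $c^*$ and $K_1$ is uniformly random, and may query Decaps() on any ciphertext other than $c^*$ before outputting a guess $b'$:

$$
\mathrm{Adv}^{\text{IND-CCA}}_{\mathsf{KEM}}(\mathcal{A})
  = \left| \Pr\left[\, b' = b \,\right] - \frac{1}{2} \right|
  \le \mathrm{negl}(n).
$$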

CRYSTALS-Kyber KEM
The PKE of Kyber consists of three operations: key generation, encryption and decryption, given in Algorithms 1, 3 and 2, respectively.

[Algorithm 1: Kyber.CPAPKE key generation. Ensure: public key pk = (seed_A, b), secret key sk = s]
The IND-CPA-secure PKE scheme in the previous section can be converted into an IND-CCA-secure KEM by applying an appropriate transformation. Kyber and many lattice-based KEMs use a post-quantum variant of the Fujisaki-Okamoto (FO) transform [FO99] by Hofheinz, Hövelmanns and Kiltz [HHK17]; however, other transformations could be used to achieve CCA security [BMV17, RS91].
The resulting KEMs consist of a triplet of operations (KeyGen, Encaps, Decaps), given in Algorithms 4, 5 and 6, respectively. The CCA transformation requires access to three hash functions $G$, $H$, $H'$, modeled as random oracles, as well as the PKE scheme CPAPKE = (KeyGen, Enc, Dec). Across such KEMs, the only difference is the instantiation of these functions. In Kyber, the hash functions are instantiated with different symmetric primitives based on the SHA3 standard. The key generation is similar to the one for Kyber.CPAPKE, with the difference that the secret key sk also includes the public key pk, the hash of pk and a secret random seed $z$. During encapsulation, a ciphertext $c$ is returned together with a shared key $K$, where $c$ is obtained by encrypting a random message $m$, sampled from the uniform distribution, and $K$ is derived by hashing together the message, the public key and the ciphertext.

Side-channel security notions
In this section, we introduce the main side-channel security definitions and notions that we use in the remainder of this paper to define the protection profiles for the EtS KEM and the FO KEM.

SPA.
Simple Power Analysis (SPA) analyses a limited number of measurements to extract a secret value. It has been used to attack both symmetric and asymmetric cryptographic primitives, and has been shown to be particularly powerful for some post-quantum schemes exploiting chosen ciphertext leakage [XPRO20]. In its most extreme variant, the attack is limited to one single trace and specific attack strategies are employed to maximize the extraction of sensitive information [KPP20]. Note that in some scenarios, it is possible to repeat the measurement with the same inputs and intermediates. This is used to average the traces and significantly reduce the noise in the measurements. So while an SPA attacker might have access to a large number of traces, the number of distinct leakages is still limited. Therefore, countermeasures against this type of attack usually do not rely on masking, but rather on more cost-efficient shuffling [HOM06] or, if possible, exploit parallel leakages [BMPS21]. Note that in the case of CC-SCA on Kyber, the SPA is still very powerful and requires costly protection to achieve the desired security level [ABH+22].
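The effect of averaging repeated measurements can be illustrated with a small simulation. This is a toy model under stated assumptions: a single fixed leakage value with additive Gaussian noise, which is far simpler than real power traces; all names and parameters are hypothetical.

```python
import random
import statistics

def measure(leak: float, sigma: float, rng: random.Random) -> float:
    """One noisy measurement of a fixed leakage value (Gaussian noise)."""
    return leak + rng.gauss(0.0, sigma)

def averaged_measure(leak: float, sigma: float, n_traces: int,
                     rng: random.Random) -> float:
    """Average n_traces measurements taken with identical inputs."""
    return statistics.fmean(measure(leak, sigma, rng) for _ in range(n_traces))

rng = random.Random(1)  # fixed seed so that the experiment is repeatable
single = [measure(3.0, 1.0, rng) for _ in range(2000)]
avg100 = [averaged_measure(3.0, 1.0, 100, rng) for _ in range(2000)]

# Residual noise: roughly sigma for single traces, and roughly
# sigma / sqrt(100) after averaging 100 traces.
noise_single = statistics.stdev(single)
noise_avg = statistics.stdev(avg100)
```

In this model the number of distinct leakage values stays at one regardless of how many traces are collected, which matches the distinction drawn above between the number of traces and the number of distinct leakages.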

DPA.
In contrast to SPA, a Differential Power Analysis (DPA) adversary can measure the leakage of a large number of different intermediate values. This enables very powerful attacks [KJJ99] to extract long-term secret values and, therefore, requires costly protection measures to thwart them. Commonly, masking [CJRR99,PR13] is used, sometimes in combination with other countermeasures, e.g., shuffling, to increase the noise level.
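The principle of masking can be sketched as follows: a sensitive value is split into random shares so that any proper subset of shares is independent of the value. The helper names are hypothetical, and the modulus q = 3329 is the Kyber modulus, assumed here for the arithmetic variant. A real masked implementation computes on the shares without ever recombining them, which is where the cost arises.

```python
import secrets
from functools import reduce
from operator import xor

def mask_boolean(secret: int, n_shares: int, bits: int = 8) -> list[int]:
    """Split `secret` into n_shares Boolean (XOR) shares."""
    shares = [secrets.randbits(bits) for _ in range(n_shares - 1)]
    last = reduce(xor, shares, secret)  # secret XOR all random shares
    return shares + [last]

def unmask_boolean(shares: list[int]) -> int:
    """Recombine Boolean shares (only done at the very end, if at all)."""
    return reduce(xor, shares, 0)

def mask_arithmetic(secret: int, n_shares: int, q: int = 3329) -> list[int]:
    """Split `secret` into n_shares additive shares modulo q."""
    shares = [secrets.randbelow(q) for _ in range(n_shares - 1)]
    last = (secret - sum(shares)) % q
    return shares + [last]

def unmask_arithmetic(shares: list[int], q: int = 3329) -> int:
    """Recombine additive shares modulo q."""
    return sum(shares) % q
```

With `n_shares` shares, an adversary has to combine leakage from all of them, so the number of shares is the knob that trades performance for DPA resistance.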

Leveling.
Several works try to level the protection profile of a target scheme [ABH+22, BBC+20]. Instead of protecting every operation at the maximum level, e.g., with strong DPA countermeasures, the underlying algorithm is analyzed and protected at different levels. The parts that leak about ephemeral secrets, which cannot be targeted with DPA, are only hardened using more cost-efficient SPA countermeasures. This enables more efficient protected implementations, especially for schemes that have been designed with leveling in mind. Azouaoui et al. [ABH+22] have shown that for standard Kyber the benefit of leveling is negligible, as all parts need to be protected with costly countermeasures given the potency of the CC-SCA SPA on the FO transform. In this work, we show that by relying on a public signature verification check, it is possible (for some use cases) to prevent CC-SCA and exploit leveling to significantly speed up protected implementations.

Chosen-ciphertext SCA on the FO transform
The FO transform is a simple and well-suited approach for lattice-based PQC PKE to reach standard CCA security. However, several recent works showed that its use still leaves a very powerful attack vector when physical attacks are considered [RRCB20, XPRO20, UXT+22, NDGJ21]. In the following, we first provide a brief description of the FO transform as used in Kyber. Then, we give a short description of chosen-ciphertext side-channel attacks on the FO transform, which we refer to for conciseness as CC-SCA. Finally, we highlight recent results in the literature assessing and improving the cost of protecting PQC KEM implementations against these attacks [ABH+22, BC22].

Fujisaki-Okamoto transform in Kyber
We illustrate the basic working of the FO transform in Figure 1. The core idea is to check the validity of a decrypted message $m'$ in the decapsulation phase by performing re-encryption. The obtained candidate ciphertext $c'$ is then compared with the original ciphertext $c$. If both ciphertexts are equal, the session key $K$ is derived from the message $m'$, the ciphertext $c$ and the hash of the public key pk; otherwise a pseudo-random string $K = H'(z, H(c))$ is returned. Since the decapsulation never indicates failure explicitly (e.g., by returning a failure symbol ⊥), the rejection of malformed ciphertexts is implicit. The FO transform is used in many PQC schemes because of its simplicity and efficiency.
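The decapsulation flow described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the reference implementation: the CPA-secure encryption and decryption are passed in as callables with hypothetical signatures, and the hash instantiations (SHA3-512 for G, SHA3-256 for H, SHAKE-256 for the final derivation) follow our understanding of Kyber rather than anything fixed in the text.

```python
import hashlib
import hmac
from typing import Callable

def G(data: bytes) -> tuple[bytes, bytes]:
    """Derive a pre-key and the encryption coins (assumed SHA3-512)."""
    digest = hashlib.sha3_512(data).digest()
    return digest[:32], digest[32:]

def H(data: bytes) -> bytes:
    return hashlib.sha3_256(data).digest()

def H_prime(data: bytes) -> bytes:
    return hashlib.shake_256(data).digest(32)

def fo_decaps(sk: bytes, pk: bytes, z: bytes, c: bytes,
              cpapke_enc: Callable[[bytes, bytes, bytes], bytes],
              cpapke_dec: Callable[[bytes, bytes], bytes]) -> bytes:
    """FO decapsulation with re-encryption and implicit rejection."""
    m_prime = cpapke_dec(sk, c)            # candidate message m'
    k_bar, coins = G(m_prime + H(pk))      # pre-key and deterministic coins
    c_prime = cpapke_enc(pk, m_prime, coins)  # re-encryption of m'
    if hmac.compare_digest(c, c_prime):
        return H_prime(k_bar + H(c))       # valid ciphertext
    return H_prime(z + H(c))               # implicit rejection, no error symbol
```

Every line between the decryption and the comparison processes values derived from $m'$, which is what gives a side-channel adversary so many intermediates to observe. A hardened implementation would also replace the `if` with a constant-time selection; the branch is kept here for readability.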

Attack description
While classic chosen-ciphertext attacks are not possible on CCA-secure KEMs thanks to the FO transform, the attacks presented in [RRCB20, XPRO20, UXT+22, NDGJ21] are able to use the side-channel leakage of the FO transform computation to target only the CPA-secure encryption. To do so, the adversary carefully crafts ciphertexts such that one bit of the decrypted message $m'$ depends on a single secret key coefficient. Since this message $m'$ is used as input for the deterministic re-encryption, the adversary then only has to distinguish between an encryption of 0 or 1 given leakage of the re-encryption, which includes a large number of leaking intermediates. The number of traces required for successful attacks on both unprotected and masked implementations is on the order of a few thousand for many PQC KEMs (see Table 7 of Ueno et al. [UXT+22]).

Authenticated key encapsulation against CC-SCA
In the previous section, we have recalled that while the FO transform grants CCA security to PQC KEMs like Kyber, it comes with significant drawbacks with respect to its side-channel security, in particular the great cost that protecting it against such attacks implies.
In this section, we introduce a different construction based on the Encrypt-then-Sign (EtS) method studied by An, Dodis and Rabin and shown to provide CCA security to CPA-secure PKE [ADR02]. We first describe the relevant security notions for the signcryption of [ADR02]. We then introduce the EtS KEM in Section 3.2.1, which is a straightforward application of signcryption to the post-quantum setting. The application areas that benefit from the EtS KEM are discussed in Section 3.2.2. We then discuss the side-channel security of the EtS KEM and how it compares to the standard FO KEM in Section 3.2.3.

The Encrypt-then-Sign paradigm
In this section, we first introduce the relevant security notions necessary to define the security of the Encrypt-then-Sign paradigm. We will end with the security guarantees and theorems relevant for this work.
Signcryption. We denote the sender by S and the receiver by R. We assume S uses signcryption, i.e., a scheme in which a message m is first encrypted by an encryption scheme E and then signed by a signature scheme S as u = SigEnc(m). R can then verify and decrypt with a deterministic de-signcryption algorithm m = VerDec(u). In this setting, beyond the integrity and confidentiality of the message, we would like to protect S's authenticity and R's privacy.
IND-gCCA2-security. In Section 2.3 we discussed the IND-CCA2 security of the Kyber KEM. Generalized CCA2 security (gCCA2) was introduced in [ADR02] (also called 'benign malleability' in [Sho01]) and offers a slightly weaker notion of security. It is defined as having some relation $R$ for which it holds that for distinct ciphertexts $e_1, e_2$, if $R(e_1, e_2) = \text{true}$, then $\mathrm{Dec}(e_1) = \mathrm{Dec}(e_2)$. Such a relation $R$ is called a decryption-respecting relation.
An example is to append a ciphertext with an arbitrary byte, which is ignored during decryption. This cipher is then not CCA2 secure, since the ciphertext can be adapted, but it can be considered secure in almost all use cases of CCA2 encryption, since the adaptation is 'benign'. Note that since the notion of gCCA2 security is a relaxation of CCA2 security, any IND-CCA2 secure encryption scheme is also gCCA2 secure.
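The trailing-byte example can be made concrete with a toy cipher. This is a sketch for illustrating the relation only: the cipher below is symmetric, deterministic and not secure in any formal sense, and all names are hypothetical.

```python
import hashlib

def keystream(key: bytes, length: int) -> bytes:
    return hashlib.shake_256(key).digest(length)

def toy_encrypt(key: bytes, msg: bytes, pad_byte: int = 0) -> bytes:
    """XOR the message with a keystream and append one arbitrary byte."""
    body = bytes(a ^ b for a, b in zip(msg, keystream(key, len(msg))))
    return body + bytes([pad_byte])

def toy_decrypt(key: bytes, ct: bytes) -> bytes:
    """Decrypt while ignoring the trailing byte."""
    body = ct[:-1]
    return bytes(a ^ b for a, b in zip(body, keystream(key, len(body))))

def decryption_respecting(e1: bytes, e2: bytes) -> bool:
    """R(e1, e2): the ciphertexts agree everywhere except the ignored byte."""
    return e1[:-1] == e2[:-1]

key = b"k" * 32
e1 = toy_encrypt(key, b"update", pad_byte=0)
e2 = e1[:-1] + b"\x7f"  # adversary changes only the ignored byte
```

Here `e1` and `e2` are distinct ciphertexts with `decryption_respecting(e1, e2)` true and identical decryptions, so a gCCA2 decryption oracle would refuse to decrypt `e2` when `e1` is the challenge, whereas a strict CCA2 oracle would have to answer.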
UF-NMA/CMA-security. For signature schemes, UnForgeability against No Message Attack (UF-NMA) security describes the notion in which the adversary A attempts to create a forged signature from the scheme's public key without accessing a signing oracle. A slightly different notion is that of UnForgeability against Chosen Message Attack (UF-CMA). In this setting A can make queries to a signing oracle and must forge the signature of a previously unqueried message. UF-CMA and UF-NMA security have been shown to be tightly equivalent in certain settings like deterministic signature schemes [KLS18].
In the context of signature schemes, we can also distinguish weak UF-CMA-security (wCMA) and strong UF-CMA-security (sCMA). The weak case is equivalent to the description above, where an adversary wants to forge the signature of a previously unqueried message. In the strong case, the adversary is deemed successful even when they forge a previously queried message, as long as the signature differs from the queried result.
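The difference between the two winning conditions can be expressed as two small predicates over the list of signing-oracle queries. The function names and the `verify(message, signature)` callable are hypothetical and stand in for an arbitrary signature scheme.

```python
from typing import Callable

Query = tuple[bytes, bytes]  # (message, signature) returned by the oracle

def is_weak_forgery(queries: list[Query], m_star: bytes, sig_star: bytes,
                    verify: Callable[[bytes, bytes], bool]) -> bool:
    """wCMA win: valid signature on a message never sent to the oracle."""
    queried_messages = {m for m, _ in queries}
    return verify(m_star, sig_star) and m_star not in queried_messages

def is_strong_forgery(queries: list[Query], m_star: bytes, sig_star: bytes,
                      verify: Callable[[bytes, bytes], bool]) -> bool:
    """sCMA win: valid (message, signature) pair the oracle never output."""
    return verify(m_star, sig_star) and (m_star, sig_star) not in set(queries)
```

A second valid signature on an already-signed message counts as a forgery only under the strong notion. For EtS this matters because such a re-signed ciphertext is a new input to the de-signcryption oracle that still decrypts to the same message.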
Insider- vs. outsider-security. The third security notion we need is the distinction between insider- and outsider-security. In the outsider-security setting, we assume that $\mathcal{A}$ is privy only to public information, i.e., the public keys of S and R, $pk_S$ and $pk_R$, and has oracle access to the functionalities of S and R. Specifically, they can query (the functionality of) S for a signed encryption $u$ of a chosen message $m$ (i.e., the signcryption oracle computing $u$ = SigEnc($m$)). Similarly, they can query (the functionality of) R by providing a signed encryption $u$ and receiving the result $m$, which could be ⊥ (i.e., the de-signcryption oracle computing $m$ = VerDec($u$)). This setting is called outsider-security because it targets an adversary that is outside of the protocol.
The stronger notion of insider-security also includes the option that A is R or S. It aims to protect S's authenticity (respectively, R's privacy) even in the case that A is using the system as R (respectively, S).
Encrypt-then-Sign (EtS) security. Given these notions, we have the following theorem on the security of signcryption. We see that under security assumptions on the schemes E and S, the signcryption scheme EtS is also secure (in specific security models). In Section 3.2, we will leverage this theorem for secure communication in specific use cases.
Quantum security of EtS. In [CPPS20] the security of signcryption was shown under the extension of the security model to include a quantum adversary. In particular, they show that in the outsider-security setting, the post-quantum CPA security is amplified with EtS if the base signature scheme satisfies slightly stricter security definitions.
Here the pq and q notation denote the CPA/CCA2 security in the quantum setting. Note that in the insider-security setting, pqIND-CPA-security of the PKE does not suffice, and IND-qgCCA security is required.

Scheme description
We first describe the EtS key encapsulation scheme in Figure 2. We refer to the encapsulator as the server and to the decapsulator as the device, to highlight that in the relevant use cases of this scheme, power-based side-channel attacks are only a concern on the embedded device's side. In the following, we detail the steps of the scheme:
• First, a signing keypair $(sk_s, pk_s)$ is generated by the server and the public key is shared with the device. The dashed horizontal line and gray background emphasize that this step can be performed off-line and the public key can be pre-provisioned onto the device. Otherwise, a root certificate is pre-provisioned and (possibly ephemeral) signing keys can be generated by the server and verified by the device given their corresponding certificate.
• Next, the KEM key generation and encapsulation are performed sequentially by the device and the server, respectively. However, in this new construction, the ciphertext $c$ is signcrypted (SigEnc) using the device's public key $pk_d$ and the server's secret key $sk_s$. The ciphertext along with its signature $\sigma$ is transmitted to the device.
• On receiving the ciphertext and its signature, the device starts the verification and decryption process (VerDec). Prior to initiating any decryption using $sk_d$, the device first authenticates the ciphertext's source by verifying its signature $\sigma$ using the server's authenticated public key $pk_s$. If the signature is valid, the ciphertext is decrypted and the shared key is derived from the decrypted message $m$. Otherwise, an implicit rejection is performed as in the original KEM.
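The steps above can be sketched as follows. This is a minimal sketch under stated assumptions: the PKE and signature operations are passed in as callables with hypothetical signatures, and the key derivation `H(m || H(c))` as well as the rejection value `H(z || H(c))` are illustrative choices, since the exact derivation is given in Figure 2 and not in the text.

```python
import hashlib
from typing import Callable

def H(data: bytes) -> bytes:
    return hashlib.sha3_256(data).digest()

def ets_encaps(pk_d: bytes, sk_s: bytes, m: bytes,
               cpapke_enc: Callable[[bytes, bytes], bytes],
               sign: Callable[[bytes, bytes], bytes]):
    """SigEnc on the server: encrypt m under pk_d, then sign the ciphertext."""
    c = cpapke_enc(pk_d, m)
    sigma = sign(sk_s, c)
    return c, sigma, H(m + H(c))  # assumed key derivation

def ets_decaps(sk_d: bytes, z: bytes, pk_s: bytes, c: bytes, sigma: bytes,
               cpapke_dec: Callable[[bytes, bytes], bytes],
               verify: Callable[[bytes, bytes, bytes], bool]) -> bytes:
    """VerDec on the device: verify sigma on c before sk_d is ever used."""
    if not verify(pk_s, c, sigma):
        return H(z + H(c))        # implicit rejection, no decryption performed
    m = cpapke_dec(sk_d, c)       # only reached for authenticated ciphertexts
    return H(m + H(c))
```

The design point is the ordering inside `ets_decaps`: the verification touches only public values, and the long-term secret `sk_d` is processed solely for ciphertexts that the server actually produced, so no re-encryption is needed.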

Security of the EtS KEM.
We see that the presented EtS KEM is a direct application of the EtS scheme of [ADR02]. Therefore, by Theorem 2, if the used PKE is pqIND-CPA-secure and the used signature scheme DS is strongly UF-NMA-secure, then the EtS scheme of Figure 2 is IND-qCCA2-secure and UF-qCMA-secure in the outsider-security model. If a weakly UF-NMA-secure signature scheme is used instead, the EtS scheme can be proven IND-qgCCA2-secure.
Two-user vs. Multi-user setting. It should be noted that the EtS KEM can be extended to a multi-user setting. In this case, the concept of identity needs to be introduced to the scheme to be able to differentiate different actors in the communication. In [ADR02], this is achieved by adding the sender's identity (which can potentially be their public key) to the encrypted message, the receiver identity to the signed message, and having R output ⊥ if the identities are not as expected.
Although we focus in this work on the two-user setting, we remark that the scheme of Figure 2 will suffer a slight performance impact from allowing multiple users. Since the identities need to be included in the ephemeral key K encryption, this will increase the size of the message and therefore in the worst-case increase the encryption time. It might be possible to (non-trivially) include the identities in a manner maintaining the message length, but we leave this aspect and the applications of EtS authenticated post-quantum CCAKEM in the multi-user setting for future research.

Applications
In Section 3.2.1, we presented a pqEtS KEM scheme that is CCA2 secure in the outsider-security model. It offers significant advantages over the 'standard' PQC KEM with regard to side-channel protection, but only works for use cases where the outsider-security model holds. In this section, we discuss some possible application areas.
Secure update mechanism. Secure encrypted updates are critical to ensure that a device is running in a secure manner with optimal performance throughout its lifecycle. Firmware updates are often administered locally via a network, or Over-the-Air (OTA). During the update process an embedded device is in its most vulnerable state; if an adversary is able to compromise the content of the update, it can render the device insecure for the remainder of its lifecycle. Therefore, a secure update protocol is essential.
Different strategies can be taken to securely update a device. The provisioning method, the updater scheme and the underlying cryptography can differ greatly per use case. A high-level view of an update protocol is nevertheless given in Figure 3. To perform the update in a post-quantum secure manner, a general strategy can be to perform a PQC Key EXchange (KEX) or KEM first, to agree on a shared secret key, and then send the encrypted update by way of symmetric cryptography. In many cases, this is much faster than sending the entire update asymmetrically encrypted and signed.
We notice that secure update is an excellent candidate to apply the pqEtS KEM scheme. Firstly, if a KEX/KEM mechanism is used, it needs to be protected against side-channel attacks and in particular against CC-SCA. If not, the device is vulnerable to an adversary tampering with the update and thereby reducing its security. Secondly, the assumptions made in the outsider-security model hold. For a firmware update both parties are trusted, and the necessary digital signature certificates can be provisioned in Step 1 of Figure 3. Therefore, if the pqEtS KEM is applied, Theorem 2 ensures its post-quantum security.
When we apply the pqEtS KEM scheme, it replaces Step 4 in Figure 3. The costly-to-mask CPAPKE.Enc is replaced with a PQC digital signature (DS). This does mean that the size of the initial transaction grows; PQC digital signatures that are well-suited for embedded applications usually have signature sizes of thousands of bytes. However, since the subsequent transaction (the sending of the firmware update) also consists of sending large amounts of data, the DS size is negligible for most use cases. In Section 4, we will show that using the pqEtS KEM scheme in a secure update context can improve performance by at least a factor of 10 (for 3 shares and upwards).
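To give a feeling for the data overhead, the following sketch relates the added signature to the total amount of data sent in an update. The byte sizes are those of the round-3 parameter sets Kyber768, Dilithium2 and Falcon-512 as we recall them from the respective specifications, and the 256 KiB firmware image is a purely hypothetical example; none of these figures are taken from the text.

```python
# Sizes in bytes (assumed round-3 parameter sets, see lead-in).
KYBER768_CIPHERTEXT = 1088
SIGNATURE_SIZES = {"Dilithium2": 2420, "Falcon-512": 666}

def relative_overhead(signature_bytes: int, firmware_bytes: int) -> float:
    """Signature size relative to the KEM ciphertext plus firmware image."""
    return signature_bytes / (KYBER768_CIPHERTEXT + firmware_bytes)

firmware = 256 * 1024  # hypothetical firmware image size
for name, size in SIGNATURE_SIZES.items():
    share = relative_overhead(size, firmware)
    print(f"{name}: +{size} bytes, {share:.2%} of the transferred data")
```

For small payloads the picture changes: with only the KEM ciphertext in flight, the same signature would dominate the message size.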
A concrete example of such an update protocol is the SUIT (Software Updates for Internet of Things) solution [MTBM21]. The PQC considerations related to SUIT without firmware encryption were recently analyzed by Banegas et al. [BZH+21]. The version of SUIT with firmware encryption [THM22] uses an HPKE (Hybrid Public Key Encryption) [BBLW22] to establish a shared symmetric key to encrypt/decrypt the firmware image. This HPKE can be instantiated with a PQC CCA-secure KEM; hence, when side-channel attacks are in scope, the EtS KEM can be used for this purpose as well.
Other applications. Although secure update is our main focus in this work, there are other (embedded) applications of the pqEtS KEM scheme. One area would be that of secure element to MCU communication. For this communication the Secure Channel Protocol [Glo] can be utilized which, similarly to the secure update, starts with a KEX or KEM to establish the session keys. In this case, the secure element is a trusted party and the outsider-security model applies. This communication also needs to be side-channel-secured. Since the secure element often acts as a trust anchor for the System-on-Chip, its compromise means the entire device is compromised. This would therefore be a good candidate to apply a pqEtS KEM scheme, although it has to be taken into account that the resulting protocol is not standardized (yet).
A second area where we see applications for the pqEtS KEM scheme is edge computing for the Internet-of-Things. This is in essence the simple concept of performing computations at the edge node of a network instead of in the cloud. This improves security and privacy since less data is sent over an Internet connection, while at the same time allowing heavier computation like machine learning to be offloaded from more constrained embedded devices. In such a network, the edge node acts as a trusted party and often it is not necessary for all devices to communicate pairwise: only communication with the edge node is necessary, and therefore the two-user pqEtS KEM scheme can be applied. This can be generalized to a multi-user setting where all devices do communicate, like a private network; however, this slows down the protocol a little, as was discussed in Section 3.2.1.
In general, whenever power-based side-channel attack protection is required on the decapsulator/receiver's side and not on the encapsulator/sender's side, and the outsider-security model applies, the use of a pqEtS KEM scheme can be considered.

Side-channel security of the EtS KEM
To argue about the side-channel security and the different levels of side-channel protection necessary for the components of the EtS KEM, we introduce the two main SCA adversaries on KEMs, A_KC-SCA and A_CC-SCA. The goal of both adversaries is to extract either the long-term secret sk_d or the ephemeral encapsulated key m (or the secret keys derived from it, K̄ and K). Sticking to our chosen use case, the key K is not explicitly returned to the adversary but rather used internally to decrypt the update image. In accordance with [BDK + 21, BGR + 21], we exclude the protection of the value z from our analysis³. Clearly, A_CC-SCA is strictly stronger than A_KC-SCA, which leads to costly protection requirements for CCAKEM.Decaps. In the following, we show that for our proposed scheme the two adversaries are equivalent and that it thus avoids the costly protection overhead.
For both the EtS KEM and the standard FO KEM, we analyze each step of the decapsulation computation and argue about its protection requirements against the two SCA adversaries, i.e., DPA protection, SPA protection, or no protection. The high-level intuition is that DPA protection is commonly more costly than SPA protection (or no protection at all), and it is therefore beneficial to limit the number of modules that require this level of protection. Note that, unlike some prior works [BBC + 20], we do not distinguish between SPA with and without averaging, as averaging is always possible in our scenario.
The FO KEM. The protection level assignment for an FO-based KEM, following its description in Figure 4, is provided in Table 2; we give the rationale for the coloring in the following. As the initial CPAPKE.Dec(sk_d, c) manipulates the long-term secret sk_d together with an adversary-controlled input c, it requires DPA protection against both adversaries. For A_KC-SCA, the subsequent intermediates are mostly unknown and, more importantly, independent of the long-term secret sk_d. Therefore, most of them require only SPA protection. Since the comparison c = c′ holds up to the negligible failure probability of the underlying scheme, its result can be made public and does not require dedicated SCA protection. A_CC-SCA, in contrast, can craft specific ciphertexts such that m and the values derived from it leak information about the long-term secret sk_d. Therefore, it is necessary to protect the modules processing these values with strong protection measures. Note that the comparison for these chosen ciphertexts is always false up to a negligible failure probability. Therefore, for these inputs K is always derived from z and the public value c, which can be left unprotected. The key derivation based on K̄ is only computed for valid ciphertexts and requires only SPA protection, as in the case of A_KC-SCA.
The EtS KEM. The description of the EtS KEM was provided in Figure 2. The protection level assignment for the EtS KEM is provided in Table 3, and again we give the rationale for the coloring in the following. Since the verification processes only public values (i.e., pk_s, c, σ), it does not require SCA protection against either of the two adversaries. For A_KC-SCA, the protection profile stays the same, as valid ciphertexts pass the verification and the intermediates after CPAPKE.Dec do not leak information about the long-term secret sk_d. For A_CC-SCA, the specifically crafted chosen ciphertexts do not pass the verification, as the adversary does not have access to the secret signing key. Therefore, these inputs directly lead to the key derivation of K based only on z and c, which is unprotected as before. The only inputs that trigger operations processing the long-term secret sk_d are valid ciphertexts, and for these the intermediates after CPAPKE.Dec require only cost-efficient SPA protection.
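The control flow argued about above can be sketched as follows. This is a minimal illustration, not the construction of Figure 2: the function name ets_verdec, the stand-in kdf (SHAKE-256 over length-prefixed inputs), and the exact inputs to the key derivation are assumptions, and the signature verification and CPA decryption are passed in as callables rather than bound to any real Kyber, Dilithium or Falcon API.

```python
import hashlib
from typing import Callable


def kdf(*parts: bytes) -> bytes:
    """Stand-in key derivation (assumption): SHAKE-256 over length-prefixed inputs."""
    h = hashlib.shake_256()
    for part in parts:
        h.update(len(part).to_bytes(4, "big") + part)
    return h.digest(32)


def ets_verdec(
    sk_d: bytes,
    pk_s: bytes,
    z: bytes,
    c: bytes,
    sigma: bytes,
    verify: Callable[[bytes, bytes, bytes], bool],
    cpa_dec: Callable[[bytes, bytes], bytes],
) -> bytes:
    """Hypothetical EtS decapsulation flow: verify first, decrypt only if valid."""
    # The verification only touches public values (pk_s, c, sigma),
    # so it needs no side-channel protection.
    if not verify(pk_s, c, sigma):
        # Rejected input: the key depends only on z and the public c.
        # The long-term secret sk_d is never processed on this path.
        return kdf(z, c)
    # Only authenticated ciphertexts reach the DPA-sensitive decryption.
    m = cpa_dec(sk_d, c)
    # Intermediates after decryption are SPA targets in the analysis above.
    return kdf(m, c)
```

With toy callables, one can check that a ciphertext with an invalid signature never reaches cpa_dec, which is the property that makes the two adversaries equivalent in this analysis.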
Comparison. The impact of our proposed scheme over the original approach is clearly visible from Tables 2 and 3. In the original scheme, the adversary A_CC-SCA is strictly stronger than A_KC-SCA and requires very costly DPA protection for many parts of CCAKEM.Decaps. Leveling is only partially possible, with marginal impact on the overall performance. By introducing an explicit authenticity check based only on public values, the potency of CC-SCA is completely negated. This is visible in Table 3, which shows that A_CC-SCA requires a protection profile equivalent to that of A_KC-SCA. Effectively, this leaves only CPAPKE.Dec as a module with DPA protection requirements, enabling significantly more efficient hardened implementations, as will be demonstrated in Section 4. It should be noted, however, that the signature verification is the single point of failure for chosen-ciphertext attacks. If it is skipped, e.g., due to an injected fault, the powerful attack vector is possible again. Therefore, it requires dedicated fault protection measures if fault attacks are in scope. However, typical fault attack countermeasures, such as recomputing and comparing the results, induce only a linear overhead on the total cost, as opposed to DPA countermeasures such as masking, which are significantly more expensive.

Illustration with lattice-based cryptography schemes
In this section, we provide a comparison of the FO KEM and the EtS-based KEM in terms of performance and communication overhead. For this purpose, we first describe in Section 4.1 the different parameters affecting these overhead measures and later provide a comparison for the STM32F4 ARM Cortex-M4 MCU used to benchmark NIST PQC candidates in pqm4 [KRSS19] in Section 4.2. We consider two different combinations of lattice-based schemes for the EtS-based KEM: instantiating the PKE with CRYSTALS-Kyber.CPAPKE and the digital signature with either CRYSTALS-Dilithium or Falcon. We additionally discuss the impact of introducing a signature verification function which requires fault attack protection in Section 4.3.

Parameters
Our comparison involves different parameters, including the choice of post-quantum PKE/KEM, digital signature, side-channel and fault attack countermeasures.
For the lattice-based FO KEM, we consider the CRYSTALS-Kyber KEM [ABD + 19], described previously. Table 4 shows the cost in kCycles for masking the Kyber decapsulation and relevant subroutines. As studied in [ABH + 22], the signal-to-noise ratio of the leakage of the device and the target security level determine the number of shares required for masking. To capture the effect of both parameters, we consider different share numbers d ∈ {2, 3, 4, 5, 6, 7} for higher-order masking. We also provide numbers for the unprotected decapsulation for comparison.
When it comes to lattice-based signatures, we consider two options: the CRYSTALS-Dilithium [DLL + 17] signature scheme for NIST level 3 and the Falcon-1024 signature scheme [FHK + 19] for NIST level 5. Since our running example and performance figures are provided for the Kyber level 3 instance, we use the corresponding Dilithium level 3. Falcon, however, does not have a level 3 instance, but only a level 1 (Falcon-512) and a level 5 (Falcon-1024) parameter set. Since the security of the EtS scheme relies first and foremost on the security of the signature, we consider Falcon-1024 with NIST level 5. In the following analysis, we are mainly interested in the performance on the device's side, and hence in the signature verification. For this, we provide in Table 5 the speed in kCycles as well as the public key and signature sizes, based on the pqm4 benchmark [KRSS19]. While hash-based signature schemes (e.g., SPHINCS+, LMS, XMSS) can also be used for the EtS KEM, we do not consider them in our analysis since they are more expensive (in terms of signature size, generation and verification) than lattice-based signature schemes.
While the EtS scheme allows us to get rid of the leaky re-encryption, it introduces a new attack vector: the signature check. Specifically, a signature verification performs a validity check based on the input message (more accurately, its hash) and the signature. This check can be bypassed by fault injection, which grants adversaries the possibility to force the validation of any message-signature pair. The straightforward countermeasure against fault injection attacks is re-computation. Re-computing or duplicating the verification protects against the injection of a single fault. In general, by computing the target function f times, we protect it against f − 1 faults.
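The re-computation countermeasure can be sketched as below. This is only an illustration under simplifying assumptions: the helper name verify_redundant is hypothetical, each fault is modeled as flipping the outcome of a single verification run, and the final combination of the results is assumed to be hardened against faults by other means.

```python
from typing import Callable


def verify_redundant(verify_once: Callable[[], bool], f: int) -> bool:
    """Run the verification f times and accept only if every run accepts.

    Under the simplified fault model stated above, an invalid signature is
    accepted only if all f runs are faulted, so f - 1 faults are tolerated.
    """
    if f < 1:
        raise ValueError("f must be at least 1")
    # Always perform all f runs (no early exit), then combine the results.
    results = [bool(verify_once()) for _ in range(f)]
    return all(results)
```

For example, with f = 3 and an invalid signature, two faulted runs that wrongly return True are still caught by the third, unfaulted run.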
As previously discussed, compared to an FO-based KEM, the EtS KEM reduces the attack surface for DPA and instead only requires SPA protection for some parts of the scheme. Countermeasures against SPA include shuffling [HOM06], which randomizes the order of the performed operations, and the addition of side-channel noise, which can be achieved by different hardware or algorithmic means and aims to reduce the signal-to-noise ratio in the side-channel measurement. Overall, SPA countermeasures are typically less expensive (inducing a linear overhead) than stronger DPA countermeasures such as masking (with quadratic overhead).

Performance comparison
We provide in Table 6 the performance values in kCycles for the EtS KEM VerDec function using Dilithium 3 or Falcon-1024, for different masking orders. The signature verification, the decryption and the key derivation steps are performed sequentially, and the overall cost of VerDec is the sum of the costs of all its subroutines. The first column of the table corresponds to the masked Kyber.CCAKEM regular decapsulation. The first row corresponds to an unprotected implementation. First, and unsurprisingly, in the unprotected case the FO-based Kyber KEM is more efficient than its EtS counterpart. This is due to the large cost of signature verification for PQC signature schemes. However, in the masked case we observe that the EtS schemes are significantly more efficient than the Kyber.CCAKEM decapsulation. When the noise level on the device decreases and, thus, the number of shares required to achieve a target security level increases, the advantage of the EtS schemes grows, since they do not require a costly masked re-encryption. For instance, when masking with only 2 shares is sufficient, the EtS schemes with Dilithium and Falcon achieve an improvement in cycle count of approximately 20% and 60%, respectively. When considering the larger share numbers that are more realistic for protecting implementations on standard MCUs, the two EtS schemes achieve similar improvements, ranging from 90% to 92%. The improvement gap between the EtS scheme using Dilithium and the one using Falcon shrinks with the number of shares, since the cost of the signature verification becomes minimal next to the cost of the Kyber.CPA decryption and the KDF.
The main drawbacks of the EtS schemes lie in the encapsulation and communication overheads. First, since the EtS KEM encapsulation process includes a signature generation, its cost increases from 786 kCycles to 10075 and 84269 kCycles for EtS with Dilithium and Falcon, respectively. In the relevant use cases of the EtS KEM, this cost is less detrimental than that of the masked decapsulation. Regarding the data overhead, for Kyber level 3, the ciphertext size is 1088 bytes. For the EtS schemes using Dilithium level 3 and Falcon level 5, the total ciphertext sizes (including the signature) are 4381 and 2368 bytes, respectively. As discussed previously, depending on the use case, this can be worthwhile⁴. Interestingly, the choice of signature scheme can be based on a compromise between the data and performance overheads: while Falcon has relatively small signatures and fast verification, its signature generation is significantly more expensive than Dilithium's.
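The overheads quoted above can be made explicit with a short calculation. The constants are the figures from the text; the dictionary keys and the helper name overheads are illustrative only.

```python
# Figures quoted in the text (bytes and kCycles, Kyber level 3).
KYBER_CT_BYTES = 1088
ETS_CT_BYTES = {"Dilithium level 3": 4381, "Falcon level 5": 2368}
KYBER_ENCAPS_KCYCLES = 786
ETS_ENCAPS_KCYCLES = {"Dilithium level 3": 10075, "Falcon level 5": 84269}


def overheads() -> dict:
    """Derive the signature bytes and relative overheads from the quoted totals."""
    rows = {}
    for scheme, total_bytes in ETS_CT_BYTES.items():
        rows[scheme] = {
            # Bytes added to the Kyber ciphertext by the signature.
            "signature_bytes": total_bytes - KYBER_CT_BYTES,
            # Transmitted data relative to the plain Kyber ciphertext.
            "size_factor": total_bytes / KYBER_CT_BYTES,
            # Encapsulation cost relative to plain Kyber encapsulation.
            "encaps_factor": ETS_ENCAPS_KCYCLES[scheme] / KYBER_ENCAPS_KCYCLES,
        }
    return rows


if __name__ == "__main__":
    for scheme, row in overheads().items():
        print(scheme, row)
```

The signature accounts for 3293 bytes with Dilithium and 1280 bytes with Falcon, i.e., roughly 4 times and 2.2 times the plain Kyber ciphertext size, while encapsulation becomes roughly 13 times and 107 times slower, respectively.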

Fault attacks mitigation for signature verification
Next, we examine the impact of introducing a signature verification in the EtS schemes with respect to fault attack resistance. Figure 5 shows the efficiency impact on the EtS schemes when the signature verification is protected against f − 1 faults, which requires computing the verification f times and comparing the results to detect any injected fault. From Figure 5, we see that for a low number of shares the EtS schemes are potentially less efficient than Kyber.CCAKEM.Decaps when the signature verification is relatively costly. This can be remedied by using a signature algorithm with more efficient verification, such as Falcon. However, when the number of shares increases (d > 3), the gap between Kyber.CCAKEM.Decaps and the EtS scheme with multiple signature verification re-computations increases rapidly. The impact of the re-computation on the overall cost diminishes with the number of shares. This is expected, since the cost of the signature verification is smaller than the cost of the masked decapsulation at high orders, and the re-computation only induces a linear overhead, whereas high-order masking induces a quadratic overhead.
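The linear-versus-quadratic argument can be illustrated with a toy cost model. Everything in it is an assumption made for illustration: the purely quadratic masking model, the function names, and in particular the constants, which are arbitrary placeholder units and not the measurements of Table 4 or Figure 5.

```python
# Hypothetical placeholder costs in arbitrary units (NOT measured values).
FO_BASE = 100.0      # masked FO decapsulation, including re-encryption
DEC_BASE = 10.0      # masked CPA decryption and key derivation only
VERIFY_COST = 200.0  # one unmasked signature verification


def masked_cost(base: float, d: int) -> float:
    """Toy masking model: cost grows with the square of the share count d."""
    return base * d * d


def fo_decaps_cost(d: int, fo_base: float = FO_BASE) -> float:
    return masked_cost(fo_base, d)


def ets_verdec_cost(d: int, f: int, dec_base: float = DEC_BASE,
                    verify_cost: float = VERIFY_COST) -> float:
    # f unmasked verifications (linear in f) plus the masked decryption.
    return f * verify_cost + masked_cost(dec_base, d)


def verification_share(d: int, f: int) -> float:
    """Fraction of the EtS cost spent on the f redundant verifications."""
    return f * VERIFY_COST / ets_verdec_cost(d, f)
```

With these placeholder constants and f = 2, the EtS variant is slightly more expensive than the FO variant at d = 2 (440 versus 400 units) but far cheaper at d = 7 (890 versus 4900 units), and the share of the cost due to the verification falls as d grows. This mirrors the qualitative trend described above without reproducing the measured figures.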

Considerations for SPA security
As discussed in Section 3.2.3, the EtS KEM increases the number of operations to protect against SPA. Accordingly, in this section we take a closer look at the impact of SPA countermeasures on the FO-based KEM and the EtS KEM. While the kind of SPA countermeasure to implement and its parameters are determined by the noise level on the considered device and the target security level, we adopt a general simplification to study its impact. Precisely, we assume that, to achieve the same security for the SPA targets as for the DPA ones, we can mask the SPA targets using two fewer shares than the targets requiring DPA protection. Arguably, a cheaper countermeasure such as shuffling can also be used to achieve adequate security; however, its effectiveness is highly dependent on the number of independent operations at each stage of the considered function. Figure 6 shows the costs extrapolated from Table 4. We see that the overhead introduced by the SPA mitigation is more pronounced for the EtS schemes than for the FO-based schemes. This is expected, since the EtS schemes have more SPA targets. The main conclusion from this analysis is that, even when the SPA targets are protected with an expensive countermeasure such as masking, the EtS KEM remains significantly more efficient than its FO-based counterpart.

Conclusion
In this work, we combine a standard cryptographic construction, namely the EtS paradigm, with observations from recent side-channel analysis of post-quantum KEMs. Our main result is to enable efficiently hardened authenticated post-quantum public key encryption, which can be used to instantiate a KEM, without the need to protect the costly FO transform against CC-SCA. While the initial concept is simple, it surprisingly allows speeding up the KEM by a factor of 10. However, the EtS construction can only lift CPA security to CCA security under the outsider-security model. Therefore, we discuss applications of the EtS KEM that conform to this model. The most notable is the secure update mechanism, which is essential in maintaining the security and the reliability of embedded and IoT devices.
The side-channel protection of post-quantum schemes is a recent but quite active research direction for the academic community. Accordingly, we expect more efficient masked KEM implementations in the next few years. However, the main bottleneck when masking FO-based KEMs, such as Kyber or Saber, stems from the multiple calls to the hash functions in the re-encryption introduced by the FO transform, which require high-order masking to hinder CC-SCA. Since the EtS KEM removes the need for re-encryption, we expect that the improvement brought by using EtS will transfer to more optimized future implementations as well.
Finally, future work could explore other practical applications of the EtS KEM, its use in other contexts such as the multi-user setting, and whether it is suitable for specific-purpose protocols (e.g., secure element to MCU communication or IoT edge computing). Another research direction could be to design post-quantum cryptography schemes that are naturally resistant against implementation attacks by leveraging the large body of work related to leakage resilience, instead of focusing on protecting schemes that were not designed with physical attacks in mind.