FIVER – Robust Verification of Countermeasures against Fault Injections

Abstract. Fault Injection Analysis is regarded as a powerful attack against implementations of cryptographic algorithms. Over the last two decades, researchers have proposed a plethora of countermeasures to secure such implementations. However, designing and implementing them are still error-prone, complex, and manual tasks which require long-standing experience in hardware design and physical security. Moreover, the validation of the claimed security is often only done by empirical testing in a very late stage of the design process. To avoid such empirical testing strategies, approaches based on formal verification are applied instead, providing the designer with early feedback. In this work, we present a fault verification framework to validate the security of countermeasures against fault-injection attacks designed for ICs. The verification framework works on netlist level, parses the given digital circuit into a model based on Binary Decision Diagrams, and performs symbolic fault injections. This verification approach constitutes a novel strategy to evaluate protected hardware designs against fault injections, offering new opportunities such as performing full analyses under a given fault model. Eventually, we apply the proposed verification framework to real-world implementations of well-established countermeasures against fault-injection attacks. Here, we consider protected designs of the lightweight ciphers CRAFT and LED-64 as well as AES. Due to several optimization strategies, our tool is able to perform more than 90 million fault injections in a single-round CRAFT design and evaluate its security in under 50 min, while the symbolic simulation approach considers all 2^128 primary inputs.


Introduction
In 1997, Biham and Shamir proposed Differential Fault Analysis (DFA) as a powerful attack against implementations of cryptographic algorithms [BS97]. In the aftermath, this seminal work sparked a new branch of research dealing with cryptographic Fault Injection Analysis (FIA). Over the last two decades, researchers mainly pursued three sub-topics, covering the development of new analysis techniques, the enhancement of fault injection methods in hardware devices, and the design of effective countermeasures.
Ultimately, researchers from academia and industry proposed a plethora of countermeasures to secure cryptographic operations against FIA. More precisely, modern countermeasures leverage redundancy in time, area, or information and can be classified as detection-based, correction-based, and infection-based techniques. Detection-based countermeasures were intensively investigated in [AMR+20], using linear error codes, originally known from coding theory, to protect symmetric block ciphers; this approach was extended in [SRM20] such that linear error codes are also deployed to correct occurring faults. The last class, infection-based countermeasures, was investigated in [GST12]: it infects the state of a cryptographic algorithm with random bits in case a fault occurred, intending to generate useless outputs for an attacker.
However, despite extensive theoretical research on efficient and effective countermeasures, the process of designing and implementing such methods in practice remains an error-prone, complex, and manual task, requiring long-standing experience and expertise in hardware design and physical security. Furthermore, the correctness and security of implemented designs and countermeasures are predominantly evaluated through empirical testing of prototypes or final products, making it difficult to correct or adjust design deficiencies and security flaws. To counteract this issue, formal verification can support the designer during the design process and provide an early indication of deficiencies and flaws.
As a consequence, the approach of empirical testing is replaced by security proofs which, however, require an appropriate definition of the adversary model and an abstraction of fault injection methods. Recently, the authors of [RBSG21] proposed a new and generic fault model which allows an attacker to be precisely defined and described. While the work of Arribas et al. [AWMN20] already presents a fault verification tool called VerFI, directly working on a gate-level netlist of a cryptographic circuit, it supports only a limited set of fault models (i.e., bit-flip and stuck-at) and exposes some open challenges, particularly with respect to the reliability of the reported results. More precisely, as the tool is simulation-based, the user has to select dedicated test vectors, which can lead to undetected corner cases and false-positive results. In this work, we close this gap by proposing a verification approach that inherently prevents misleading evaluation results and reports.

Contributions.
We propose a formal verification approach and corresponding tool for countermeasures against FIA for cryptographic algorithms implemented on Integrated Circuits (ICs). Hence, similar to VerFI [AWMN20], our approach works on a given gate-level netlist serving as starting point to create a model of the underlying digital logic circuit. However, instead of relying on empirical testing methods, we present a formal verification approach based on Binary Decision Diagrams (BDDs). This data structure inherently provides the possibility to observe the output of a Boolean function for all combinations of the input variables. Hence, we avoid false positives that could be created by selecting inauspicious test vectors. Additionally, we propose a symbolic fault injection approach that covers all possible fault events that can occur in a digital logic circuit under a given fault model, leaving no undiscovered corner cases that may lead to successful fault-injection attacks.
Furthermore, instead of fixing the applied fault model to a predefined subset as in VerFI, we incorporate the generic fault model from [RBSG21]. This permits a precise definition and description of the adversary while analyzing the countermeasure's claims.
In order to achieve reasonable performance with respect to evaluation time and circuit size, we present different kinds of optimization strategies. On the one hand, these strategies reduce the number of fault combinations that have to be considered in a digital logic circuit. On the other hand, we propose several approaches to increase the performance of the fault verification tool itself. The tool is publicly available at https://github.com/Chair-for-Security-Engineering/FIVER.
Outline. In Section 2, we briefly summarize notations and BDDs, introduce our circuit model, and provide background on FIA. Based on these preliminaries, we present our fault verification concept in detail in Section 3. Afterwards, in Section 4, we introduce our fault verification tool FIVER, providing details on the applied BDD library, the tool flow, and optimization strategies. In Section 5, we present practical evaluations and experiments applying our tool to common ciphers equipped with detection-based and correction-based countermeasures. Eventually, we conclude our work in Section 6.

Background
In this section, we briefly state the notations used throughout this work. Afterwards, we provide essential background on BDDs and introduce our circuit model. We conclude this section by discussing fault injection analysis and the fault verification tool VerFI.

Notation
We denote functions using sans-serif fonts, e.g., f. While we express single-bit values by x_i, the corresponding multi-bit variable is denoted by x. Upper-case Greek letters are used to denote sets, e.g., Λ.

Binary Decision Diagrams
BDDs were introduced by Akers [Ake78] and refined by Bryant [Bry86] (introducing variable ordering), providing a compact and concise data structure to represent Boolean functions in discrete mathematics and computer science. These days, many applications in Electronic Design Automation (EDA) and computer-aided IC design and verification rely on (reduced and ordered) BDDs. Given any Boolean function f : F_2^i → F_2, BDDs provide a concise and canonical (for a given variable ordering) graph-based representation, with a single root node and at most two terminal nodes (leaves) {0, 1}. The formal definition of BDDs is given as follows, divided into a purely syntactical description and a semantical interpretation.

Syntactical Definition of BDDs
Each BDD can be represented as a finite Directed Acyclic Graph (DAG) according to the following syntactical definition.
Definition 1 (BDD Syntax). A Reduced Ordered Binary Decision Diagram is a pair (π, D), with π denoting the variable ordering and D = (V, E) describing a finite DAG with vertices V, edges E, and the following properties: (1) The graph has a single root node, and each node v ∈ V is either a non-terminal node or one of the terminal nodes {0, 1}.
(2) Each non-terminal node v is labeled with a variable in X, with |X| = n, denoted as var(v) and has exactly two child nodes in V, denoted as then(v) and else(v).
(3) For each path from the root to a terminal node, the variables in X are encountered at most once and in the order defined by the variable ordering π. More specifically, the ordering π is a bijection π : {1, 2, ..., n} → X.
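To make Definition 1 concrete, the following minimal Python sketch encodes a (reduced) BDD with Python Booleans as the terminal nodes and tuples (var, else_child, then_child) as non-terminal nodes, where the integer variable indices play the role of the ordering π. The names mk, evaluate, and UNIQUE are illustrative choices for this sketch and not part of FIVER.

```python
# Terminal nodes are the Python Booleans False/True; a non-terminal node is a
# tuple (var, else_child, then_child). The unique table enforces subgraph
# sharing, and skipping redundant tests keeps the diagram reduced.
UNIQUE = {}

def mk(var, lo, hi):
    """Create (or reuse) a node testing `var` with else-child lo, then-child hi."""
    if lo == hi:                       # then(v) == else(v): the test is redundant
        return lo
    return UNIQUE.setdefault((var, lo, hi), (var, lo, hi))

def evaluate(node, assignment):
    """Follow else/then edges according to a variable assignment."""
    while isinstance(node, tuple):
        var, lo, hi = node
        node = hi if assignment[var] else lo
    return node

# BDD for f(x0, x1) = x0 AND x1 under the ordering x0 < x1:
f = mk(0, False, mk(1, False, True))
```

For instance, evaluate(f, {0: True, 1: True}) reaches the 1-terminal, while any assignment setting one of the variables to False reaches the 0-terminal; mk(2, f, f) simply returns f, reflecting the reduction rule.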

Semantical Definition of BDDs
Each BDD, with root node v ∈ V, recursively represents a Boolean function f : F_2^i → F_2 using the Shannon decomposition of f according to the following definition.
Definition 2 (BDD Semantics). A Boolean function f over X is defined recursively by a BDD, carried out at each node v according to the following rules: (1) If v is a terminal node, then f_v = 0 if v = 0 and f_v = 1 if v = 1. (2) If v is a non-terminal node with var(v) = x, then f_v is given by the Shannon decomposition f_v = (¬x ∧ f_else(v)) ∨ (x ∧ f_then(v)).

Boolean Operations over BDDs
Given the syntactical and semantical definitions of BDDs, any arbitrary Boolean operation • over two Boolean functions f_v1 and f_v2, given as BDDs with root nodes v_1 and v_2, can be defined recursively, such that for the top variable x, i.e., the first variable among var(v_1) and var(v_2) in the ordering π: f_v1 • f_v2 = (¬x ∧ (f_v1|x=0 • f_v2|x=0)) ∨ (x ∧ (f_v1|x=1 • f_v2|x=1)), where f|x=b denotes the cofactor of f with respect to x = b. The recursion terminates at the terminal nodes, where • is evaluated directly.
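As a hedged illustration of this recursive definition, the classic Apply algorithm can be sketched in a few lines of Python, assuming terminals are Python Booleans and a non-terminal node is a tuple (var, else_child, then_child) with integer variable indices ordered according to π; the helper names are illustrative, not FIVER's API.

```python
# A sketch of the recursive Apply operation over ordered BDDs.
def mk(var, lo, hi, _unique={}):
    if lo == hi:                      # redundant test: merge the children
        return lo
    return _unique.setdefault((var, lo, hi), (var, lo, hi))

def cofactors(node, var):
    """Return the pair (f|var=0, f|var=1) for the top variable `var`."""
    if isinstance(node, tuple) and node[0] == var:
        return node[1], node[2]
    return node, node                 # node does not test var at its root

def apply_op(op, f, g):
    """Combine two BDDs under the Boolean operator `op` via Shannon expansion."""
    if isinstance(f, bool) and isinstance(g, bool):
        return op(f, g)               # recursion bottoms out at the terminals
    x = min(n[0] for n in (f, g) if isinstance(n, tuple))
    f0, f1 = cofactors(f, x)
    g0, g1 = cofactors(g, x)
    return mk(x, apply_op(op, f0, g0), apply_op(op, f1, g1))

# x0 AND x1, built from the single-variable BDDs for x0 and x1:
x0, x1 = mk(0, False, True), mk(1, False, True)
conj = apply_op(lambda a, b: a and b, x0, x1)
```

A memoization table over node pairs (omitted here for brevity) makes Apply polynomial in the product of the two BDD sizes, which is what makes BDD-based circuit evaluation practical.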

Circuit Model
As our goal is to verify hardware implementations protected against fault injection attacks, we introduce an abstract model to describe the underlying circuits. To this end, we construct such a model based on a gate-level netlist describing a digital logic circuit C. Naturally, it is assumed that the circuit C realizes an arbitrary (vectorial) Boolean function f : F_2^i → F_2^o with input size i ≥ 1 and output size o ≥ 1. At the lowest level, we decompose the circuit C into atomic components, called gates, which can be further divided into purely combinational gates and sequential gates.
Definition 3 (Combinational Gate). A combinational gate g_c is a physical component in a digital logic circuit that evaluates its output as a pure (Boolean) function of the present inputs only (without any dependency on the history of inputs).
In this work, we limit the set of Boolean functions implemented as combinational gates to G_c = {not, and, nand, or, nor, xor, xnor}. We further distinguish between gates with fan-in size one (unary gates) and size two (binary gates), leading to the sets G_u = {not} and G_b = {and, nand, or, nor, xor, xnor} such that G_c = G_u ∪ G_b.
Definition 4 (Sequential Gate). A sequential gate g_s ∈ G_s is a physical, clock-synchronized memory component in a digital logic circuit whose output depends not only on the present inputs but also on the history of previous inputs. Sequential (memory) gates store a single Boolean variable x ∈ F_2, while we model them as clock-dependent synchronization points. We suppose that a sequential gate has only one output. Some standard cell libraries make use of flip-flops with two complementary outputs; in such cases, the gate is further decomposed into a sequential gate followed by a combinational not gate. Analogously to the definition of combinational gates, we define the set G_s = {reg}. Given that, we unite all valid gates in one set G = G_c ∪ G_s to properly model a digital logic circuit C as defined in Definition 5.
Definition 5 (Circuit Representation). A digital logic circuit C is modeled by a DAG formally described by D = (V, E), with V the set of vertices and E the set of edges. A single vertex v ∈ V represents a combinational or sequential gate g ∈ G, and a single edge e ∈ E represents a wire connecting two vertices v_1, v_2 ∈ V and carrying a digital signal, modeled as an element of the finite field F_2.
Based on this definition, the model cannot handle circuits with feedback loops, which are common practice in digital logic circuit designs. Hence, to allow our model to handle such cases, the circuit needs to be unrolled before it is translated into a DAG. By unrolling we describe the process of removing any cyclic loops and replacing them with acyclic structures. For example, a round-based implementation of an arbitrary block cipher can be unrolled by instantiating the round logic r times, connecting the instances through registers, where r denotes the number of rounds.
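As a small illustration of this circuit model, the sketch below encodes a loop-free netlist over the gate set G_c as a topologically ordered list of vertices and evaluates it with a single forward pass over the DAG. The encoding and names are our own illustrative choices, not FIVER's input format.

```python
# Gates from G_c as Boolean functions over F_2, with signals encoded as 0/1.
GATES = {
    "not":  lambda a: a ^ 1,
    "and":  lambda a, b: a & b,
    "nand": lambda a, b: (a & b) ^ 1,
    "or":   lambda a, b: a | b,
    "nor":  lambda a, b: (a | b) ^ 1,
    "xor":  lambda a, b: a ^ b,
    "xnor": lambda a, b: (a ^ b) ^ 1,
}

def evaluate_circuit(netlist, inputs):
    """Evaluate a combinational circuit given as (gate, out_wire, in_wires)
    triples in topological order; `inputs` maps primary input wires to 0/1."""
    wires = dict(inputs)
    for gate, out, ins in netlist:
        wires[out] = GATES[gate](*(wires[w] for w in ins))
    return wires

# A half adder as a tiny example circuit: s = a xor b, c = a and b.
half_adder = [("xor", "s", ("a", "b")), ("and", "c", ("a", "b"))]
```

In this representation, unrolling a round-based design amounts to replicating the round's gate list r times with per-round wire names, so the result stays acyclic.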

Fault Injection Analysis
Physical Fault Injection Techniques. In general, Fault Attacks (FAs), as a particular branch of physical attacks, aim at disturbing the regular execution of a physical device by forcing it to operate under non-specified conditions. For this, common approaches attempt to maliciously induce faults during operation in order to violate the timing requirements of a digital logic circuit, where the most prominent techniques use clock glitches or voltage glitches [SGD08, ADN+10, GSD+08]. More specifically, in the former approach, the period of the clock is tightened for one (or a couple of) cycles, while in the latter, the power supply of the target device is altered for a short moment to disrupt its normal execution.
Besides timing violations, it has been shown that varying the ambient operating conditions, such as the temperature, can also lead to faulty behavior [GA03, HS13]. However, while all aforementioned techniques are non-invasive as they do not require a modification of the targeted device, semi-invasive attacks allow injecting localized faults with high precision. Particularly, decapsulation of the chip package enables an attacker to induce faults using electromagnetic pulses [ORJ+13, OGT+14, BKH+19] or an intense light source like laser beams [SA02, SFG+16].

Consolidated Fault Models.
Fault models are widely used as an abstraction of an undesired physical event, enabling hardware designers to predict the circuit behavior in the presence of faults. For this, the stuck-at and toggle (bit-flip) fault models are commonly used and well-consolidated models, often considered in IC test and verification processes [RU96], but also in FIA.
For the stuck-at-0 (resp. stuck-at-1) model, it is assumed that an individual signal, e.g., the output of a binary logic gate, is tied to logical 0 (resp. 1). However, while the actual value of the signal is not relevant in the stuck-at models, the toggle model relies on the original value. More precisely, for the toggle model, the signal is turned to logical 0 if the actual value is 1, and vice versa.
Recently, the authors of [RBSG21] proposed a new consolidated fault model which is described by a function ζ(n, t, l). Here, n defines the total number of fault events that can occur at the same time in one logic stage. The parameter t describes the fault type. More precisely, the authors proposed a set of fault mappings τ_j which describe the actual behavior of the fault. To model a fault injection in a target gate, the associated Boolean function is replaced by another function defined in τ_j. In their work, they specify common fault models like stuck-at and bit-flip but also more advanced mappings that describe laser fault injections in the 15 nm Open-Cell Library. Finally, the parameter l defines the valid fault locations in a given digital logic circuit. Particularly, the fault location limits fault injections to combinational (c), sequential (s), or both gate types (cs). In the remainder of this work, we also follow this notation and define an attacker by ζ(n, t, l).
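To illustrate how such fault mappings act at the functional level, the sketch below wraps a gate's Boolean function in a replacement function, which is the core idea of modeling the fault type t of ζ(n, t, l); the dictionary and variable names are illustrative, not taken from [RBSG21] or FIVER.

```python
# Fault mappings: a fault replaces a gate's Boolean function by another one.
# Stuck-at models ignore the fault-free value; the toggle model inverts it.
FAULT_MAPPINGS = {
    "stuck-at-0": lambda f: (lambda *x: 0),
    "stuck-at-1": lambda f: (lambda *x: 1),
    "toggle":     lambda f: (lambda *x: f(*x) ^ 1),
}

and_gate = lambda a, b: a & b
toggled = FAULT_MAPPINGS["toggle"](and_gate)   # behaves like a nand gate
```

Under ζ(n, t, l), such a replacement is applied to up to n gates per logic stage, restricted to the gate types permitted by l.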
Depending on the situation and the scenario, various factors, such as access to the faulty results, the precision of the fault injection, or the underlying cryptographic algorithm, affect the final choice of the analysis technique. However, due to the efficiency of such powerful attacks, the research community has also dedicated a considerable body of research to methodologies counteracting fault injections. For this, all approaches and countermeasures commonly rely on redundancy in terms of area, time, information, or any combination of them.
For instance, an encryption function can be instantiated twice (or multiple times) to form a basic detection scheme (based on spatial redundancy), allowing to check the consistency of the outputs through a simple comparison [MSY06]. Another trivial way to detect the presence of a fault is recomputation (as temporal redundancy), i.e., the construction recomputes the output multiple times using the same dedicated function and compares the results [MSY06]. In [AMR+20], a code-based approach based on Concurrent Error Detection (CED) schemes, i.e., information redundancy, has been proposed, where fault propagation in hardware implementations has been taken into account. More precisely, the authors guarantee the detection of any induced fault in any location of the design, including the data path, Finite State Machine (FSM), control signals, and consistency check modules, at any clock cycle. However, since the proposed technique is vulnerable to advanced attacks such as Ineffective Fault Analysis (IFA) and Statistical Ineffective Fault Analysis (SIFA), the detection facility of [AMR+20] was extended to fault correction in [SRM20] to also protect implementations against IFA and SIFA.
Additionally, the authors proposed important properties and guidelines to design resilient hardware countermeasures against fault injection attacks. The most significant criterion, called the Independence Property, was introduced in [AMR+20]; it demands that a digital circuit be separated into independent parts such that each computes exactly one output bit. Then, a checkpoint is placed at the output of each separate part, ensuring that any fault within the capabilities of the underlying countermeasure is detected or corrected. To this end, introducing a checkpoint after each non-linear function stops fault propagation as early as possible and avoids the unnecessary complexity of designing large circuits that fulfill the independence property across several non-linear functions. Hence, in the context of designing countermeasures for block ciphers, a checkpoint should be introduced after each substitution layer, which ensures that occurring faults are detected in each round.
A couple of further techniques have been proposed to protect against SIFA [BKHL20, SJR+20, RAD20]. A combined countermeasure against Side-Channel Analysis (SCA) and SIFA has been introduced in [DDE+20]. In this approach, the non-linear layer is implemented with Toffoli gates and the whole design is masked. The authors then claimed that simple duplication can prevent single-fault SIFA. In [BBB+20], a randomized duplication-based approach is presented, where no masking is needed.
State-of-the-Art Fault Verification. Practical evaluation of countermeasures against fault attacks on physical devices and real products is a complex and time-consuming task and needs considerable expertise and experience. This certainly highlights the necessity of verification tools and automated analysis techniques to accelerate evaluation and assist designers in the analysis of countermeasures. Moreover, such tools can help to reduce the cost of the fabrication process while maintaining the desired level of security.
In 2017, the authors of [BGE+17] presented AutoFault, a tool for the automatic construction of algebraic fault attacks. AutoFault works on gate-level netlists and uses a SAT solver to detect possible vulnerabilities in a given design without deeper knowledge of the cipher construction. However, the user has to define a list of fault locations, which limits evaluations to the corresponding areas of a target design. Hence, if AutoFault is used to verify countermeasures against fault attacks, the tool could report false-positive results since full coverage of all possible fault events is impossible.
In [KRH17], a framework for fault characterization in block ciphers, called XFC, was presented. It receives the block cipher specification and a fault model in order to determine locations for fault injection during the execution of the encryption. By tracing the fault propagation and its effects on the ciphertext, the tool evaluates the exploitability of a fault in terms of DFA and returns the computational complexity of the recoverable part of the (round) key. However, while XFC is mostly limited to a specific class of DFAs, ExpFault [SMD18] is designed to cover more fault analysis techniques. Unfortunately, even though both tools can help adversaries to find the optimal location to inject faults and facilitate fault attacks, they are not able to assist designers in assessing the security of implementations equipped with fault attack countermeasures. This issue was addressed by a framework called SAFARI [RRHB]. More precisely, this framework uses XFC to automatically identify locations that can be exploited by fault injection attacks, given a description of the (unprotected) target block cipher in a dedicated block cipher specification language. Then, based on user-defined security levels, SAFARI automatically equips the given block cipher with a parity-based or redundancy-based countermeasure and returns HDL or C code accordingly. Recently, another work that focuses on the exploitability of fault injection attacks on microcontrollers was presented at CHES 2019 [HBZL19]. The authors propose a tool called TADA which automatically detects vulnerabilities of a block cipher software implementation at the assembly level and returns exploitable faults with the help of an SMT solver.
Unfortunately, all aforementioned works aim to detect exploitable fault injection attacks on hardware or software implementations of block ciphers, hence taking an adversarial perspective. Only [RRHB] additionally applies protection mechanisms to susceptible areas. However, none of these works takes the perspective of the designer by targeting the assessment and formal verification of countermeasures against fault injection attacks implemented for hardware devices. More specifically, none of these approaches is able to verify a given design considering all possible fault events that could occur under all valid input combinations.
There are a few works addressing this topic by proposing open-source fault simulators for fault diagnosis [NCP92, LH96, BN08]. However, their application to cryptographic fault analysis is quite limited as they can simulate only single-bit fault injections. As a result, VerFI [AWMN20] is the first automated open-source cryptographic fault diagnosis tool designed to evaluate fault-protected cryptographic implementations. For this, the tool directly operates on the gate-level netlist of a hardware design and is able to assess detection-, infection-, and correction-based countermeasures. Moreover, the user is able to define a fault model, a bounded adversary model, the location of the faults, the desired clock cycles for fault injections, and some input test vectors for simulation. Then, the tool simulates the circuit under the parameterized fault injection and provides the coverage for every test vector as well as a final overall result, including the total number of faults, all non-detected faults per input test vector (reporting the location and type of the faults), and the corresponding faulty outputs, which may assist the designer in identifying design failures.

Limitations of VerFI.
Although VerFI facilitates the verification of fault-protected implementations, the result of the analysis depends on the selected input test vector(s). Hence, it is possible that the tool indicates the security of a design under a certain set of test vectors, while different test vectors would lead to observable or exploitable faults.
For this, let us consider a simple PRESENT S-box implementation [BKL+07], protected by a single bit of parity, as depicted in Figure 1. More precisely, the S function receives a 4-bit input S_in = (a, b, c, d) and provides the 4-bit S-box output S_out = (x, y, z, t), where a and x are the most significant bits. Simultaneously, the redundant part S' operates on the 4-bit input, independently of S, and estimates the parity bit of the S-box output. Eventually, the consistency check module verifies the correctness of the S output given the estimated parity bit and indicates a fault in case of inconsistency. However, such a design is not necessarily secure against single-bit fault injection if fault propagation occurs in the S function. In other words, for some test vectors, an attacker can inject a single-bit fault in such a way that an even number of faulty bits appears at the output, leaving no opportunity for the consistency check module to detect it. One such case is shown in Figure 1, where a single-bit fault propagates to two different output bits (x and y) depending on the input test vector, i.e., S_in ∈ {0x1, 0x2, 0x3, 0x9, 0xA, 0xB}. More precisely, the tool reports that all single-bit faults are detected when S_in ∈ {0x0, 0x4, 0x5, 0xD, 0xE, 0xF}, and there is at least one non-detected fault for the remaining test vectors due to fault propagation. To mitigate this issue, the independence property has been defined in [AMR+20] to guarantee that an induced n-bit fault affects at most n output bits. To this end, no cell should be involved in the computation of multiple output wires. As one can see, this property is not fulfilled in the given example, leading to an insecure implementation for some test vectors in the underlying adversary model.
Hence, VerFI confirms the security of the design if the evaluation is based on a limited number of test vectors only, while additional test vectors could reveal the flaw. This behavior becomes even more challenging with increasing circuit size and number of inputs, as it is almost impossible to check all input combinations of such a fault-protected cryptographic design using VerFI. As a consequence, this highlights the importance of an automated tool that does not rely on the simulation of test vectors but symbolically checks all possible cases under given fault models.
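The fundamental limitation of a single parity bit can be checked exhaustively even without the internal netlist of Figure 1: a parity check only catches output errors of odd Hamming weight. The sketch below models the fault effect abstractly as an error pattern XORed onto the S-box output; the helper names are illustrative.

```python
# PRESENT S-box [BKL+07] and a single-bit parity consistency check.
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]

def parity(v):
    return bin(v).count("1") & 1

def detected(x, error):
    """Does the check catch an `error` pattern XORed onto S(x)?"""
    predicted = parity(SBOX[x])            # parity bit estimated by S'
    return parity(SBOX[x] ^ error) != predicted

# Every single-bit output error is caught ...
assert all(detected(x, 1 << i) for x in range(16) for i in range(4))
# ... but a fault propagating to both output bits x and y (error 0b1100)
# is never caught, for any of the 2^4 inputs:
assert not any(detected(x, 0b1100) for x in range(16))
```

This is exactly why the independence property forbids a single cell from driving multiple output wires: it excludes even-weight error patterns arising from a single injected fault.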

Verification Concept
Before we present our verification approach in more detail, we introduce appropriate models for fault events and describe important terms required for diagnosing fault effects. The proposed verification approach covers the generation of circuit models, symbolic fault injection, and the corresponding fault diagnosis.

Fundamental Terminology
Formal verification of security requires formal descriptions and definitions of adversary models and security properties. More precisely, given capabilities and limitations of an adversary model, formal verification can prove security properties of any design under verification in the presence of the given adversary models. To this end, we briefly outline fundamentals of our basic fault injection models as follows.
Fault Events. Any accidental condition that results in a malfunction or misbehavior of a digital logic circuit is considered a fault event. However, while environmental faults have an erratic and random nature, adversarial faults are often precisely located and purposely injected into the circuit. Depending on their retention time, fault events can be classified as transient, persistent, or permanent. While transient fault events have a dynamic nature and vanish after certain periods or changes in the circuit, the elimination of persistent fault events requires a full reset of the circuit or system, whereas permanent faults are of static nature and remain permanently. However, when modeling fault events, considering only transient fault events is sufficient, as any persistent or permanent fault event can be modeled as a repetitive transient fault event.
Observing that digital logic circuits implement the computation of arbitrary Boolean functions f : F_2^i → F_2^o, any fault event in such a digital logic circuit can be precisely modeled by another Boolean function f' : F_2^i → F_2^o, as the authors of [RBSG21] proposed. More precisely, we model fault events at the structural level of logic circuits, assuming logic gates as atomic components, while the misbehavior of a single logic gate is considered as a fault event. As a consequence, changing the functionality of the misbehaving logic gate in the context of the entire circuit results in a clear specification of the faulty function f' that can be compared to the golden, i.e., fault-free, function f.

Fault Positions. Given our adversary model, as outlined in Section 2.4, the adversarial capabilities are mostly determined and limited by the number of fault events that can be purposely injected into a single evaluation of a digital logic circuit. More specifically, any fault injection might be limited and constrained in the spatial or temporal dimension. For the spatial dimension, we mostly distinguish between combinational and sequential logic gates that are considered as sources of transient fault events. Further, in addition to the spatial locations, each adversary is also limited in the number of fault injections, i.e., the number of fault events that can be caused simultaneously within the same clock cycle. For the temporal dimension, we distinguish between univariate and multivariate fault injections. In that sense, univariate fault injections only consider fault events occurring in the same clock cycle, while for multivariate fault injections, fault events can occur in different clock cycles.
As a result, the total number of possible fault events, ultimately describing and limiting the adversarial capabilities, is derived as the product of the spatial and temporal limitations. This means, given n fault events in the spatial dimension and v fault events in the temporal dimension, the total number of fault events injected into a single circuit evaluation is n × v. Further, depending on the adversary model and if necessary, the distribution of fault events can be adjusted, e.g., according to a uniform or biased distribution. However, whenever possible, we opt for an exhaustive fault verification, hence covering any possible fault distribution.
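For a bounded adversary, the n × v bound can be made concrete by enumerating fault combinations exhaustively; the sketch below is an illustrative enumeration using the standard library, not FIVER's internal strategy.

```python
from itertools import combinations

def fault_combinations(locations, cycles, n, v):
    """Yield all choices of v distinct clock cycles and n distinct gate
    locations, i.e., one candidate set of n x v simultaneous fault events."""
    for cycle_set in combinations(cycles, v):
        for loc_set in combinations(locations, n):
            yield [(c, l) for c in cycle_set for l in loc_set]

# Univariate (v = 1) double faults (n = 2) over three gates and two cycles:
combos = list(fault_combinations(["g1", "g2", "g3"], [0, 1], n=2, v=1))
```

Here there are C(2,1) · C(3,2) = 6 combinations of two fault events each; for a multivariate attacker (v > 1) the count grows multiplicatively, which motivates the complexity-reduction strategies mentioned in the introduction.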
Classifying Fault Effects. As indicated before, the diagnosis of fault events and effects requires knowledge of the expected behavior. As a consequence, comparing the faulty behavior to the expected behavior of a golden circuit allows us to evaluate and examine the effectiveness of a fault.
In the context of pure fault-detection countermeasures, fault handling is delegated and escalated to the system. More precisely, in such a context, the circuit under diagnosis might expose a misbehavior that is detected and clearly communicated as such to the system level. As a consequence, for fault-detection countermeasures, we usually distinguish between ineffective, detected, and effective faults. For this, each fault event that does not lead to an observable misbehavior is classified as ineffective, while all fault events that clearly lead to a misbehavior that is not detected by the circuit are marked as effective faults, leaving the remaining events in the class of detected fault events. In contrast to this, fault-correction countermeasures attempt to correct any detected misbehavior immediately such that only ineffective or effective faults can be observed.
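The classification above can be summarized in a few lines; a sketch assuming the countermeasure exposes a detection flag, with illustrative names throughout.

```python
def classify_fault(golden_output, faulty_output, alarm):
    """Classify one fault event for a detection-based countermeasure by
    comparing the faulty evaluation against the golden (fault-free) circuit."""
    if faulty_output == golden_output:
        return "ineffective"   # no observable misbehavior at the output
    if alarm:
        return "detected"      # misbehavior flagged to the system level
    return "effective"         # undetected misbehavior: a security flaw
```

For correction-based countermeasures, the detected class collapses: a successfully corrected output equals the golden one and is hence classified as ineffective, leaving only ineffective and effective faults.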

Verification Approach
In the following, we present our verification approach for fault injection countermeasures on hardware devices. More precisely, we explain how we use a Verilog gate-level netlist of a digital logic circuit to create an appropriate model. This model is used as a foundation to introduce techniques using BDDs to perform efficient evaluations of fault events. Eventually, we provide more insights into optimization strategies that allow supporting larger circuits.

Requirements for Cryptographic Fault Verification.
There are some simulation tools [NCP92,LH96,BN08] in the field of integrated circuit testing, also known as reliability analysis, that examine working-environment stress, e.g., thermal cycling and vibration, or potential manufacturing failures in a chip. However, they are not suitable for cryptographic fault analysis and verification of fault-protected implementations as they are commonly limited to single-bit faults. Moreover, often the user cannot set different fault models or specify desired locations for fault injection. It becomes even more challenging if the evaluated design incorporates dedicated countermeasures against fault attacks. In particular, for detection- and infection-based countermeasures, the design returns a fixed value or a random value completely unrelated to the secret key if a fault event was recognized. Hence, the evaluation tool must be able to anticipate the behavior of the design and the integrated countermeasure for correct evaluation. While all these features are supported by the recently-introduced fault-diagnostics tool VerFI [AWMN20], the result of such an evaluation is based on the given test vector(s). This may lead to a false-positive result. Namely, some faults may appear at the output only for certain input values and might not be detected by every test vector, like the example provided in Section 2.4. In this work, we mainly focus on cryptographic fault analysis that naturally covers every possible test vector, avoiding false positives.
Abstraction Levels. The definitions introduced in Section 2.3 allow us to introduce two abstraction levels: structural and functional. The structural level incorporates the edges and vertices of the DAG, i.e., the wires in the circuit C connecting the Boolean gates from G. In a verification environment, the structural level is used to define and distinguish different areas of the original circuit, e.g., the register stages or dedicated modules that should be considered in an analysis. Furthermore, the structural level gives us the opportunity to develop special optimization strategies, as we describe later in this section. However, the actual faults are injected at the functional level, directly in the combinational or sequential gates. At this level, we can precisely and generically cover several known fault models summarized in Section 2.4. Figure 2 depicts the verification approach which we follow in this work. As already mentioned above, we analyze hardware circuits based on their (Verilog) gate-level netlist. In the first step, the netlist is transformed into the circuit model introduced in Section 2.3 and therefore converted into a DAG D. The underlying data structure allows us to perform several preprocessing steps at the structural level, as follows.

From Netlist to Directed Acyclic Graph.
• First, each node d ∈ D is annotated with its gate type. This information is accessed by the function type (d), which returns one of the following values from G t :
G t = {in, out, not, buf, reg, and, nand, or, nor, xor, xnor}
• Second, dependencies between the existing nodes in D are identified. To be more precise, each node d ∈ D is equipped with its propagation path, i.e., with a list of nodes that are influenced by the output of d.
• Third, all nodes in D are separated into two classes depending on whether the corresponding logic gate g is from G r or from G c , i.e., g is either a sequential or a combinational gate, respectively. We access this information for a given node d by the function location (d).
• Fourth, the structural level is perfectly suited to extract topological characteristics of the underlying circuit. This includes the assignment of each node d ∈ D to its logic stage.
A single logic stage consists of all combinational gates between two successive register stages. Special cases are 1) the first logic stage, where the combinational gates are between the primary inputs and the first register stages, and 2) the last logic stage, where the combinational gates are between the last register stages and the circuit's primary outputs. We use the function stage (d) to refer to the logic stage of node d.
Symbolic Simulation using BDDs. The next step in our verification approach consists of mapping the Boolean function associated with each node d ∈ D to a BDD which includes the entire subgraph spanned by the node d. Therefore, the DAG is topologically sorted and each node d is evaluated starting from the primary inputs. For each primary input, i.e., type (d) = in, a new BDD variable is introduced. For all remaining nodes d ∈ D, a BDD is constructed from the fan-in BDDs, based on the Boolean function associated with the node d. As an example, let us assume that a node d is associated with an and-gate. Therefore, d has two input edges connected to two previous nodes which have already been evaluated (due to the topological sorting) and whose corresponding BDDs have been constructed. Then, the BDD for d is constructed by computing the logical and of both fan-in BDDs based on the concept given in Section 2.2.3. We decided to select BDDs as the underlying data structure to model a digital logic circuit since they offer many advantages. First, BDDs were originally proposed for defining, analyzing, testing, and implementing digital Very Large Scale Integration (VLSI) circuits [Ake78]. Therefore, they seem to be a natural choice for our application, i.e., verifying hardware countermeasures against fault injection attacks. Second, one core idea of BDDs is to work with symbolic simulations which inherently consider all possible states of the BDD variables.
Hence, the verification of a digital logic circuit is not limited to a predefined set of test vectors (inputs); rather, all valid inputs are tested and evaluated. This procedure avoids false positives as discussed in Section 2.4. Third, since executions of Boolean functions over BDDs are elementary operations (cf. Section 2.2), injecting faults by exchanging the associated Boolean function of the target node in the DAG D with a faulty one is a straightforward task that results in a simple re-computation of the corresponding BDD.

Symbolic Fault Injection. Our concept of symbolic fault injection is based on the very generic approach presented in [RBSG21], i.e., modeling faults by replacing the original Boolean operation f of a target gate g ∈ C with another Boolean operation f′ chosen from the same set of functions according to a predefined mapping τ.
In the context of our verification approach, when analyzing the effect of a fault injection in a target logic gate, the corresponding graph node d ∈ D is replaced with another graph node d′, according to a fault mapping τ. However, since each graph node is explicitly associated with a Boolean operation, the replacement of the graph node not only changes the structural description of the circuit node but also affects the functional behavior. More precisely, while d is associated with a Boolean operation f, the replaced graph node d′ is associated with a different Boolean operation f′, necessitating a re-generation of the BDDs of all subsequent graph nodes (affected by the fault injection). Note that for the remainder of this work, we denote the structurally modified DAG D′ as the faulty model, while we refer to the original, fault-free DAG D as the golden model.
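To make the symbolic simulation and the fault injection step concrete, the following sketch replaces BDDs with exhaustive truth tables (feasible only for toy circuits): every node is evaluated over all input assignments in topological order, and a fault is injected by swapping a node's Boolean operation according to a mapping τ. The netlist, gate names, and mapping are all illustrative, not FIVER's actual data structures.

```python
from itertools import product

# Boolean operations; "set"/"reset" model stuck-at faults.
GATES = {
    "and":   lambda a, b: a & b,
    "nand":  lambda a, b: 1 - (a & b),
    "xor":   lambda a, b: a ^ b,
    "not":   lambda a: 1 - a,
    "set":   lambda *_: 1,
    "reset": lambda *_: 0,
}

def evaluate(netlist, inputs, assignment):
    """Evaluate the DAG for one input assignment (the dict's insertion
    order is assumed to be a topological sorting)."""
    values = dict(zip(inputs, assignment))
    for node, (gtype, fanin) in netlist.items():
        if gtype != "in":
            values[node] = GATES[gtype](*(values[f] for f in fanin))
    return values

def symbolic_output(netlist, inputs, out):
    """Truth table of `out` over all 2^|inputs| assignments -- the
    brute-force stand-in for a BDD in this sketch."""
    return tuple(evaluate(netlist, inputs, a)[out]
                 for a in product((0, 1), repeat=len(inputs)))

# Golden model: out = (x AND y) XOR z
netlist = {"x": ("in", []), "y": ("in", []), "z": ("in", []),
           "g0": ("and", ["x", "y"]), "out": ("xor", ["g0", "z"])}
inputs = ["x", "y", "z"]
golden = symbolic_output(netlist, inputs, "out")

# Fault injection per tau(and) = {nand}: replace g0's operation, re-evaluate.
faulty_netlist = dict(netlist, g0=("nand", ["x", "y"]))
faulty = symbolic_output(faulty_netlist, inputs, "out")
differing = sum(g != f for g, f in zip(golden, faulty))  # inputs exposing the fault
```

Since nand is the complement of and, this particular fault flips the output for every input assignment, so `differing` equals 8 here; a less drastic fault would be exposed only by a subset of the inputs, which is exactly what the symbolic view captures.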
In fact, our verification approach is designed to analyze the golden model D under all possible fault events that can occur for a given fault model ζ(n, t, l). More specifically, the fault model ζ determines the fault verification process in terms of the number of graph nodes n that should be faulted, the fault mapping τ that is considered to replace the target nodes, and which circuit location l should be considered (i.e., combinational logic, sequential logic, or both). Therefore, in the first step, our verification approach creates a set of nodes Λ. Particularly, the nodes in Λ are extracted from the golden model D according to the considered location parameter l, such that Λ = { d ∈ D | location (d) ∈ l }. In a next step, the nodes in Λ are separated into s subsets θ i (s denotes the total number of logic stages) holding all nodes belonging to the same logic stage, i.e., each subset θ i is defined as θ i = { λ ∈ Λ | stage (λ) = i } for 0 ≤ i < s. The set of all valid gates λ ∈ Λ categorized by logic stage is denoted by Θ = {θ 0 , θ 1 , ..., θ s−1 }. In particular, such a categorization allows us to distinguish between univariate and multivariate fault injections, i.e., different fault injections with respect to the temporal dimension.
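The categorization into per-stage subsets can be sketched in a few lines, with nodes represented as (name, location, stage) triples; all names and the representation are hypothetical simplifications of the circuit model.

```python
def build_theta(nodes, loc, num_stages):
    """Lambda: nodes matching location parameter l ('comb', 'seq', or 'both');
    theta_i: the subset of Lambda belonging to logic stage i."""
    lam = [(name, stage) for (name, location, stage) in nodes
           if loc == "both" or location == loc]
    return [[name for (name, stage) in lam if stage == i]
            for i in range(num_stages)]

nodes = [("g0", "comb", 0), ("g1", "comb", 0),
         ("r0", "seq", 0), ("g2", "comb", 1)]
theta = build_theta(nodes, "comb", 2)   # -> [['g0', 'g1'], ['g2']]
```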
Besides considering the location parameter l and the separation in the temporal dimension to create valid sets of nodes that should be faulted, we further consider the number of fault events n that should be injected simultaneously in a single subset θ i . Therefore, we introduce the sets Γ i for 0 ≤ i < s, which hold all combinations of n nodes available in a single subset θ i , formally defined as Γ i = { γ ⊆ θ i | |γ| = n }. Note, however, that the cardinality of each Γ i , i.e., the number of valid combinations of nodes that need to be evaluated in each logic stage, drastically increases in |θ i | and n since |Γ i | = (|θ i | choose n). Next, given a valid set of target nodes γ ∈ Γ i , each node in γ = {d 0 , . . . , d n−1 } is associated with a Boolean operation f which is replaced by faulty operations according to the fault type t defined in ζ(n, t, l). In particular, the fault type t (e.g., bit-flip, set, or reset) is defined and described by a fault mapping τ (cf. [RBSG21]). For example, a fault mapping for the gate type and could be defined by τ : {and} → {set, reset, nand}. Hence, the number of different fault mappings that can occur for one γ depends on the cardinalities of the corresponding fault mappings and is determined by the product |τ (d 0 )| · . . . · |τ (d n−1 )|. Eventually, each of these valid combinations leads to a new faulty model D′ which should be compared to the golden model D to determine the effect of the injected fault (more details are provided in the subsequent paragraph).
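The combinatorics can be made concrete with Python's standard library; `tau_sizes` below stands in for the per-gate cardinality |τ(d)| and is purely illustrative.

```python
from itertools import combinations
from math import comb, prod

theta_i = ["g0", "g1", "g2", "g3", "g4"]  # nodes of one logic stage
n = 2                                      # simultaneous faults in this stage

# Gamma_i: every n-element combination of nodes within the stage.
gamma_i = list(combinations(theta_i, n))   # |Gamma_i| = C(|theta_i|, n)

# |tau(d)| per node, e.g. tau(and) = {set, reset, nand} gives 3 variants.
tau_sizes = {"g0": 3, "g1": 3, "g2": 2, "g3": 3, "g4": 1}

def num_fault_mappings(gamma):
    """Distinct faulty models for one combination: product of |tau(d)|."""
    return prod(tau_sizes[d] for d in gamma)
```

For example, `num_fault_mappings(("g0", "g2"))` yields 3 · 2 = 6 faulty models for that single pair of gates, which illustrates why the total number of fault events grows so quickly.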
As already mentioned above, our verification approach considers univariate and multivariate fault injections. For univariate fault injections, faults are injected in only a single set Γ i . In contrast, for multivariate fault injections, v different sets Γ i are selected, where v denotes the number of different logic stages that can be faulty at the same time (e.g., setting v = 2 would describe a bivariate fault injection). Note that in each logic stage (temporal dimension), n nodes can be faulted such that v × n nodes of the golden model D are affected. Therefore, analyzing multivariate fault injections drastically increases the number of node combinations that need to be evaluated. More precisely, each selection of v different sets creates a number of valid combinations equal to the product of the cardinalities |Γ i | of the selected sets.
Fault Diagnosis. The ultimate goal of our fault verification approach is the diagnosis of fault effectiveness and severity (cf. Figure 2). For this, given a golden model D and a faulty model D′, the effects of the fault injection in D resulting in D′ are evaluated by analyzing and comparing both models. Particularly, the output nodes of both models are combined into new BDDs (commonly by an exclusive-or, which we highlight in more detail in Section 4.2) in order to detect any differences in the outputs considering all valid assignments of the primary input variables. This strategy is especially well suited to the BDD data structure since counting the number of satisfying variable assignments, i.e., those leading to a logical 1, can be accomplished efficiently. Hence, determining and counting the satisfiability of the BDDs combining the outputs of the golden and faulty models directly yields the number of input combinations leading to a difference between both models.
More precisely, based on the analyzed fault-injection countermeasure incorporated in the design under test, the combined BDDs of the golden and faulty models are used to determine the number of effective, ineffective, and detected faults, as well as the total number of fault events as introduced in Section 3.1. Note, however, that the exact fault diagnosis procedure depends on the underlying countermeasure, which we discuss in more detail in Section 4.2.
Optimizations. As already indicated before, this verification approach poses some challenges with respect to the complexity when analyzing large circuits or when the number of fault injections n increases. Therefore, we further propose two optimization strategies which both rely on the structural analysis of the circuit model while being independent of the functional behavior of the circuit, i.e., the realized logical function.
The first strategy benefits from the identification of fault propagation paths, which are determined in the initialization phase. More precisely, the propagation paths are determined by a backward-iterating algorithm given a topological sorting of the DAG D. Hence, in a breadth-first search, the algorithm considers each node d ∈ D and adds the propagation paths of all nodes d i connected to the output edges of d along with the node d i itself. This procedure generates topologically sorted lists of propagation paths since the nodes from the deepest logic levels are added first. Then, assuming a target node λ ∈ Λ is faulted, i.e., its associated Boolean function is replaced, we can observe that not all BDDs associated with nodes d ∈ D need to be re-evaluated. In particular, evaluating only the nodes that are located on the fault propagation path is sufficient and reduces computational overhead, especially when injecting faults into gates in deeper logic levels.
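A sketch of this backward pass, assuming a fan-out map and a topological order of the DAG are available (both names are ours):

```python
def propagation_paths(topo_order, fanout):
    """For each node d, compute the set of nodes influenced by d's output.
    Walking the topological order backwards lets every node reuse the
    already-computed paths of its successors."""
    path = {d: set() for d in topo_order}
    for d in reversed(topo_order):
        for succ in fanout.get(d, ()):
            path[d] |= {succ} | path[succ]
    return path

# x feeds g0 and g1; g0 feeds g1.
fanout = {"x": ["g0", "g1"], "g0": ["g1"]}
paths = propagation_paths(["x", "g0", "g1"], fanout)
# Faulting g0 only requires re-evaluating paths["g0"] == {"g1"},
# not the whole model.
```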
The second optimization strategy reduces the number of nodes that are tested in the evaluation phase (cf. Figure 2). This is achieved by creating a subset Λ red ⊂ Λ which is used instead to generate the valid combinations of target nodes in Θ, using the following ideas and observations. First, registers form synchronization points in a digital logic circuit C (cf. Definition 4) and occurring fault events will eventually manifest in such register stages. Hence, it is straightforward that all nodes in Λ associated with a register in C also need to be included in Λ red . Second, all nodes d ∈ D associated with gates that are directly connected to registers (i.e., whose output edges are connected to a register input) are added to Λ red as well since they immediately influence the behavior of the synchronization points. Third, we observed that most digital logic circuits have sensitive gates directing faults from several locations through the circuit, eventually manifesting in registers. More precisely, we cluster gates into Boolean functions f : F_2^i → F_2 with i > 1, i.e., into Boolean functions providing only a single-bit output. The output gates of such clusters are the sensitive gates; fault propagation within such a cluster is local and always passes through its sensitive gate. Hence, fault injections in these clusters can be modeled by considering the sensitive gates only. Therefore, we add all nodes associated with sensitive gates to the reduced set of nodes Λ red .
Note that this approach selects clusters of gates in a conservative fashion, i.e., each gate with fan-out greater than one is treated as a sensitive gate regardless of the fact that its output signals may be re-combined by another gate such that they only influence a single wire. Additionally, analyses using the reduced set of nodes Λ red should only be performed in the bit-flip model, i.e., ζ(n, τ bf , l). Again, this conservative approach models a worst-case scenario and ensures that a fault event is definitely effective. Hence, both restrictions guarantee full coverage of all possible fault events that otherwise may occur in the non-reduced set Λ. Switching to the introduced circuit model D, nodes associated with registers and nodes directly connected to registers can be extracted from D in a straightforward way. All nodes d ∈ D associated with sensitive gates are identified by iterating over all nodes and extracting each node d with more than one output edge. All three steps are formally defined in Algorithm 1, describing the complete process of generating the reduced set of nodes Λ red . Here, line 6 adds all nodes associated with registers to the reduced subset Λ red , while line 11 considers the input nodes directly connected to the registers. Eventually, line 20 adds all nodes associated with sensitive gates to Λ red .

Algorithm 1: Complexity Reduction
Input : Golden circuit model D, set of valid fault locations (nodes) Λ
Output : Set of reduced fault locations Λ red
Finally, we visualize the determination of sensitive gates and corresponding clusters of gates by an exemplary circuit depicted in Figure 3. Altogether, the exemplary circuit consists of five clusters denoted by c 0 , ..., c 4 . While gates g 5 , g 6 , and g 7 influence the existing registers directly, they also represent sensitive gates since faults always propagate through them. Further, cluster c 4 consists of gates g 4 and g 2 , of which only g 4 is considered in a verification approach where Algorithm 1 is applied. To summarize, with the presented reduction approach it is enough to cover Λ red = {regs, g 1 , g 4 , g 5 , g 6 , g 7 } instead of Λ = {regs, g 0 , . . . , g 7 }.
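The three selection rules of Algorithm 1 can be paraphrased in a few lines. Here `fanout` and `is_register` are assumed inputs derived from the circuit model, and the rules below are our reading of the algorithm, not its literal code.

```python
def reduce_fault_locations(nodes, fanout, is_register):
    """Build Lambda_red: registers, gates driving registers, and
    sensitive gates (fan-out greater than one)."""
    reduced = set()
    for d in nodes:
        if is_register[d]:
            reduced.add(d)                            # rule 1: synchronization points
        elif any(is_register[s] for s in fanout[d]):
            reduced.add(d)                            # rule 2: feeds a register directly
        elif len(fanout[d]) > 1:
            reduced.add(d)                            # rule 3: sensitive gate
    return reduced

fanout = {"g0": ["g1"], "g1": ["g2", "g3"],
          "g2": ["r0"], "g3": ["r0"], "r0": []}
is_register = {d: d == "r0" for d in fanout}
reduced = reduce_fault_locations(list(fanout), fanout, is_register)
# g0 is dropped: a fault in g0 must pass through the sensitive gate g1.
```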

The Tool
In this section, we present our fault verification tool FIVER (Fault Injection VERification) which realizes the approaches and concepts from Section 3. For this, we briefly introduce the applied BDD library and explain the general tool flow. Finally, we present some optimization strategies to improve the overall performance of our tool.

Colorado University Decision Diagram (CUDD) Package
The Colorado University Decision Diagram (CUDD) package is a BDD library developed by Fabio Somenzi at the University of Colorado [Som18]. The library is written in C but provides an interface to C++, used by our tool. Besides a large set of BDD operations offered by CUDD, it provides a large assortment of variable reordering methods. These methods allow reordering BDD variables such that the size of the underlying BDD is optimized. This is especially beneficial when the size of the evaluated circuit increases.

Tool Flow
In this section, we introduce our fault verification tool in more detail. To start the analysis of a target design, a configuration file needs to be provided first. Afterwards, the internal tool chain is invoked and executed. At the end of the tool chain, an evaluation function is called in order to determine the number of effective, ineffective, and detected faults and to generate the final evaluation report.

Configuration.
Our fault verification tool uses a configuration file to specify and execute the desired analysis. This configuration file includes and sets parameters controlling the execution environment and host resources (e.g., Central Processing Unit (CPU) cores or memory) as well as the fault model parameters, including the number of fault injections n, the number of simultaneous fault injections v in temporal dimension (e.g., univariate, bivariate, etc.), the location parameter l, and whether the complexity reduction approach should be applied or not. Furthermore, the definition of the fault mapping τ needs to be provided allowing the software to consider custom fault mappings for evaluation and diagnosis. However, along with the software on GitHub 4 , we provide template definitions for common fault mappings (e.g., bit-flip, set, reset). Finally, a reference to a blacklist of entity names can be provided, excluding all matching modules from the fault injection process during the evaluation phase.
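The exact file format is defined by the templates shipped with FIVER on GitHub; the fragment below is only a hypothetical illustration of the parameters such a configuration must convey (every key name is invented here).

```
# --- hypothetical FIVER configuration sketch ---
cores      = 8          # CPU cores used for evaluation
memory_gb  = 8          # memory limit per thread
n          = 2          # max. number of simultaneous fault injections
variate    = 1          # v: 1 = univariate, 2 = bivariate, ...
location   = cs         # c = combinational, s = sequential, cs = both
reduction  = true       # apply the complexity reduction (Algorithm 1)
fault_map  = bit_flip   # tau template: bit_flip, set, reset, or custom
blacklist  = excluded_entities.txt
```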
Tool Chain. The tool chain is guided by the verification approach introduced in Figure 2. Hence, the Verilog netlist of the target design is parsed first. The outcome is an intermediate representation of the circuit containing the gate type, the list of input nodes, and additional annotations. Based on this intermediate representation, the DAG D of the golden model is generated. Besides, as this function already processes the intermediate representation, the annotations are used to identify the blacklisted entities. Afterwards, the CUDD library is used to process a topologically sorted representation of the DAG D and creates a BDD for each node d ∈ D based on the associated type. Further, if the configuration file enables the complexity reduction, Algorithm 1 is invoked, while the initialization phase is concluded by extracting all related graph properties including the number of logic stages, propagation paths, and nodes that need to be considered during the analysis. During the subsequent evaluation phase, a fault verification function is called by passing the fault model parameters, the list of valid nodes, the golden model, and a BDD manager required by the CUDD library. All together, this function handles most of the workload by iterating over four nested loops: over the number of fault injections n, over the distinction in the temporal dimension, over all valid nodes, and finally, on the lowest level, over the fault mappings defined in τ. Note that n only determines the upper bound for the number of simultaneous fault injections, i.e., fault injections smaller than n are considered in the analysis as well. This procedure is very common in the evaluation of countermeasures against fault injections and is done in the same way in related works [SMG16,RSBG20]. However, on the lowest level, the tool performs the actual fault injection by replacing the types of the target nodes, resulting in the faulty model D′.
Since most detection-based countermeasures provide an additional error flag indicating whether a fault was detected, this information can be used to distinguish effective and detected faults. Particularly, if the BDD E of the error flag produces a zero while B generates a one, a fault injection leads to an effective (undetected) fault. Consequently, in case E and B both evaluate to one, a fault was successfully detected by the design. As already introduced in Section 3, the BDD data structure naturally covers all combinations of the given BDD variables. The number of combinations leading to a true assignment in a given BDD can efficiently be determined by a function counting the minterms, which is also provided as part of the CUDD library. Hence, the number of effective faults is determined as countMinterms(B · ¬E) while the number of detected faults is obtained by countMinterms(B · E). Knowing the total number of fault events, the number of ineffective faults can easily be calculated by subtracting the number of effective and detected faults.
Note that if countermeasures without an error flag are to be analyzed, the evaluation function, i.e., the function that combines the output BDDs, needs to be adapted. However, this is easily possible without any deeper knowledge of the applied BDD library.
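A brute-force analogue of this minterm counting, with B and E given as truth tables over the primary inputs (one entry per assignment); the function names and representation are ours, not CUDD's.

```python
def diagnose(b_table, e_table):
    """Count fault classes from B (outputs of golden and faulty model
    differ) and E (error flag raised), one entry per input assignment.
    A stand-in for countMinterms on the corresponding BDDs."""
    effective = sum(b and not e for b, e in zip(b_table, e_table))
    detected = sum(b and e for b, e in zip(b_table, e_table))
    ineffective = len(b_table) - effective - detected
    return effective, detected, ineffective

# Two primary inputs -> four assignments:
B = [0, 1, 1, 0]   # the fault is visible for assignments 1 and 2
E = [0, 1, 0, 0]   # the error flag is raised only for assignment 1
# -> (1 effective, 1 detected, 2 ineffective)
```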
Report. As a final step, the tool reports all verification results in a text file. This includes a summary of the number of effective, ineffective, and detected faults, as well as the total number of fault scenarios that were tested 5 .
Besides, for each detected effective fault, a clear description of the fault is added to the report. More precisely, all faulted gates leading to effective faults are listed as well as the function used to model the fault injection. This allows the designer to accurately determine the cause of the effective fault event in order to fix the flaw in the evaluated countermeasure.

Optimizations
In Section 3.2, we already introduced two optimization strategies based on determining the propagation paths of all nodes in D and on the reduction of the number of target nodes that need to be faulted, which we called complexity reduction. Besides those two approaches, we applied further optimizations which are directly related to the tool.
Incremental Faulting. The first approach optimizes the application of replacing the Boolean functions defined in a given fault mapping τ. It is only effective for analyses with n > 1 and for nodes with |τ (d)| > 1, i.e., for nodes that are changed to more than one function modeling a fault injection. Therefore, let us consider a current state of the faulty model D′ where n nodes are faulted. The tool would step on to the next valid set of fault mappings. But instead of just starting from a new golden model, the tool computes the difference between the previously applied fault mappings and the new fault mappings. Hence, if, for example, only the fault mapping for one single node changes, only the type of this node is adapted (triggering a re-evaluation of related BDDs) and not all BDDs associated with the remaining n − 1 nodes need to be re-evaluated. This incremental faulting approach can notably reduce computational time since the number of evaluations is reduced and the same fault events are not performed multiple times.
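The bookkeeping behind this can be sketched as follows; the dictionaries are our simplification of the faulty model's node types, not FIVER's internals.

```python
def step_fault_mapping(model_types, prev_mapping, new_mapping):
    """Apply only the difference between two consecutive fault mappings;
    untouched nodes keep their already-injected faulty type, avoiding
    needless BDD re-evaluations."""
    changed = [d for d, t in new_mapping.items()
               if prev_mapping.get(d) != t]
    for d in changed:
        model_types[d] = new_mapping[d]  # triggers re-evaluation along d's path
    return changed

model_types = {"g0": "nand", "g1": "set"}   # state after the previous mapping
prev = {"g0": "nand", "g1": "set"}
new = {"g0": "nand", "g1": "reset"}
changed = step_fault_mapping(model_types, prev, new)  # only g1 is re-typed
```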
Resetting Faulty Model. The next optimization approach addresses the resetting of the faulty circuit. More precisely, after a set of valid nodes γ ∈ Γ i was faulted and analyzed, the tool proceeds with the next valid set γ′ ∈ Γ i . Therefore, the functions of the nodes in γ in the faulty model D′ need to be restored to the original functions defined in the golden model D. In a straightforward approach, the golden model D could simply be copied to the faulty model D′ such that D′ is fault free and the fault injections into the nodes defined in γ′ could be performed. However, this process can be very time-intensive, especially for larger models D and therefore for larger circuits C. Instead, we only change the types of the nodes defined in γ back to the original types from the golden model D. Even though this procedure triggers a re-evaluation of all BDDs placed in the propagation paths of the nodes in γ, it turns out that this process increases the performance notably.
Multithreading. Finally, we parallelized the execution of our tool by using OpenMP 6 . The given problem is perfectly suited for parallelization since each set of valid nodes in Γ i can be evaluated independently. Therefore, the loop that iterates over the sets defined in Γ i is parallelized into the number of threads set up in the configuration file.

Case Studies
In this section, we apply the tool proposed in Section 4 to various cryptographic hardware implementations. More precisely, we evaluated detection-based and correction-based countermeasures against fault injection attacks attached to the lightweight ciphers CRAFT and LED as well as the full block cipher Advanced Encryption Standard (AES). All designs were taken from [AMR + 20, SRM20] 7 while we unrolled the designs and only evaluated one or two rounds of the given circuit (for the sake of complexity). Further, to obtain the Verilog gate-level netlists, we used the Synopsys design compiler with version E-2010.12-SP2.
In the next two subsections, we first present the evaluation results of the considered case studies, before discussing the limitations of our tool with respect to the size of a given circuit and the applied fault models.

Evaluation Results
We start our experiments by evaluating the counterexample from Section 2.4. Due to the symbolic fault injection approach, our tool is able to detect the existing flaws in the design and reports the corresponding gates leading to the effective fault injections. We proceed with the lightweight cipher CRAFT [BLMR19] since it is built upon a simple structure leading to a small hardware footprint (i.e., a reasonable number of logic gates). As a next step, we decided to analyze LED-64 which is also a lightweight cipher but has a more complex structure [GPPR11]. Eventually, we challenge our tool by evaluating an AES-128 as it is roughly 14 times larger than the LED design. All results are summarized in Table 1, obtained from a system running Ubuntu 18.04.2 with an Intel Xeon E5-1660 CPU with 3.2 GHz and 128 GB RAM. For all upcoming results, we fixed the number of threads used by our tool to eight while each thread could use up to 8 GB RAM (this is sufficient for most analyses considered in this work). More details about the performance with respect to the number of used cores and amount of memory can be found in Appendix A in Figure 4 and Figure 5.
CRAFT. For CRAFT, we consider detection-based countermeasures for a single-round design (protected against 1-bit, 2-bit, and 3-bit fault injections), a two-round design (with the same protection levels), and a two-round design which is protected against multivariate 1-bit and 2-bit attacks. Additionally, we provide evaluation results for correction-based countermeasures for single-round designs protected against 1-bit and 2-bit fault injections. For the analysis of single-round designs protected by a detection-based countermeasure, we instantiate the fault model as ζ(n, τ bf , cs), i.e., we consider bit-flip faults in combinational and sequential gates. The number of injected faults n is adjusted to the countermeasure, meaning that it is set to the maximum protection level of the considered design. The evaluations for the 1-bit and 2-bit designs are executed within 0.021 s and 1.496 s, respectively. However, the evaluation of the 3-bit design is more challenging because more than 90 million combinations need to be tested. Without any complexity reduction, this evaluation takes roughly 50 min while the application of Algorithm 1 decreases the evaluation time to only six minutes. Note that all these analyses are performed under all input combinations for plaintext and key, i.e., 2 128 valid inputs.
To demonstrate the functionality of FIVER, we also analyzed a subset of the provided countermeasures with fault models instantiated such that they describe fault injections exceeding the protection capabilities of the countermeasures. The corresponding experiments are marked by red crosses in Table 1. As expected, the reports contain detailed lists of gates leading to effective fault injections.
The analysis of the two-round design gets more complex because each output depends on more primary inputs. While the BDD generation for the 1-bit and 2-bit designs could be accomplished by the CUDD library and an evaluation could be executed without any complications, the structure of the 3-bit protected design is too complex such that the parsing and BDD generation process fails.
Next, we analyze the two-round design protected against multivariate attacks which consists of two register stages and therefore three logic stages. The 1-bit protected design requires seven bits of redundancy, resulting in an increased number of target gates. Nevertheless, our tool can analyze the 130 million fault combinations in under 1 h and validates the security of the design. Again, applying the proposed approach to reduce the complexity (i.e., Algorithm 1) reduces the number of fault combinations to roughly 10 million while the simulation time is decreased to only 130 s.

LED.
In our next case study, we analyze a single-round design of LED-64 protected by detection-based countermeasures against 1-bit, 2-bit, and 3-bit fault injections. For all three designs, we selected ζ(n, τ bf , cs) as fault model to allow a fair comparison to the CRAFT case study. As for CRAFT, the 1-bit and 2-bit countermeasures can be analyzed in a few seconds, although the evaluation time increases compared to the analyses for the CRAFT design. Switching to the 3-bit protected design results in over 1.6 billion combinations that need to be tested. Nevertheless, our tool is able to perform this evaluation in under 5 h without any complexity reduction applied. Enabling the complexity reduction reduces the evaluation time roughly by a factor of 185. We also tried to analyze a two-round design of LED-64 but due to the increased dependencies of the outputs on the primary inputs, we are not able to parse the circuit into BDDs.

AES.
In our last case study, we analyzed AES-128 protected by detection-based countermeasures against 1-bit and 2-bit fault injections. While the analysis for the 1-bit protected design can easily be managed by our tool (in only 22.5 s), the 2-bit protected design is more challenging. Due to the enormous number of gates (over 34 000), the number of combinations drastically increases. Therefore, we are only able to analyze the design by applying Algorithm 1 to reduce the complexity. Even then, the evaluation takes roughly 5.5 d but, nevertheless, it is manageable by our tool.

Limitations
Given the results of the three case studies, we now identify limitations of our tool with respect to circuit sizes and fault models.
Circuit Size. In two cases (CRAFT two rounds, LED-64 two rounds), our tool is not able to parse the circuit into the proposed data structures, i.e., the construction of the BDDs does not terminate. These problems occur because the outputs of the given circuit, and therefore the BDDs of the output nodes in D, depend on too many inputs, i.e., the depth of the BDDs increases. More precisely, the two-round CRAFT design (equipped with 3-bit protection) consists of only 3 739 gates, which is clearly not the limiting factor since larger designs (e.g., CRAFT correction, AES) can be processed by our tool. This leads to the conclusion that the circuit structure (i.e., the realized Boolean function) rather than the circuit size prevents the parsing into BDDs. For the two-round LED-64 design, each output BDD depends on all 64 plaintext variables and on all 64 key variables. Hence, circuits with similar structures and dependencies are out of scope for our tool, which is, however, to be expected, since otherwise common block ciphers could probably be broken. More precisely, if our tool could successfully parse a two-round LED-64 design, parsing an entire unrolled implementation of LED-64 would probably also be possible, since the dependencies in the cipher would not increase further. Therefore, the whole cipher could be analyzed over all possible combinations of input variables, i.e., considering all valid plaintexts and keys.
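The observation that the realized Boolean function, not the gate count, determines BDD size can be illustrated with a minimal reduced ordered BDD built by Shannon expansion. This is a didactic sketch, not FIVER's actual data structure: the two 8-input functions (a parity chain with a compact BDD, and an equality check over two 4-bit halves under a blocked variable order, whose BDD must remember the entire first half) are hypothetical examples chosen to show the size gap.

```python
# Didactic sketch: reduced ordered BDD (ROBDD) construction via Shannon
# expansion, to show that BDD size is driven by the function's input
# dependencies, not by circuit size. Not the tool's actual implementation.

class BDD:
    def __init__(self, nvars):
        self.nvars = nvars
        self.unique = {}                  # (var, low, high) -> node id
        self.nodes = {0: None, 1: None}   # ids 0 and 1 are the terminals

    def mk(self, var, low, high):
        if low == high:                   # reduction rule: drop redundant test
            return low
        key = (var, low, high)
        if key not in self.unique:        # hash-consing: share isomorphic nodes
            nid = len(self.nodes)
            self.nodes[nid] = key
            self.unique[key] = nid
        return self.unique[key]

    def build(self, f, var=0, assignment=()):
        """Shannon-expand the callable f over the remaining variables."""
        if var == self.nvars:
            return int(bool(f(assignment)))
        lo = self.build(f, var + 1, assignment + (0,))
        hi = self.build(f, var + 1, assignment + (1,))
        return self.mk(var, lo, hi)

# Parity of 8 inputs: only two distinct subfunctions per level -> small BDD.
parity_bdd = BDD(8)
parity_bdd.build(lambda a: sum(a) % 2)

# Equality of two 4-bit halves under the blocked order a0..a3, b0..b3:
# the BDD must distinguish all 16 values of the first half -> larger BDD.
equality_bdd = BDD(8)
equality_bdd.build(lambda a: a[:4] == a[4:])

print(len(parity_bdd.unique), len(equality_bdd.unique))
```

Both functions have comparable "gate counts" when implemented, yet the equality BDD is several times larger than the parity BDD, mirroring why dependency structure, not size, blocked the two-round designs.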
Fault Model. Limitations with respect to the fault model naturally occur when the number of simultaneous fault injections n increases or when multivariate fault injections are analyzed. One such limitation appears for the multivariate 2-bit protected CRAFT design under the model ζ(2, τbf, cs) for v = 2. Evaluating this design without any complexity reduction would require testing more than 200 billion different fault combinations. For the given circuit size (i.e., 3 396 gates), this exceeds the capabilities of our tool. However, as already pointed out in Section 3, such a limitation naturally occurs due to the growth of the binomial coefficient. With the introduction of Algorithm 1, we can counteract this growth, but further improvement remains an open research challenge.
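The growth of the binomial coefficient can be made concrete with a back-of-the-envelope count. This is an illustrative model only: the exact counts in our evaluation additionally depend on the fault model ζ(n, τbf, cs), the number of analyzed clock cycles, and which gates qualify as fault locations.

```python
from math import comb

# Illustrative only: the number of ways to place n simultaneous faults
# among g candidate gates grows as the binomial coefficient C(g, n).
# g = 3 396 matches the gate count of the multivariate 2-bit protected
# CRAFT design; the paper's exact counts also depend on the fault model.

def fault_combinations(g: int, n: int) -> int:
    """Number of ways to pick n distinct fault locations among g gates."""
    return comb(g, n)

g = 3396
for n in range(1, 5):
    print(f"n = {n}: {fault_combinations(g, n):,} combinations")
```

Even this simplified count grows by roughly three orders of magnitude per additional simultaneous fault, which is why Algorithm 1's reduction of the candidate set is essential for larger n.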
Circuit Structure. As already indicated in Section 2, our tool is limited to unrolled digital logic circuits. This is mainly due to the underlying data structure of DAGs, which does not allow any loops in our model. Therefore, a designer of a countermeasure has to unroll a target design before it can be processed by our verification tool.
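The unrolling step can be sketched on a toy example. The circuit below is hypothetical (a 1-bit state register updated as s' = s XOR x with output o = s AND x) and merely illustrates the transformation: the register feedback loop is replaced by one copy of the combinational logic per clock cycle, yielding a loop-free structure that a DAG-based model can represent.

```python
# Hedged sketch of unrolling: a sequential circuit with a feedback register
# is replicated once per clock cycle, turning the loop into a feed-forward
# chain. The toy circuit (s' = s XOR x, o = s AND x) is a made-up example.

def step(state: int, x: int) -> tuple[int, int]:
    """One clock cycle of the toy sequential circuit."""
    out = state & x          # combinational output of this cycle
    next_state = state ^ x   # value latched into the register
    return out, next_state

def unroll(initial_state: int, inputs: list[int]) -> tuple[list[int], int]:
    """Instantiate `step` once per cycle; the register feedback becomes a
    purely feed-forward (combinational) chain over the unrolled copies."""
    outs, s = [], initial_state
    for x in inputs:
        o, s = step(s, x)
        outs.append(o)
    return outs, s

print(unroll(0, [1, 1, 0, 1]))
```

After unrolling, every wire in the chain is a node in an acyclic graph, so the circuit can be parsed into the BDD-based model described above.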
Iterative Block Ciphers. Although our tool is mostly limited to the analysis of single-round or two-round implementations, we do not see major obstacles with respect to the verification of common countermeasures and the corresponding assertions. Particularly, when considering countermeasures based on linear error codes [AMR+20, SRM20], the underlying scheme usually protects each round with the same mechanism. Hence, an evaluation of a single round (univariate) or two rounds (multivariate) would be sufficient to verify the correctness of a protection mechanism. Similarly, countermeasures that are based on duplication are often equipped with detection or majority-voting modules positioned at the end of a cipher execution. Again, these schemes could be seamlessly verified by our framework by focusing the analysis on the last round of the target scheme.

Conclusion
In this work, we present a framework to verify the security of countermeasures against fault-injection attacks designed for ICs. Given a Verilog gate-level netlist, our tool relies on BDDs to model the underlying Boolean function of the digital logic circuit and uses symbolic simulation to avoid false-positive results while covering all possible input combinations. Further, given the fault models under consideration, our framework automatically identifies potential fault locations and performs a full analysis under all given models. Since the evaluation complexity of digital logic circuits increases with circuit size, we propose various performance optimization strategies, ranging from algorithmic to programming-specific techniques.
Eventually, we conduct several case studies to demonstrate the application to real-world digital logic circuits implementing well-established countermeasures against fault-injection attacks. More precisely, we successfully analyze implementations of the lightweight ciphers CRAFT and LED as well as the widespread AES. In fact, our tool is able to analyze more than 90 million fault injections for a single round of CRAFT in under 50 min while still testing all 2^128 assignments of the primary inputs. Figure 4 shows the evaluation times for different numbers of cores used by our tool. The results were obtained for a single-round CRAFT design with four bits of redundancy under the fault model ζ(3, τbf, cs) and enabled complexity reduction. The memory limit for each CUDD manager was set to 8 GB.