Distribution of Signal to Noise Ratio and Application to Leakage Detection

In the context of side-channel attacks, the Signal to Noise Ratio (SNR) is a widely used metric for characterizing the information leaked by a device when handling sensitive variables. In this paper, we derive the probability density function (p.d.f.) of the SNR for the byte value and Hamming Weight (HW) models, when the number of traces per class is large and the target SNR is small. These findings are subsequently employed to establish an SNR threshold that guarantees minimal occurrences of false positives. These results are then used to derive the theoretical number of traces required to remain below pre-defined false negative and false positive rates. The sampling complexity of the T-test, ρ-test and SNR is evaluated for the byte value and HW leakage models by simulations and compared to the theoretical predictions. This makes it possible to establish the most pertinent strategy for using each of these detection techniques.


Introduction
Leakage detection consists in identifying the information leaked by a device when processing sensitive data [WO19]. This information can then be employed in a second phase for template [CRR03] or machine learning [CDP17] attacks. It is particularly relevant for selecting Points of Interest (POI) in a template attack [DS16]. In this paper, we focus on detecting information leakage and do not address its exploitation. Moreover, only univariate methods, which do not combine different samples from one trace, are considered. Leakage detection procedures decide between two hypotheses:
• H0: there is no evidence of leakage within the trace
• H1: there is some leakage
where a model is given to characterize the leakage of information and a metric is defined to decide between the two hypotheses.
Hamming Weight (HW) and byte value are two commonly used leakage models in the existing literature [MOP08]. The HW model assumes that the deterministic part of the leakage is linearly related to the HW of the processed data. For example, it is applied to model the signal when reading or writing data in memory via a communication bus [MOP08]. On the other hand, the byte value model assumes that the amount of leakage depends on the value of the processed byte and thus varies for every byte value. It is more generic than the HW model since it makes fewer assumptions.
The T-test [CdMG+13], χ²-test [MRSS18], ρ-test (Pearson's correlation coefficient) [DS16], Signal to Noise Ratio (SNR) [Man04] and Mutual Information (MI) [MOBW13] are the most popular techniques used to detect information leakage. These detection methods have different properties and usage conditions. While T-tests and χ²-tests indicate the presence or absence of leakage, they do not offer information on its exploitability. Leakage detection with MI is appealing because it is leakage-model-agnostic, meaning that it is robust to wrong a priori leakage model assumptions [VS09]. In addition, it is a good predictor of security against differential power analysis [SMY06] [MOS09]. However, it needs an accurate estimation of the signal's probability density function (p.d.f.), which is known to be a computationally intensive task [MOBW13]. This is the reason why this technique is not addressed in this paper.
The ρ-test detects a univariate linear dependency between a leakage model and the trace. The SNR is also a univariate test that detects dependencies located at the first-order statistical moment. Unlike the T-test and χ²-test, the SNR provides information about the exploitability of the leakage since it is linked to the mutual information through the capacity of the observation channel (C = (1/2) log₂(1 + SNR)) and the success rate [dCGRP19]. Similarly, the ρ-test is also informative because it is related to the SNR [Man04]. However, a mathematical leakage model must be defined before performing the computation. The SNR method is thus attractive because it is informative and does not need restrictive assumptions on the leakage model.
Once a metric is chosen, the evaluator must interpret the results in order to decide for hypothesis H0 or H1. Practically, the result is compared to a detection threshold γ in order to decide whether or not to reject the hypothesis H0. The value of γ is set according to a pre-determined false positive rate (also known as false alarm rate), α = ∫_γ^∞ p(x|H0) dx, where p(x|H0) is the probability density function (p.d.f.) of the decision metric under the hypothesis H0 [Poo94, Chapter 2]. However, an important performance criterion is not taken into account: the false negative rate β = ∫_{−∞}^γ p(x|H1) dx. A false negative (also known as missed detection) happens when the decision metric is smaller than γ while leakage is present (H1). One would like to minimize both α and β, but it is known from detection theory that a trade-off must be made. If γ is increased, α is reduced at the expense of β. Usually, the number of traces N is the parameter that allows both constraints to be satisfied. In fact, the variance of the metric is usually inversely proportional to N (e.g. the sample mean variance). If γ is fixed and N is increased, the p.d.f. p(x|H1) concentrates around its mean value and β decreases. In order to estimate the false negative rate, the probability distribution p(x|H1) must be known. In a nutshell, we are seeking, for each detection metric, a theoretical formula for p(x|H0) and p(x|H1).
Under the hypothesis H0, the p.d.f. is known for the T-test [CdMG+13], χ²-test [MRSS18], ρ-test [Man04] and SNR for the byte value model [BDGN14] [CK14]. Under the hypothesis H1, the p.d.f. is known for the ρ-test [Man04] and the T-test (with equal class sizes) [WO19]. Consequently, the p.d.f. of the SNR is unknown for the byte value model under hypothesis H1 and for the HW model under both hypotheses H0 and H1. The motivation of this article is thus to derive these formulas and exploit them.
The contributions of this paper may be summarized as follows: 1. First, we exhibit the p.d.f. of the SNR for the byte value and HW models, under both the H0 and H1 hypotheses. To do so, we make use of a Gaussian approximation that we mathematically and experimentally validate under a small SNR assumption. The proposed approximation yields a light and handy p.d.f. formulation which helps, for instance, to link in an exploitable way α, N and the number of classes. This is not possible with the F-distribution formula because it mixes all these parameters in a complex way and its inverse is not easily tractable.
2. Second, the obtained probability distributions are used to derive the theoretical number of traces required to remain simultaneously below the false positive and false negative rates α and β. This evaluation of the sampling complexity is made for the SNR and ρ-test methods. It is an extension of the work described in [WO19] for the T-test. 4. Finally, observing that the obtained results are in accordance with the theoretical predictions, we make use of such theoretical models in order to establish the most pertinent strategy for using each of the T-test, χ²-test, ρ-test and SNR-based detection techniques.

Paper organization
Section 2 provides a definition of the Signal to Noise Ratio (SNR) and an explanation of the methodology used to compute the various probability density functions (p.d.f.). In Section 3, we derive the p.d.f. of the signal variance S, the noise variance B and the SNR Z. In Section 4, these models are validated using samples obtained from both simulation and experiments conducted on traces from the ASCADv2 dataset. Subsequently, in Section 5, the optimal threshold γ, i.e. the one that minimizes the false alarm rate, is calculated for each metric. This threshold is then used to derive the theoretical sampling complexity of each detection test. These models are finally compared to simulation results and validated with traces from the ASCADv2 dataset.
Notations. The device processes a sensitive random data X resulting in a leakage T = f(X) + W, where f(.) is the leakage function and W ∼ N(0, σ²) an additive Gaussian noise. Typically, X is an 8-, 16- or 32-bit word. If f(X) is the Hamming weight function, 'HW8' denotes a test where X is a byte and 'HW32' one where X is a 32-bit word. While acquiring the n-th trace, the device handles the sensitive variable x_n, which is a realization of X. Traces are classified based on the values of the leakage function f(x_n). A class is defined by one of the possible values of f(x_n): all variables x_n having the same leakage f(x_n) belong to the same class. Given the assumption made on the leakage function f(.), there are K classes (e.g. K = 9 for the HW of a data byte). Also, as a notation abuse, we state that x_n belongs to class k if f(x_n) belongs to class k. Ω_k represents the set of indices of traces that belong to class k ∈ [0, K − 1].

SNR as a leakage detection method
A side-channel trace is a vector of non-invasive observations, such as power consumption or electromagnetic radiation, captured while processing a sensitive variable. For our theoretical work, we assume this vector contains only one element (univariate case) to simplify notations. When the device processes the sensitive random variable X_n belonging to class k, the leakage for the n-th trace is T_n = f_k + W_n, where f_k is the deterministic leakage of class k and W_n ∼ N(0, σ²) is an additive Gaussian noise. We define the following random variables: the mean of the signal for all traces belonging to the same class k, denoted M_k = E[T_{n∈Ω_k}], its variance S, and the noise variance B. The SNR is defined by [MOP08]: Z = S/B. Z, S and B are random variables. The true SNR is defined by θ = var(f_k)/σ². Moreover, the hypotheses H0 and H1 correspond to var(f_k) = 0 (no leakage) and var(f_k) > 0 (leakage present), respectively. In order to derive the p.d.f. of Z, the starting point of our work is the following property. If S and B are approximated by Gaussian random variables B ∼ N(µ_B, σ²_B) and S ∼ N(µ_S, σ²_S), then the p.d.f. of Z is given by Eq. 2 [Sim02], where Q(x) is the tail distribution function of the standard normal distribution, Q(x) = (1/√(2π)) ∫_x^∞ e^{−t²/2} dt (Eq. 3). Our goal is to utilize the theoretical formulation of p_Z to determine the detection threshold γ and the sampling complexity of the SNR method. To do so, we calculate the p.d.f. of S and B under hypotheses H0 and H1 and find their means and variances. This information is then used to derive the p.d.f. of Z.

Hypothesis H 1
We assume that N traces have been acquired, with |Ω_k| = N_k traces per class. For the sake of generality, this number is not assumed to be the same for all classes. Let also P(k) be the probability of drawing a variable belonging to class k. According to the leakage model defined in the previous section, M_k and S may be estimated in the following way: M̂_k = (1/N_k) Σ_{n∈Ω_k} T_n (Eq. 4) and Ŝ = Σ_k P(k) (M̂_k − M̂)², where M̂ = Σ_k P(k) M̂_k. Consider the variable E_k = M̂_k − M̂: it is decomposed into a sum of M̂_i's. Each M̂_i is a sum of i.i.d. Gaussian random variables and is thus also Gaussian. Moreover, the M̂_i's are independent since they belong to different sets of traces and classes (Ω_i ∩ Ω_j = ∅ for i ≠ j) and the noise variables W_n are independent. Consequently, E_k is a sum of independent Gaussian variables and thus follows a Gaussian distribution.
In addition, after some cumbersome computations, the covariance of E_k and E_j can be derived. We also note that N_k ≈ P(k)N, which simplifies the resulting expressions. The mean µ_Ŝ and variance σ²_Ŝ are then computed from those of the M̂_k and M̂. The result is given by Eq. 5.

Hypothesis H 0
When there is no leakage, it is possible to compute the probability density function of Ŝ without relying on the central limit theorem for large K. Proposition 2. When var(f_k) = 0, the p.d.f. of Ŝ is given by Eq. 6, with parameters defined in Eq. 7, where Γ(.) is the Gamma function.
Proof. Ŝ is the sum of K independent gamma random variables with unequal variances. The p.d.f. of such a sum is given by Eq. 6, with the parameters given in Eq. 7 [Mos85].

Analysis of B
B is estimated as follows: B̂ = (1/N) Σ_k Σ_{n∈Ω_k} (T_n − M̂_k)², where M̂_k is defined by Eq. 4. The analysis of B̂ is the same under hypothesis H1 or H0 because the leakage variable f_k is eliminated in the subtraction T_n − M̂_k. Applying the same reasoning as in the previous section, the variables T_n − M̂_k are Gaussian and asymptotically independent for large N_k. By application of the CLT, the per-class contribution V̂_k converges towards a Gaussian distribution. The V̂_k's are also independent because the T_n's are taken from different sets of traces and the noise variables W_n are independent. As a result, B̂ is a sum of K independent Gaussian random variables and is thus Gaussian itself: B̂ ∼ N(µ_B, σ²_B). After some computations, the mean µ_B and variance σ²_B are obtained; the result is given by Eq. 10.
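As an illustration, the estimators of S, B and Z described above can be sketched in a few lines of NumPy (a minimal sketch; the function name and interface are ours, not from the paper):

```python
import numpy as np

def snr_estimate(traces, labels, num_classes):
    """Estimate the signal variance S, noise variance B and SNR Z = S/B
    from univariate traces, using the per-class mean estimators above."""
    N = len(traces)
    class_means = np.zeros(num_classes)
    probs = np.zeros(num_classes)
    residual = 0.0
    for k in range(num_classes):
        t_k = traces[labels == k]
        probs[k] = len(t_k) / N              # P(k) approximated by N_k / N
        class_means[k] = t_k.mean()          # M_k: per-class mean
        residual += ((t_k - class_means[k]) ** 2).sum()
    grand_mean = probs @ class_means         # M: weighted grand mean
    S_hat = probs @ (class_means - grand_mean) ** 2   # between-class variance
    B_hat = residual / N                     # within-class (noise) variance
    return S_hat, B_hat, S_hat / B_hat
```

For traces generated with a known leakage and noise level, the returned Z approaches the true SNR θ = var(f_k)/σ² as N grows.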

Hypothesis H 1
When Ŝ and B̂ have Gaussian distributions, the p.d.f. of the SNR Ẑ = Ŝ/B̂ is defined by Eq. 2. This equation is challenging to apply in practical situations. We will demonstrate how it can be approximated by a Gaussian p.d.f. in our context (large K, large N and small SNR). Let us first focus on the first term I of Eq. 2, replacing µ_B and σ²_B by their expressions defined in Eq. 10. In practical situations, N is very large, which leads to a very high value of µ²_B/σ²_B and forces I towards zero. This first component can thus be assumed negligible.
Let us now focus on the last term A of Eq. 2. We define the change of variable z = (µ_Ŝ/µ_B)(1 + x) with x << 1 (small SNR region). With this modification, an expansion in O(x) is obtained, where O(x) is the conventional big O notation.
As stated just above, µ²_B/σ²_B is very large in practice. Moreover, the Q(x) function vanishes for large x. As a result, A ≈ 1 and this term can also be removed from Eq. 2. For low SNR values, a small-signal approximation of the remaining expression of p_Ẑ(z) shows that Ẑ approximately follows a Gaussian distribution, with mean µ_Ẑ and variance σ²_Ẑ defined by Eq. 11. Under hypothesis H1, this gives Eq. 12. Under hypothesis H0, one has to set θ = 0 in Eq. 12.

Hypothesis H 0
Under the hypothesis H0, the SNR is proportional to the F-score [CK14]: Ẑ = (K/N) F, where F follows an F-distribution with K − 1 and N − K degrees of freedom. We will now derive a Gaussian approximation. From Eq. 10, we observe that σ²_B is very small for large N. B̂ is thus almost deterministic and Ẑ is a scaled version of Ŝ. This observation remains valid under hypothesis H0 since the distribution of B̂ is unchanged: Ẑ ≈ Ŝ/µ_B. Consequently, the p.d.f. of Ẑ is approximated by p_Ẑ(z) ≈ µ_B p_Ŝ(µ_B z), where p_Ŝ is defined by Eq. 6.
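These characterizations can be checked numerically. The sketch below (our own Monte Carlo setup, with arbitrary parameter choices K = 256, N = 25600, σ² = 10) draws many SNR estimates and compares their empirical mean to K/N under H0 (the F-score scaling) and to θ + K/N under H1:

```python
import numpy as np

def snr_trials(theta, K, N, trials, sigma2=10.0, seed=0):
    """Monte Carlo draws of the SNR estimate for a synthetic byte-value-style
    leakage with true SNR theta (theta = 0 reproduces hypothesis H0)."""
    rng = np.random.default_rng(seed)
    f = np.arange(K, dtype=float)
    if theta > 0:
        # Scale f so that var(f_k) = theta * sigma2, i.e. true SNR = theta
        f = np.sqrt(theta * sigma2) * (f - f.mean()) / f.std()
    else:
        f = np.zeros(K)
    z = np.empty(trials)
    for t in range(trials):
        labels = rng.integers(0, K, N)
        traces = f[labels] + rng.normal(0.0, np.sqrt(sigma2), N)
        counts = np.maximum(np.bincount(labels, minlength=K), 1)
        means = np.bincount(labels, weights=traces, minlength=K) / counts
        probs = counts / N
        grand = probs @ means
        S = probs @ (means - grand) ** 2           # between-class variance
        B = ((traces - means[labels]) ** 2).sum() / N  # noise variance
        z[t] = S / B
    return z

K, N = 256, 25600
z0 = snr_trials(0.0, K, N, 200)      # H0: empirical mean close to K/N
z1 = snr_trials(0.1, K, N, 200)      # H1: empirical mean close to theta + K/N
print(z0.mean(), K / N)
print(z1.mean(), 0.1 + K / N)
```

The K/N offset visible in both cases is the estimation bias that Section 4.2 later removes when estimating θ from measurements.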
When the number of classes K is large (e.g. K = 256), this p.d.f. is well approximated by a Gaussian distribution whose mean and variance are given by Eq. 11, 5 and 10 with θ = 0.

Models validation

Validation with simulations
Simulations are performed to verify the Gaussian approximation of Ŝ, B̂ and Ẑ. Two leakage models are evaluated:
• Stochastic linear leakage model [SLP05]: f_k = Σ_{i=0}^{7} ε_i X_i, where X_i is a random bit ('0' or '1' with probability 1/2) and ε_i ∼ N(1, σ²_a). The vector (X_0, ..., X_7) is the binary decomposition of the class index k. In practice, the coefficients ε_i are normalized so that var(f_k) = 1. Simulations are performed with σ²_a = 0.2.
• HW model: there are K = H + 1 classes, where H is the number of bits used for the binary decomposition of the sensitive variable X. The value of K is significantly smaller than in the previous case; the Central Limit Theorem (CLT) may thus lead to an inaccurate model. N_k = 4·10⁶ P(k) traces are generated for each class, where P(k) = 2^{−H} C(H, k) and C(H, k) is the binomial coefficient.
Unless mentioned otherwise, all simulations are conducted with σ² = 10.
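The two simulated leakage models can be sketched as trace generators (a hedged sketch under the assumptions above; the function names and the normalization step are our own):

```python
import numpy as np

def gen_stochastic(n_traces, sigma2=10.0, sigma2_a=0.2, rng=None):
    """Byte-value traces under the stochastic linear model: each of the 8 bits
    of the class index leaks with a random coefficient eps_i ~ N(1, sigma2_a),
    normalized so that var(f_k) = 1 over the 256 classes."""
    rng = rng or np.random.default_rng()
    eps = rng.normal(1.0, np.sqrt(sigma2_a), 8)
    bits = ((np.arange(256)[:, None] >> np.arange(8)) & 1).astype(float)
    f = bits @ eps
    f = f / f.std()                          # normalize: var(f_k) = 1
    labels = rng.integers(0, 256, n_traces)
    return f[labels] + rng.normal(0.0, np.sqrt(sigma2), n_traces), labels

def gen_hw(n_traces, H=8, sigma2=10.0, rng=None):
    """HW traces: the leakage is the Hamming weight of an H-bit word, giving
    K = H + 1 binomially distributed classes."""
    rng = rng or np.random.default_rng()
    x = rng.integers(0, 2 ** H, n_traces, dtype=np.int64)
    hw = np.array([bin(int(v)).count("1") for v in x])  # Hamming weight labels
    return hw + rng.normal(0.0, np.sqrt(sigma2), n_traces), hw
```

With σ² = 10 and var(f_k) = 1, the stochastic generator produces traces with total variance around 11 and a true SNR of 0.1, matching the simulation setting described above.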

Stochastic linear leakage model
The probability density functions of Ŝ and B̂ are assessed using sample sets generated with the byte value linear model and compared to the Gaussian approximations described in Eq. 5 and Eq. 10. The outcomes are illustrated in Figure 1, showing a very good match, which validates the derived models. The p.d.f. of Ẑ is assessed using samples obtained from the byte value linear model. This evaluation is then compared to the probability density function given by Eq. 2 (referred to as "Model"), as well as to its Gaussian approximation with mean and variance described by Eq. 11 (referred to as "pdf-Gaussian approximation"). Results are presented in Figure 2 for a balanced-classes configuration (N_k = 1000 ∀k) and µ_Ŝ/µ_B = 0.1. We observe a good match between the simulations and the two p.d.f. models. This confirms the validity of the Gaussian approximation for low SNR.

HW leakage model
Figure 3 displays the p.d.f. of Ẑ for H = 8 and 32 when the HW leakage is present. The model accurately represents the true p.d.f. even for a small number of classes (K = 9). The Gaussian approximation slightly overestimates the p.d.f., but remains close to the simulation results. As such, it is also suitable for an HW-based classification. Figure 4 shows the p.d.f. of Ẑ for H = 8 in the absence of any leakage (f_k = µ_f). The p.d.f. of the model given by Eq. 6 matches the simulation results very well.

Validation with experimental results
In this section, we experimentally validate the p.d.f. models derived under hypotheses H0 and H1. The experiments use the ASCADv2 database [BPS+20]; more precisely, we use the "ascadv2-extracted" database available for download. The SNR measured with the byte value model for the first output of the masked sbox is presented in Figure 5. 500000 traces are used for the computation of the SNR. A distinguishable peak is observed at time index 5222. We assigned this time sample to hypothesis H1 and the time samples belonging to the interval [0, 4000] to hypothesis H0. Figure 6 displays the p.d.f. of Ẑ for the byte value and HW models computed at time index 1955. The Gaussian approximation outlined in Eq. 11 is precise for the byte value model, whereas the p.d.f. given by Eq. 6 closely matches the experimental results for the HW model. The p.d.f. of Ẑ for the byte value model is then evaluated at time index 5222. Since the true value of θ is unknown, we assume that the model of Eq. 11 is valid and estimate it by θ̂ = E[Ẑ] − K/N. This value is then inserted in Eq. 12 to compute µ_Ẑ and σ²_Ẑ. Figure 7 displays the p.d.f. of Ẑ for the byte value model; the Gaussian approximation outlined in Eq. 11 is precise.

Detection threshold and sampling complexity
The detector computes a decision variable D (e.g. Ẑ for the SNR and the correlation for the ρ-test) and must decide between two hypotheses: H0 or H1. When the p.d.f.s p(D|H0) and p(D|H1) are known, the Likelihood Ratio Test (LRT) [Poo94] is a conventional decision-making rule. It is based on the ratio p(D|H0)/p(D|H1). However, in our situation, this is not applicable because the LRT depends on the unknown variable we want to detect (e.g. θ for D = Ẑ). An alternative solution is to design a detector that rejects the hypothesis H0. This is the strategy already applied by the T-test and χ²-test. A threshold γ is pre-determined and the detector decides for hypothesis H0 when D < γ and for H1 otherwise. The performance is defined by the false positive (α) and false negative (β) rates [Poo94]: α = P(D > γ | H0) and β = P(D < γ | H1).
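This thresholding step can be sketched for a Gaussian decision metric under H0 (a toy N(0, 1) metric for illustration, not an actual SNR computation):

```python
import numpy as np
from statistics import NormalDist

def gaussian_threshold(mu0, sigma0, alpha):
    """Threshold gamma such that P(D > gamma | H0) = alpha when the decision
    metric D is N(mu0, sigma0^2) under H0."""
    return mu0 + sigma0 * NormalDist().inv_cdf(1.0 - alpha)

# Empirical check of the false positive rate on a toy Gaussian metric.
rng = np.random.default_rng(2)
alpha = 0.01
d = rng.normal(0.0, 1.0, 1_000_000)          # decision metric under H0
gamma = gaussian_threshold(0.0, 1.0, alpha)
fp_rate = (d > gamma).mean()
print(gamma, fp_rate)                        # fp_rate should be close to alpha
```

The same recipe applies to any metric whose H0 distribution is known; the whole point of the derivations above is to supply µ and σ for the SNR case.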

Byte value model
In the following, we denote by Ẑ0 (resp. Ẑ1) the value of Ẑ under hypothesis H0 (resp. H1). According to Section 3, Ẑ0 and Ẑ1 have Gaussian p.d.f.s for small SNR. Using the Gaussian approximation for Ẑ0, we get α = Q((γ − µ_Ẑ0)/σ_Ẑ0) (Eq. 15), where Q(x) is defined by Eq. 3. Similarly, β = Q((µ_Ẑ1 − γ)/σ_Ẑ1) (Eq. 16). The means and variances are obtained from Eq. 11 and 5. The threshold γ is set to maintain a false alarm rate smaller than α: using Eq. 15 with µ_Ẑ0 and σ_Ẑ0 given by Eq. 11, we obtain the expression of γ (Eq. 17). Note that γ is independent of the noise variance. We will now derive the relationship between the number of traces N, α, β and the SNR value θ when all classes have the same size (N_k = N/K). Inserting Eq. 17 in the definition of β in Eq. 16, and denoting a = Q^{-1}(α) and b = Q^{-1}(β), the insertion of the expressions of σ_Ẑ1 and σ_Ẑ0 yields a polynomial in the variable x = Nθ. Its root gives the sampling complexity (Eq. 18). The product Nθ is thus constant when a, b and K are fixed.

HW model
For an HW leakage model, the p.d.f. of Ẑ0 is no longer Gaussian. The rate α is obtained by integrating the tail of the distribution of Ŝ beyond the threshold, where p_Ŝ(x) is given by Eq. 6.
Assuming that µ_B ≈ σ² holds, we obtain an expression of α involving the upper incomplete gamma function g(s, x) = ∫_x^∞ t^{s−1} e^{−t} dt. From Eq. 7, we observe that λ is proportional to σ². Hence σ²/λ = N/(2 min_k(1 − P(k))) does not depend on the noise variance. Consequently, for a predefined α, the corresponding detection threshold γ is independent of σ² and can be pre-determined. Moreover, the variable σ²γ/λ = γN/(2 min_k(1 − P(k))) is constant for a fixed α. Consequently, the product γN = c_H is constant for a pre-determined α and a fixed value of H. Table 1 gives the values of c_H evaluated by simulations for different values of α and H = 8. The value γ = c_H/N is eventually inserted in Eq. 16, which results in a polynomial in the variable x = Nθ whose root gives the sampling complexity (Eq. 21).
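The constancy of γN = c_H can be illustrated with a small Monte Carlo experiment (our own sketch for the HW8 model under H0; the empirical (1 − α)-quantile of the simulated SNR plays the role of γ):

```python
import numpy as np

HW_TABLE = np.array([bin(v).count("1") for v in range(256)])

def snr_h0_samples(N, trials, sigma2=10.0, seed=3):
    """Empirical SNR values under H0 (pure noise) for the HW8 model (K = 9)."""
    rng = np.random.default_rng(seed)
    K = 9
    z = np.empty(trials)
    for t in range(trials):
        labels = HW_TABLE[rng.integers(0, 256, N)]
        traces = rng.normal(0.0, np.sqrt(sigma2), N)       # no leakage
        # Guard against an (extremely unlikely) empty class
        counts = np.maximum(np.bincount(labels, minlength=K), 1)
        means = np.bincount(labels, weights=traces, minlength=K) / counts
        probs = counts / N
        grand = probs @ means
        S = probs @ (means - grand) ** 2
        B = ((traces - means[labels]) ** 2).sum() / N
        z[t] = S / B
    return z

alpha = 0.05
c = {}
for N in (4000, 16000):
    gamma = np.quantile(snr_h0_samples(N, 1500), 1 - alpha)
    c[N] = gamma * N              # empirical c_H = gamma * N
print(c)                          # the two values should nearly coincide
```

Quadrupling N divides the empirical threshold by roughly four, so the product γN stays (almost) constant, as predicted.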

ρ test
Similarly to the previous section, we compute the sampling complexity of the ρ-test taking into account the false negative and false positive rates. To do so, we reuse some results and properties already presented in [Man04]. Let ρ̂ be Pearson's correlation coefficient computed from the acquired samples and µ_ρ be its mean. Fisher's Z transformation of ρ̂ follows a Gaussian distribution with variance 1/(N − 3). For low SNR and correlation values, the transformation is well approximated by the identity, which leads to a Gaussian approximation of ρ̂ itself. Denoting a_1 = Q^{-1}(α) and b_1 = Q^{-1}(β), the definitions of α and β give γ = a_1/√(N − 3), and an approximation of N is finally obtained (Eq. 23).
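The sampling complexity of the ρ-test can be sketched with the standard Fisher-z power computation (consistent with the Gaussian approximation above, but a generic statistics sketch, not a verbatim transcription of Eq. 23):

```python
from math import atanh, ceil
from statistics import NormalDist

def rho_test_sample_size(rho, alpha, beta):
    """Traces needed so the Fisher-z rho-test keeps a false positive rate
    alpha and a false negative rate beta for a true correlation rho, using
    atanh(rho_hat) ~ N(atanh(rho), 1/(N - 3))."""
    q = NormalDist().inv_cdf
    a1, b1 = q(1.0 - alpha), q(1.0 - beta)
    return ceil(((a1 + b1) / atanh(rho)) ** 2 + 3)

# A correlation of 0.05 with alpha = 1e-6 and beta = 1e-3:
print(rho_test_sample_size(0.05, 1e-6, 1e-3))
```

Since the correlation is related to the SNR (roughly ρ ≈ √θ for small θ [Man04]), a target SNR can be converted to a target correlation before calling this helper.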

T-test
When the number of traces is large, the T-test statistic approximately follows a Gaussian distribution N(0, 1) under H0 [DS16]. Consequently, the detection threshold is set to γ = Q^{-1}(α/2) for a two-sided test. When α = 10^{−6}, one finds the threshold value γ = 4.9, which is often found in the literature [CdMG+13].
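This threshold computation is a one-liner (assuming, as we do above, a two-sided test on the Gaussian-distributed statistic, which reproduces the 4.9 value):

```python
from statistics import NormalDist

def ttest_threshold(alpha):
    """Two-sided detection threshold for a T-statistic that is N(0, 1)
    under H0, i.e. the threshold applied to |t|."""
    return NormalDist().inv_cdf(1.0 - alpha / 2)

print(ttest_threshold(1e-6))   # about 4.9, the value quoted in the literature
```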
The sampling complexity of the T-test has been derived and studied in [WO19]. Using a Gaussian approximation of the T-test variable, N is derived when the number of traces is equally partitioned between the two classes (Eq. 24), where b_2 = Q^{-1}(β).

Sampling complexity
The impact of the trace length L on the leakage detection is analyzed in [DZD+18] and [WO19]. In a multivariate setting, the overall false positive rate α_T is related to its univariate counterpart α by α_T = 1 − (1 − α)^L if the L detection tests are independent. The value assigned to α is thus set in order to limit the false positives over the entire trace: α = 1 − (1 − α_T)^{1/L}. The value of β is set similarly. There is, however, a difference due to the small number of leakage points in the trace, which reduces the practical value of L.
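The per-sample rate is obtained by inverting this relation (a small helper with arbitrary example values for α_T and L):

```python
def per_sample_alpha(alpha_total, L):
    """Per-sample false positive rate keeping the trace-wide rate at
    alpha_total over L independent univariate tests:
    alpha = 1 - (1 - alpha_total)^(1/L)."""
    return 1.0 - (1.0 - alpha_total) ** (1.0 / L)

alpha_T, L = 1e-3, 10_000
alpha = per_sample_alpha(alpha_T, L)
print(alpha)    # roughly alpha_T / L for small rates
```

For small rates this is close to the Bonferroni-style division α_T/L, but the exact inversion remains valid for larger α_T.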
The sampling complexities of the T-test and ρ-test are compared in [DS16] for the HW model under hypothesis H1. For a fixed SNR, the average value of the decision metric is computed as a function of the number of traces. The authors conclude that the T-test needs fewer traces than the ρ-test to detect leakage. However, this comparison does not provide information regarding the exploitable leakage samples. In this section, we extend this work by adding the SNR method to the comparison and by also considering a stochastic leakage model in addition to the HW model. The comparison is also implemented differently: the detection threshold γ is set for a false positive rate α = 10^{−6}, and we then evaluate the number of traces required to provide a false negative rate below a pre-determined value β = 10^{−3}.
The sampling complexity of the T-test, χ²-test, ρ-test and SNR is now evaluated. For the T-test and χ²-test, the evaluation is made with the fixed-versus-random option. The fixed class is built with byte X = 0 and the number of traces is equally partitioned between the two classes. The χ²-test is implemented as described in the original paper [MRSS18]. Since the number of columns of the contingency matrix may vary from one draw to another, the detection threshold is not constant. Consequently, the p-value is computed and a false negative is declared if the p-value is larger than 10^{−6}. The estimation of the stochastic leakage model described in Section 4.1 is also considered for the ρ-test. Instead of estimating each coefficient ε_i, the leakage f_k is directly targeted. In order to optimize the number of traces used to estimate the f_k's and compute the ρ-test value, a cross-validation technique is applied. The N traces are split into N_c sets. The first (N_c − 1)N/N_c traces are used to estimate the parameters f_k: traces are classified according to the target value (0 to 255) and f_k is estimated by averaging the traces belonging to the same class. The N/N_c traces of the last remaining set are used to compute their contribution to the ρ-test value. This operation is then repeated by shifting by one set at a time. In the end, all the traces are used for the estimation and the computation of the ρ-test, but with disjoint sets.
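The cross-validation procedure can be sketched as follows (our own compact implementation of the described scheme, not the authors' code; here the held-out predictions are pooled into a single correlation rather than accumulated fold by fold):

```python
import numpy as np

def cv_rho_test(traces, labels, n_classes=256, n_folds=10, seed=4):
    """Cross-validated rho-test with an estimated leakage model: the model
    f_k is the per-class mean learned on the training folds, and the
    correlation is evaluated on the held-out predictions."""
    rng = np.random.default_rng(seed)
    N = len(traces)
    folds = np.array_split(rng.permutation(N), n_folds)
    preds = np.empty(N)
    for i, test_idx in enumerate(folds):
        train_idx = np.concatenate([folds[j] for j in range(n_folds) if j != i])
        counts = np.maximum(
            np.bincount(labels[train_idx], minlength=n_classes), 1)
        sums = np.bincount(labels[train_idx], weights=traces[train_idx],
                           minlength=n_classes)
        preds[test_idx] = (sums / counts)[labels[test_idx]]  # estimated f_k
    return np.corrcoef(preds, traces)[0, 1]
```

Because every prediction comes from a model trained on disjoint traces, a non-leaking sample yields a correlation close to zero instead of the optimistic bias a non-cross-validated fit would produce.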
The number of traces N evaluated by simulations is compared to the theoretical value predicted by Eq. 18, 21, 23 or 24. Figure 8 shows the value of N as a function of the true SNR for the T-test, ρ-test, χ²-test and SNR method for the stochastic linear leakage model. The T-test is the most efficient technique, followed by the ρ-test when the leakage model is perfectly estimated, the χ²-test and the SNR method. The theoretical models are very close to the simulation results and even overlap for the T-test, which validates them. The sampling complexity of the SNR is higher because it requires a minimum number of traces for each of the 256 classes, whereas the T-test considers only two classes. When the leakage model is estimated for the ρ-test, the number of traces increases significantly and deviates from the value predicted by the theoretical model for a perfect estimation. In addition, N decreases when the number of cross-validation sets N_c increases; it converges to the same level as the SNR. The χ²-test is not the most appropriate solution because it is not informative about the exploitability of the leakage and is less efficient than the T-test for a binary leakage detection test.
Figure 9 shows the value of N as a function of the SNR for the T-test, ρ-test and SNR method for the HW leakage model. Once again, the T-test is the most efficient technique, followed by the ρ-test and the SNR method. The theoretical model for the SNR is accurate for the HW32 leakage model. However, it is not tight for the HW8 model at very small SNR. This is probably due to the Gaussian approximation of Ŝ in Section 3.1.1, which is not fully valid for HW8: the number of classes is not large enough to apply the CLT. The error between the model and the simulations may increase with σ², explaining the gap observed at an SNR below −12 dB.
The validity of the theoretical models is then evaluated on real traces from the ASCADv2 database. We reuse the same set of traces and methodology as in Section 4.2; the results are summarized in Table 2, which presents the theoretical value of N for the T-test, the SNR (byte value and HW8 models) and the ρ-test (byte value and HW8 models), together with the experimental measure and the error ratio. The value provided as "theory" for the ρ-test with the byte value model (N_c = 6 or 10) is a simulation result obtained for θ = 0.044. This value is the experimental SNR measured on the ASCADv2 dataset and corresponds to a noise variance σ² = 45.5. The theoretical sampling complexity is close to the experimental results for the SNR with the byte value model, the T-test and the ρ-test. The error is larger for the SNR with the HW8 model. This may come from the relative inadequacy of the theoretical model for the HW8 leakage, as explained in the paragraph just above.

Recommendations for an evaluator
The results presented in Section 5.4 allow us to draw some recommendations for the best usage of each method:
• If the evaluator is only interested in a YES/NO answer concerning the presence or absence of leakage, he/she should use the T-test.
• If the evaluator is interested in the exploitability of the leakage, he/she should use the ρ-test method to test an HW leakage model and the SNR for the byte value model.
• The SNR method is not appropriate to test an HW leakage model.
In addition, for fixed values of α and β, two strategies are offered to the evaluator:
• The number of traces is limited to N_max: the minimum detectable SNR value is then θ_0 = x_0/N_max, where x_0 is given by Eq. 18, 21, 23 or 24, depending on the selected detection method.
• The evaluator is only interested in detecting leakages whose SNR is greater than θ_0: the minimum number of traces that must be acquired is then N_min = x_0/θ_0.
If none of the previous procedures succeeds and the evaluator knows that a sensitive variable is processed during the trace acquisition, the remaining solution is to implement a leakage detection test based on MI [MOBW13] [CLM20].

Conclusion
The SNR is a widely used metric for assessing the information leakage of a device. We derived the theoretical formulation of its p.d.f., whether leakage is present or not, under a small SNR assumption. Our study covers the byte value and the HW leakage models. These p.d.f. formulations were validated through simulations and experiments on a set of traces taken from the ASCADv2 dataset. They are used to set a detection threshold that rejects false positives from the SNR measurements. These p.d.f. formulations are also used to derive the theoretical number of traces required to remain below pre-determined false negative and false positive rates. The sampling complexity of the T-test, ρ-test and SNR has eventually been derived and compared for the byte value and HW leakage models. The T-test is the most efficient technique for the two leakage models. The ρ-test is more efficient than the SNR method under the HW leakage model, but the two techniques perform equally under the byte value model. In fact, the leakage model must be estimated from the traces in order to implement the ρ-test with the byte value model. Unfortunately, the T-test does not provide any information about the exploitability of the detected leakage. Consequently, the SNR method is the most appropriate method when the evaluator is interested in the exploitability of the leakage and has no prior information about the leakage model. In that case, the SNR shall be computed with the byte value model. When the evaluator wants to test the validity of the HW leakage model, the ρ-test is the most appropriate solution.
Figure 2 also illustrates the p.d.f. of Ẑ and the distribution proportional to the F-score under hypothesis H0 (f_k = µ_f) for the byte value model. The Gaussian approximation closely matches both the F-score distribution and the simulation results.

Figure 5: SNR measured with the byte value model.

Figure 7: SNR measured with the byte value model - Hypothesis H1.
3. Third, we compare the T-test, χ²-test, ρ-test and SNR-based detection methods in terms of sampling complexity by means of simulations and experimental results.

Table 1: Value of c_H for different false positive rates (H = 8).