# UCLouvain



Reducing a Masked Implementation's Effective Security Order with Setup Manipulations And an Explanation Based on Externally-Amplified Couplings

<u>Itamar Levi</u>, Davide Bellizia and François-Xavier Standaert Aug. 2018

### Motivation

Masking: a well-understood SCA countermeasure

- Split sensitive variables into *d* shares.
- Compute on those shares only.

Independence assumption: the leakages induced by the shares are independent, and

- they are merged linearly...

It forces the adversary to estimate a higher-order statistical moment of the leakage

- data complexity grows exponentially with *d* → masking amplifies the noise in the leakages

The lowest key-dependent statistical moment defines the security order

Concretely though, it is hard to achieve...




### Motivation

#### Well understood non-idealities:

- Glitches
- Memory transitions

Can recombine leakages (nonlinear manner)

Can be kept under control at design (synthesis) time:

- Threshold Implementations (TIs) non-completeness [NRS11]
- Transition-based leakages can be mitigated by doubling the number of shares [BGG+14] / adding registers or refreshing [CGP+12]

=> *logical recombination*, since they can be formulated as logical conditions which can then be verified and prevented [FGP+18] => recalling yesterday's *Session 6*.





#### This talk: another physical default, *couplings*, recently reported by De Cnudde et al.

• Electrical dependency between the shares (e.g. capacitive, resistive)








- What are couplings
- What do we know of them
- How to **externally** amplify them
- Different test cases (SW/HW)
- Moving from detection to exploitation
- Discussion / how to advance





# What are couplings

- Electrical
  - Capacitive
  - Resistive
  - Inductive (less local)
  - Memristive / resistive-RAM (consider new devices: MRAM, RRAM, etc.)
- Affected by
  - Capacitive proximity
  - Resistive power-grid / proximity
  - All technology parameters
  - Periodicity (L, RC)
- What can we control?
  - Depends on the device (SW/FPGA/ASIC...), but
  - mainly on the power grid and proximity








In theory


In practice: not so linear and not so nice...

### What do we know of them, in the context of SCA

- <u>De Cnudde et al. [CBG+17, CEM18]</u> put forward that even when implemented correctly (with respect to glitches and transitions), masking can suffer from recombinations.
  - Tweaking the shares' proximity (placement and routing)
  - Iterating/parallelizing the shares to increase their signal/recombination
- Typically not something an <u>adversary</u> can do (and designers will aim to prevent it)
- Practically:
  - The amplitude of these lower-order leakages was usually lower than that of the *d*<sup>th</sup>-order leakages [CBG+17]
  - They were evaluated by detection tests (T-tests)
- Is there a real threat without <u>internal</u> amplification?




• A simple example (<u>resistive</u> couplings):












- A simple example:
  - Devices in linear mode..

$$I' = \alpha_1 I_{Sh1} + \alpha_2 I_{Sh2} - \beta (I_{Sh1} \cdot I_{Sh2})$$

- First order approx.
- No capacitive effects



$$\alpha_{i} = \frac{1}{\left(1 + \frac{2R_{ext}}{R_{on_{i}}}\right)} \approx 1$$

$$\beta = \frac{R_{ext}}{V_{DD,ext}} \left[\frac{R_{on1}}{2R_{ext} + R_{on1}} + \frac{R_{on2}}{2R_{ext} + R_{on2}}\right]_{R_{ext} \ll R_{on1}, R_{on2}} \cong \frac{2R_{ext}}{V_{DD,ext}}$$
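To get a feel for the magnitudes in the expressions above, the following minimal Python sketch evaluates $\alpha_i$, the exact $\beta$ and its small-$R_{ext}$ approximation. The component values (10 kΩ on-resistances, supply around 1.2 V) and all function names are illustrative assumptions, not values or code from the talk.

```python
def alpha(r_ext, r_on):
    """Linear gain of one share's current: 1 / (1 + 2*R_ext/R_on)."""
    return 1.0 / (1.0 + 2.0 * r_ext / r_on)


def beta(r_ext, v_dd_ext, r_on1, r_on2):
    """Exact coupling coefficient of the first-order resistive model."""
    return (r_ext / v_dd_ext) * (
        r_on1 / (2.0 * r_ext + r_on1) + r_on2 / (2.0 * r_ext + r_on2)
    )


def beta_approx(r_ext, v_dd_ext):
    """Approximation 2*R_ext/V_DD,ext, valid when R_ext << R_on1, R_on2."""
    return 2.0 * r_ext / v_dd_ext


def supply_current(i_sh1, i_sh2, r_ext, v_dd_ext, r_on1, r_on2):
    """I' = alpha_1*I_Sh1 + alpha_2*I_Sh2 - beta*(I_Sh1*I_Sh2)."""
    return (alpha(r_ext, r_on1) * i_sh1
            + alpha(r_ext, r_on2) * i_sh2
            - beta(r_ext, v_dd_ext, r_on1, r_on2) * i_sh1 * i_sh2)


if __name__ == "__main__":
    R_ON = 10_000.0  # hypothetical on-resistance of each share's logic, in ohm
    for r_ext, v_dd in [(0.0, 1.2), (10.0, 1.2), (20.0, 1.2), (20.0, 1.0)]:
        exact = beta(r_ext, v_dd, R_ON, R_ON)
        approx = beta_approx(r_ext, v_dd)
        print(f"R_ext={r_ext:5.1f} ohm  V_DD={v_dd:.2f} V  "
              f"beta={exact:.4f}  approx={approx:.4f}")
```

In this model the cross term vanishes for $R_{ext} = 0$ and grows when $R_{ext}$ is raised or $V_{DD,ext}$ is lowered, which is the knob an external setup manipulation would turn.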


- But, lowering V<sub>DD</sub> has a *negative effect*...
  - Reduces the signal (typically, SNR  $\downarrow$ )
  - At some point the device will not work

- So, increase *R*<sub>ext</sub> instead?
  - Too much, and the device will not work
  - We might need to simultaneously increase $V_{DD}$
  - As $R_{ext}$ increases, the noise increases

- No trivial answer to what is the worst-case scenario,
  - Depends on the device, the noise, power regulator (if any).


• The exploration space for a certification lab is huge ...

The simplified model can be generalized (*d*):

$$I'_{\text{supply}} \approx \underbrace{\sum_{i} I_{i}}_{2^{nd}\text{-order}} - \frac{R_{ext}}{V_{DD,ext}} \cdot \underbrace{\sum_{i} \sum_{j} I_{j} I'_{i}}_{1^{st}\text{ order}} + \dots + \text{higher powers}$$

Going from 2 to *d* shares: $2^{nd}$-order → *d*, $1^{st}$ order → *d*/2, higher powers → ???

- But,
  - Expected: leakage at all stat.-moments/powers (solve MAXWELL ...) → modeling is hard
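As a rough illustration of the generalized model above, the sketch below enumerates all sharings of a secret bit and reports the lowest statistical order at which the noise-free leakage depends on the secret. It rests on strong simplifying assumptions that are not from the talk: one-bit shares whose individual leakage equals the share value, a single coefficient `beta` applied to every pair of distinct shares, and no higher-power terms. Exact rational arithmetic avoids floating-point equality issues.

```python
from fractions import Fraction
from itertools import product


def leakage(shares, beta):
    """Noise-free toy model: sum_i I_i - beta * sum_{i<j} I_i * I_j."""
    n = len(shares)
    linear = sum(shares)
    pairs = sum(shares[i] * shares[j]
                for i in range(n) for j in range(i + 1, n))
    return linear - beta * pairs


def conditional_moment(d, beta, secret, order):
    """Exact raw moment E[l^order | secret] over all d-share sharings of a bit."""
    values = [leakage(s, beta) ** order
              for s in product((0, 1), repeat=d)
              if sum(s) % 2 == secret]
    return sum(values, Fraction(0)) / len(values)


def lowest_key_dependent_moment(d, beta):
    """Smallest order in 1..d whose conditional moments differ, else None."""
    for order in range(1, d + 1):
        m0 = conditional_moment(d, beta, 0, order)
        m1 = conditional_moment(d, beta, 1, order)
        if m0 != m1:
            return order
    return None


if __name__ == "__main__":
    for d in (2, 3, 4):
        for beta in (Fraction(0), Fraction(1, 2)):
            print(d, beta, lowest_key_dependent_moment(d, beta))
```

Under these assumptions the order stays at *d* when `beta` is zero and drops once the pairwise term is present; how this carries over to real devices is exactly the open question.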

- So our goals were:
  - To examine whether setup manipulations can reduce the *effective security order*
  - Our explanation is based on these *externally amplified couplings*
- The approach we use:
  - To try and falsify
  - To understand if the amplitudes of lower-order leakages can be made significant with amplification

### How to evaluate?



Moving on from a:

- "detection"-based approach (T-test)
  - $T_{value} = (\mu_{Set_0} - \mu_{Set_1}) / \sqrt{\sigma_{Set_0}^2 / |Set_0| + \sigma_{Set_1}^2 / |Set_1|}$
  - Hard to connect with the actual success rate (SR)
- to actual exploitation (MCP-DPA):
  - $\tilde{k} = \underset{k^*}{\operatorname{arg\,max}} \; \hat{\rho}(\hat{M}^d_{x,k^*}, (l^t_{x,k})^d)$
  - Profiling moments (for d=2 use central moments (CM), for d>2 use standardized moments (SM)..)
    - Gives us the ability to check the contribution of different statistical orders
  - The asymptotic value gives an estimation of the informativeness / SR / #samples required [MS16]
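The exploitation step can be sketched on simulated traces. The Python below is a simplified moments-correlating attack for d = 1 and d = 2 only, written under several assumptions that are not from the talk: a 4-bit target using the PRESENT S-box, Hamming-weight leakage of two shares with a coupling factor `beta`, Gaussian noise, profiling per intermediate value with a known key, and centering of the attack samples by the profiled class mean. Standardized moments for d > 2 are not implemented.

```python
import random

# 4-bit PRESENT S-box, used only as a small stand-in target (assumption).
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]


def hw(v):
    return bin(v).count("1")


def acquire(n, key, beta, sigma, rng):
    """Simulate n (plaintext, leakage) pairs of a 2-share S-box output."""
    traces = []
    for _ in range(n):
        x = rng.randrange(16)
        y = SBOX[x ^ key]
        m = rng.randrange(16)              # fresh uniform mask
        i1, i2 = hw(y ^ m), hw(m)          # per-share leakages
        traces.append((x, i1 + i2 - beta * i1 * i2 + rng.gauss(0.0, sigma)))
    return traces


def pearson(u, v):
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    var_u = sum((a - mu) ** 2 for a in u)
    var_v = sum((b - mv) ** 2 for b in v)
    return cov / (var_u * var_v) ** 0.5


def profile(traces, key, d):
    """Per intermediate value: class mean and d-th order (central) moment."""
    classes = {}
    for x, l in traces:
        classes.setdefault(SBOX[x ^ key], []).append(l)
    means = {y: sum(ls) / len(ls) for y, ls in classes.items()}
    if d == 1:
        return means, dict(means)
    moments = {y: sum((l - means[y]) ** d for l in ls) / len(ls)
               for y, ls in classes.items()}
    return means, moments


def mcp_dpa(profiling, attack, profiling_key, d):
    """Return the key candidate maximizing rho(profiled moment, l^d)."""
    means, moments = profile(profiling, profiling_key, d)
    best_key, best_rho = None, float("-inf")
    for k in range(16):
        model, samples = [], []
        for x, l in attack:
            y = SBOX[x ^ k]
            model.append(moments[y])
            samples.append(l if d == 1 else (l - means[y]) ** d)
        rho = pearson(model, samples)
        if rho > best_rho:
            best_key, best_rho = k, rho
    return best_key
```

Running the attack at d = 1 with `beta = 0` versus `beta > 0` is one way to check, in simulation, whether a lower-order moment has become informative.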

### Test-cases



- We have investigated two designs / platforms:
  - HW: AES128 (8-bit) 2-shares implementation adopting Domain Oriented Masking [GMK17] on a Spartan6 LX75 FPGA (Sakura G board)
  - SW: 2-shares AES S-box with the bitslice secure scheme in [JS17], implementation following Barthe et al. [BDF+17], on an Atmel SAM4C16 (ARM Cortex-M4)
- HW setup:
  - Picoscope 5244B (12-bit quantization) + Sakura G's preamp
  - Low-noise resistor (0 to 20 Ω)
  - $f_{clk} = 4$ MHz
  - $S_R = 250$ MS/s (enough)
  - *V<sub>DD</sub>* from 1 to 1.45 V
- SW setup:
  - Lecroy WaveRunner (12-bit)
  - <u>Tektronix CT1 + resistor (1 Ω to 39 Ω)</u>, benchtop PSU
  - $f_{clk} = 100$ MHz
  - $S_R = 1$ GS/s
  - *V<sub>DD</sub>* from 1 to 1.55 V
  - Removed the 2.2 μF and 0.1 μF caps...
- Commercial off-the-shelf devices; ASICs / specialized devices are yet to be explored





• HW – Sbox-parallel design



#### SW

- Input: shares of *a* and *b* (**a**, **b**) and a uniform randomness vector **r**.
- Output: shares **x** of *x*, with $a \cdot b = \bigoplus_{i}^{d} x_i$.
  1. $c_1 = a \cdot b$
  2. $c_2 = a \cdot rot(b, 1)$
  3. $c_3 = rot(a, 1) \cdot b$
  4. $d_1 = c_1 \oplus r$
  5. $d_2 = d_1 \oplus c_2$
  6. $d_3 = d_2 \oplus c_3$
  7. $d_4 = d_3 \oplus rot(r, 1)$
  8. $x = d_4$
  9. return $x$
- SW serial → nicer to interpret ...
- Conceptually, SW will be more sensitive due to a shared power grid
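The nine steps above can be checked functionally with a few lines of Python. This is only a sketch of the data flow, under assumptions not stated on the slide: shares are single bits packed into one small word, `·` is a bitwise AND, `rot` is a left rotation within the word, and the word holds three shares (a width chosen here because the three partial products then cover every pair of share indices). It says nothing about the leakage or the register usage of the real implementation.

```python
def rot(word, k, width):
    """Rotate a width-bit word left by k positions."""
    k %= width
    mask = (1 << width) - 1
    return ((word << k) | (word >> (width - k))) & mask


def parity(word):
    """XOR of all bits in the word, i.e. the unmasked value of a sharing."""
    return bin(word).count("1") & 1


def masked_and(a, b, r, width=3):
    """Steps 1-9 of the multiplication on packed one-bit shares."""
    c1 = a & b
    c2 = a & rot(b, 1, width)
    c3 = rot(a, 1, width) & b
    d1 = c1 ^ r
    d2 = d1 ^ c2
    d3 = d2 ^ c3
    d4 = d3 ^ rot(r, 1, width)
    return d4


if __name__ == "__main__":
    # Exhaustive functional check for three packed shares.
    ok = all(parity(masked_and(a, b, r)) == (parity(a) & parity(b))
             for a in range(8) for b in range(8) for r in range(8))
    print("shares of x recombine to a AND b:", ok)
```

The randomness cancels on recombination because `r` and `rot(r, 1)` contain the same bits.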



#### Software implementation (μC) – 32-bit ARM (Atmel SAM4C16)

#### Model/Simulation

Measurement (uC)








Figure 2:  $f(l^{simu.}|s)$ : (a)  $\beta = 0$  (b)  $\beta = 0.5$ 

Figure 8: f(l|s),  $1 \cdot 10^6$  traces: (a)  $R_{ext}=0\Omega$  (b)  $R_{ext}=20\Omega$ 
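The qualitative shape of such simulated distributions can be reproduced with a toy model. The sketch below builds $f(l|s)$ as a Gaussian mixture for a 2-share sharing of a single secret bit, with noise-free leakage $I_{Sh1} + I_{Sh2} - \beta \, I_{Sh1} I_{Sh2}$. One-bit shares, the noise level and the function names are assumptions made for illustration; the simulation behind the figures may differ.

```python
from math import exp, pi, sqrt


def components(secret, beta):
    """Noise-free leakage values of a 2-share bit and their probabilities."""
    comps = {}
    for mask in (0, 1):
        sh1, sh2 = secret ^ mask, mask
        value = sh1 + sh2 - beta * sh1 * sh2
        comps[value] = comps.get(value, 0.0) + 0.5
    return comps


def pdf(l, secret, beta, sigma):
    """Gaussian-mixture density f(l | s) with noise standard deviation sigma."""
    norm = sigma * sqrt(2.0 * pi)
    return sum(p * exp(-((l - v) ** 2) / (2.0 * sigma ** 2)) / norm
               for v, p in components(secret, beta).items())


def conditional_mean(secret, beta):
    """First-order moment E[l | s] of the mixture."""
    return sum(v * p for v, p in components(secret, beta).items())


if __name__ == "__main__":
    for beta in (0.0, 0.5):
        print(beta, conditional_mean(0, beta), conditional_mean(1, beta))
```

With `beta = 0.0` the two conditional means coincide and only the shapes differ; with `beta = 0.5` one mixture component shifts and the means separate, which is a first-order dependency on the secret.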

### A T-test sanity check..

- DOM AES (Gross et al. [GMK17])
- Hardware FPGA (Spartan 6) scenario
  - "detection"-based approach (T-test): $T_{value} = (\mu_{Set_0} - \mu_{Set_1}) / \sqrt{\sigma_{Set_0}^2 / |Set_0| + \sigma_{Set_1}^2 / |Set_1|}$
  - Only one voltage case (nominal), with $R_{ext}$ changing.
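For reference, the T-test statistic used in this sanity check is Welch's t. A minimal Python sketch on two lists of leakage samples at a single time point is given below; the function name and the toy inputs are illustrative assumptions.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(set0, set1):
    """Welch's t statistic between two sets of leakage samples."""
    # statistics.variance is the unbiased sample variance.
    return (mean(set0) - mean(set1)) / sqrt(
        variance(set0) / len(set0) + variance(set1) / len(set1)
    )


if __name__ == "__main__":
    print(welch_t([1.0, 2.0, 3.0, 4.0], [3.0, 4.0, 5.0, 6.0]))
```

In leakage-detection practice the absolute value is commonly compared against a threshold such as 4.5. As noted earlier in the talk, a large value signals a dependency but is hard to translate into a success rate.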





- DOM AES (Gross et al. [GMK17])
- Hardware FPGA (Spartan 6) scenario
  - Exploitation (MCP-DPA): $\tilde{k} = \underset{k^*}{\operatorname{arg\,max}} \; \hat{\rho}(\hat{M}^d_{x,k^*}, (l^t_{x,k})^d)$
- Inherent leakage → ~×10 amplification ...
- No initial leakage → ~×10 amplification and generation





- Bitslice, Barthe et al. [BDF+17]
- Software μC scenario (32-bit ARM, Atmel)
  - SW: similar results
  - Quite alarming amplification,
  - obtained externally!

### No. Traces for attack/profiling = 700k/10M







### Open Challenge - Scaling (d)

- How would it scale?
- Taking only some dominant factors:

$$I'_{\text{supply}} \approx \underbrace{\sum_{i} I_{i}}_{2^{nd}\text{-order}} - \frac{R_{ext}}{V_{DD,ext}} \cdot \underbrace{\sum_{i} \sum_{j} I_{j} I'_{i}}_{1^{st}\text{ order}} + \dots + \text{higher powers}$$

- In practice, highly design dependent.
- The question is the respective informativeness of these lower-order moments, or how concrete is their amplification...

Figure: $f_{l|p}$ versus normalized leakage (l) for the 3-shares and 4-shares HW models, #samples = 1e7, $\sigma_n$ = 0.1, with p ∈ {0, 1} and β ∈ {0, 0.5, 1, 2}.





Setup manipulations (or externally amplified couplings)

- Can have a significant impact on the security order, not only on the noise level.

We demonstrate that this actually happens for off-the-shelf devices.

#### **Open questions:**

- How would the security-order reduction *scale* with *d*?
- How is it possible to build realistic "Extended-Probes" / realistic models for such adversaries?
- Would we see the same results for ASICs / specialized devices (not off-the-shelf)?

Existing design-phase tools will not do (e.g. *MaskVerif* / ELMO, which are *logical tools*)

Thank you for your attention!