TY - JOUR
AU - Masure, Loïc
AU - Dumas, Cécile
AU - Prouff, Emmanuel
PY - 2019/11/19
Y2 - 2024/06/17
TI - A Comprehensive Study of Deep Learning for Side-Channel Analysis
JF - IACR Transactions on Cryptographic Hardware and Embedded Systems
JA - TCHES
VL - 2020
IS - 1
SE - Articles
DO - 10.13154/tches.v2020.i1.348-375
UR - https://tches.iacr.org/index.php/TCHES/article/view/8402
SP - 348
EP - 375
AB - <p>Recently, several studies have been published on the application of deep learning to enhance Side-Channel Attacks (SCA). These seminal works have practically validated the soundness of the approach, especially against implementations protected by masking or by jittering. Concurrently, important open issues have emerged. Among them, the relevance of machine (and thereby deep) learning based SCA has been questioned in several papers, based on the lack of relation between the <em>accuracy</em>, a typical performance metric used in machine learning, and common SCA metrics like the <em>guessing entropy</em> or the <em>key-discrimination success rate</em>. Also, the impact of the classical side-channel countermeasures on the efficiency of deep learning has been questioned, in particular by the semiconductor industry. Both questions highlight the importance of studying the theoretical soundness of deep learning in the context of side-channel analysis and of developing means to quantify its efficiency, especially with respect to the optimality bounds published so far in the literature for side-channel leakage exploitation. The first main contribution of this paper directly concerns the latter point. It is indeed proved that minimizing the <em>Negative Log Likelihood</em> (NLL for short) loss function during the training of deep neural networks is actually asymptotically equivalent to maximizing the <em>Perceived Information</em> (PI) introduced by Renauld <em>et al.</em> at EUROCRYPT 2011 as a lower bound of the <em>Mutual Information</em> (MI) between the leakage and the target secret. Hence, such a training can be considered as an efficient and effective estimation of the PI, and thereby of the MI (known to be complex to accurately estimate in the context of secure implementations).
As a second direct consequence of our main contribution, it is argued that, in a side-channel exploitation context, choosing the NLL loss function to drive the training is sound from an information-theoretic point of view. As a third contribution, classical countermeasures like Boolean masking or execution flow shuffling, initially dedicated to classical SCA, are proved to remain sound against deep-learning-based attacks.</p>
ER -