Generic Low-Latency Masking in Hardware
In this work, we introduce a generalized concept for low-latency masking that is applicable to any implementation and protection order, and (in its most extreme form) does not require on-the-fly randomness. The main idea of our approach is to avoid collisions of shared variables in nonlinear circuit parts and to skip the share compression. We show the feasibility of our approach on a full implementation of a one-round unrolled Ascon variant and on an AES S-box case study. Additionally, we discuss possible trade-offs to make our approach interesting for practical implementations. As a result, we obtain a first-order masked AES S-box that is calculated in a single clock cycle with rather high implementation costs (60.7 kGE), and a two-cycle variant with much less implementation costs (6.7 kGE). The side-channel resistance of our Ascon S-box designs up to order three are then verified using the formal analysis tool of [BGI+18]. Furthermore, we introduce a taint checking based verification approach that works specifically for our low-latency approach and allows us to verify large circuits like our low-latency AES S-box design in reasonable time.