TY - JOUR AU - Sreedhar, Kavya AU - Horowitz, Mark AU - Torng, Christopher PY - 2022/08/31 Y2 - 2024/03/29 TI - A Fast Large-Integer Extended GCD Algorithm and Hardware Design for Verifiable Delay Functions and Modular Inversion JF - IACR Transactions on Cryptographic Hardware and Embedded Systems JA - TCHES VL - 2022 IS - 4 SE - Articles DO - 10.46586/tches.v2022.i4.163-187 UR - https://tches.iacr.org/index.php/TCHES/article/view/9817 SP - 163-187 AB - <p>The extended GCD (XGCD) calculation, which computes Bézout coefficients <em>b<sub>a</sub></em>, <em>b<sub>b</sub></em> such that <em>b<sub>a</sub></em> ∗ <em>a<sub>0</sub></em> + <em>b<sub>b</sub></em> ∗ <em>b<sub>0</sub></em> = GCD(<em>a<sub>0</sub></em>, <em>b<sub>0</sub></em>), is a critical operation in many cryptographic applications. In particular, large-integer XGCD is computationally dominant for two applications of increasing interest: verifiable delay functions that square binary quadratic forms within a class group and constant-time modular inversion for elliptic curve cryptography. Most prior work has focused on fast software implementations. The few works investigating hardware acceleration build on variants of Euclid’s division-based algorithm, following the approach used in optimized software. We show that adopting variants of Stein’s subtraction-based algorithm instead leads to significantly faster hardware. We quantify this advantage by performing a large-integer XGCD accelerator design space exploration comparing Euclid- and Stein-based algorithms for various application requirements. This exploration leads us to an XGCD hardware accelerator that is flexible and efficient, supports fast average and constant-time evaluation, and is easily extensible for polynomial GCD. Our 16nm ASIC design calculates 1024-bit XGCD in 294ns (8x faster than the state-of-the-art ASIC) and constant-time 255-bit XGCD for inverses in the field of integers modulo the prime 2<sup>255</sup>−19 in 85ns (31× faster than state-of-the-art software). We believe our design is the first high-performance ASIC for the XGCD computation that is also capable of constant-time evaluation. Our work is publicly available at <a href="https://github.com/kavyasreedhar/sreedhar-xgcd-hardware-ches2022">https://github.com/kavyasreedhar/sreedhar-xgcd-hardware-ches2022</a>.</p> ER -