International Association for Cryptologic Research

International Association
for Cryptologic Research

Transactions on Cryptographic Hardware and Embedded Systems, Volume 2021

Polynomial Multiplication in NTRU Prime:

Comparison of Optimization Strategies on Cortex-M4


Erdem Alkim
Ondokuz Mayıs University, Samsun, Turkey; Fraunhofer SIT, Darmstadt, Germany

Dean Yun-Li Cheng
Academia Sinica, Taipei, Taiwan; National Taiwan University, Taipei, Taiwan

Chi-Ming Marvin Chung
Academia Sinica, Taipei, Taiwan

Hülya Evkan
Fraunhofer SIT, Darmstadt, Germany

Leo Wei-Lun Huang
Academia Sinica, Taipei, Taiwan

Vincent Hwang
Academia Sinica, Taipei, Taiwan; National Taiwan University, Taipei, Taiwan

Ching-Lin Trista Li
Academia Sinica, Taipei, Taiwan; National Taiwan University, Taipei, Taiwan

Ruben Niederhagen
University of Southern Denmark, Odense, Denmark

Cheng-Jhih Shih
Academia Sinica, Taipei, Taiwan

Julian Wälde
Fraunhofer SIT, Darmstadt, Germany

Bo-Yin Yang
Academia Sinica, Taipei, Taiwan


Keywords: NTT, polynomial multiplication, Cortex-M4, NTRU Prime, PQC


Abstract

This paper proposes two different methods to perform NTT-based polynomial multiplication in polynomial rings that do not naturally support such a multiplication. We demonstrate these methods on the NTRU Prime key-encapsulation mechanism (KEM) proposed by Bernstein, Chuengsatiansup, Lange, and Vredendaal, which uses a polynomial ring that is, by design, not amenable to use with NTT. One of our approaches is using Good’s trick and focuses on speed and supporting more than one parameter set with a single implementation. The other approach is using a mixed radix NTT and focuses on the use of smaller multipliers and less memory. On a ARM Cortex-M4 microcontroller, we show that our three NTT-based implementations, one based on Good’s trick and two mixed radix NTTs, provide between 32% and 17% faster polynomial multiplication. For the parameter-set ntrulpr761, this results in between 16% and 9% faster total operations (sum of key generation, encapsulation, and decapsulation) and requires between 15% and 39% less memory than the current state-of-the-art NTRU Prime implementation on this platform, which is using Toom-Cook-based polynomial multiplication.

Publication

Transactions of Cryptographic Hardware and Embedded Systems, Volume 2021, Issue 1

Paper

Artifact

Artifact number
tches/2021/a4

Artifact published
February 16, 2021

README

ZIP (2 MB)  

View on Github

License


BibTeX How to cite

Alkim, E., Cheng, D. Y.-L., Chung, C.-M. M., Evkan, H., Huang, L. W.-L., Hwang, V., Li, C.-L. T., Niederhagen, R., Shih, C.-J., Wälde, J., & Yang, B.-Y. (2020). Polynomial Multiplication in NTRU Prime: Comparison of Optimization Strategies on Cortex-M4. IACR Transactions on Cryptographic Hardware and Embedded Systems, 2021(1), 217–238. https://doi.org/10.46586/tches.v2021.i1.217-238. Artifact at https://artifacts.iacr.org/tches/2021/a4.