High-Performance ECC Scalar Multiplication Architecture Based on Comb Method and Low-Latency Window Recoding Algorithm

J Zhang, Z Chen, M Ma, R Jiang, H Li, W Wang
IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 2023
Elliptic curve scalar multiplication (ECSM) is the essential operation in elliptic curve cryptography (ECC) for achieving high performance and security. We introduce a novel high-performance ECSM architecture over binary fields to meet the growing demand for both. A low-latency window (LLW) recoding algorithm for hardware implementation is proposed to enhance resistance to side-channel attacks (SCAs). Based on the LLW algorithm, we propose an enhanced comb method for ECSM with a unified point addition (PA) and point doubling (PD) pattern. Theoretical analysis demonstrates that the enhanced comb method with […] balances the computational burden between both extreme cases. To achieve a low clock-cycle count and a high operating frequency, the data dependencies of ECSM are thoroughly analyzed, and we explore a timing schedule with one two-stage pipelined Karatsuba multiplier accumulator (MAC). The datapath of the proposed architecture is designed so that the critical path (CP) contains only minimal logic primitives apart from the MAC. In addition, the ideal placement of pipeline stages for the MAC is illustrated. The proposed architecture has been implemented on Xilinx Virtex-7 series field-programmable gate arrays (FPGAs) and performs ECSM in 2.51, 4.93, and […] with 3422, 7983, and 20158 slices over […], […], and […], respectively. Implementation results show that our design achieves 53.60%, 39.36%, and 32.64% performance improvement over existing state-of-the-art works, respectively.
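
The abstract names a Karatsuba multiplier as the core arithmetic unit but does not describe its internals here. As a rough software illustration only, the sketch below shows the general Karatsuba idea for polynomials over GF(2), where addition and subtraction are both XOR. It is not the paper's pipelined MAC: the function names, the `threshold` parameter, and the representation of polynomials as Python integers (bit i holds the coefficient of x^i) are all assumptions made for this example, and reduction modulo the field polynomial is omitted.

```python
def clmul_schoolbook(a: int, b: int) -> int:
    """Multiply two GF(2)[x] polynomials stored as bit masks (shift-and-XOR)."""
    result = 0
    while b:
        if b & 1:
            result ^= a  # adding a shifted copy of a; addition in GF(2) is XOR
        a <<= 1
        b >>= 1
    return result


def clmul_karatsuba(a: int, b: int, threshold: int = 8) -> int:
    """Karatsuba multiplication in GF(2)[x] (hypothetical helper, no reduction).

    Splits each operand into a low and a high half and uses three
    half-size products instead of four.
    """
    n = max(a.bit_length(), b.bit_length())
    # Small operands fall back to the schoolbook method; the max() guard
    # keeps the recursion from splitting operands of one bit or fewer.
    if n <= max(threshold, 1):
        return clmul_schoolbook(a, b)

    half = n // 2
    mask = (1 << half) - 1
    a_lo, a_hi = a & mask, a >> half
    b_lo, b_hi = b & mask, b >> half

    lo = clmul_karatsuba(a_lo, b_lo, threshold)
    hi = clmul_karatsuba(a_hi, b_hi, threshold)
    # (a_lo + a_hi)(b_lo + b_hi) + lo + hi leaves only the two cross terms.
    mid = clmul_karatsuba(a_lo ^ a_hi, b_lo ^ b_hi, threshold) ^ lo ^ hi

    # Recombine with XOR because the shifted partial products overlap.
    return (hi << (2 * half)) ^ (mid << half) ^ lo
```

Calling `clmul_karatsuba(a, b)` on two field-sized operands gives the same unreduced product as `clmul_schoolbook(a, b)`. In hardware the three sub-products would typically be computed in parallel, whereas this recursive version only illustrates the arithmetic identity.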