Publications

PQNTRU: Acceleration of NTRU-based Schemes via Customized Post-Quantum Processor

Published in IEEE Transactions on Computers, 2025

Post-quantum cryptography (PQC) has rapidly evolved in response to the emergence of quantum computers, with the US National Institute of Standards and Technology (NIST) selecting four algorithms for PQC standardization in 2022, including the Falcon digital signature scheme. The latest round of digital signature schemes introduced Hawk. Both Falcon and Hawk are based on the NTRU lattice and offer compact signatures and fast signature generation and verification, making them suitable for deployment on resource-constrained Internet-of-Things (IoT) devices. In contrast to the widely studied CRYSTALS-Dilithium and CRYSTALS-Kyber, research on NTRU-based schemes has been limited due to their complex algorithms and operations. The performance of Falcon and Hawk remains constrained by the lack of parallel execution in crucial operations such as the Number Theoretic Transform (NTT) and Fast Fourier Transform (FFT), with data dependency being a significant bottleneck. This paper accelerates the NTRU-based schemes Falcon and Hawk through hardware/software co-design on a customized Single-Instruction-Multiple-Data (SIMD) processor, proposing new SIMD hardware units and instructions together with software optimizations. Our NTT optimization includes a novel layer merging technique for the SIMD architecture that reduces memory accesses, and uses two modular reduction algorithms (Signed Montgomery and Improved Plantard) to target different modulus data widths. We also explore applying layer merging to accelerate fixed-point FFT at the SIMD instruction level, and devise a dual-issue parser that reorganizes assembly code to maximize dual-issue utilization. A System-on-Chip (SoC) architecture is devised to improve the practical applicability of the processor in real-world scenarios. Evaluation on 28 nm technology and an FPGA platform shows that our design and optimizations can increase the performance of Hawk signature generation and verification by over 7 times.
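For readers unfamiliar with the Signed Montgomery reduction named in the abstract, the following is a minimal scalar sketch of the general technique. It is not the paper's SIMD implementation. It assumes Falcon's modulus q = 12289 and a 16-bit Montgomery radix; the helper names (`to_signed16`, `montgomery_reduce`, `to_mont`) are hypothetical and chosen only for illustration.

```python
Q = 12289               # Falcon modulus (assumed here for illustration)
R_BITS = 16             # Montgomery radix R = 2^16 (assumption)
R = 1 << R_BITS
QINV = pow(Q, -1, R)    # q^{-1} mod 2^16; requires Python 3.8+


def to_signed16(x):
    """Interpret the low 16 bits of x as a signed 16-bit integer."""
    x &= R - 1
    return x - R if x >= (R >> 1) else x


def montgomery_reduce(a):
    """Return r with r = a * R^{-1} (mod Q) and -Q < r < Q.

    Valid for |a| < Q * 2^15.
    """
    t = to_signed16(to_signed16(a) * QINV)  # t = a * q^{-1} mod± 2^16
    return (a - t * Q) >> R_BITS            # a - t*Q is divisible by 2^16


def to_mont(w):
    """Map w to Montgomery form w * R mod Q (hypothetical helper)."""
    return (w * R) % Q


# Multiply a by a constant w: precompute w in Montgomery form once,
# then each product needs only one multiplication and one reduction.
a, w = 4321, 9876
r = montgomery_reduce(a * to_mont(w))
assert r % Q == (a * w) % Q
```

The result stays in the signed range (-Q, Q) rather than being normalized to [0, Q), which is what allows a sequence of butterfly operations to skip conditional corrections between steps.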

Recommended citation: Zewen Ye, Junhao Huang, Tianshun Huang, Yudan Bai, Jinze Li, Hao Zhang, Guangyan Li, Donglong Chen, Ray C. C. Cheung, and Kejie Huang. 2025. "PQNTRU: Acceleration of NTRU-based Schemes via Customized Post-Quantum Processor," in IEEE Transactions on Computers, doi: 10.1109/TC.2025.3540647. https://ieeexplore.ieee.org/document/10880097

ProgramGalois: A Programmable Generator of Radix-4 Discrete Galois Transformation Architecture for Lattice-based Cryptography

Published in ACM Trans. Reconfigurable Technol. Syst., 2024

This paper is about design space exploration of a radix-4 DGT framework.

Recommended citation: Guangyan Li, Zewen Ye, Donglong Chen, Wangchen Dai, Gaoyu Mao, Kejie Huang, and Ray C. C. Cheung. 2024. ProgramGalois: A Programmable Generator of Radix-4 Discrete Galois Transformation Architecture for Lattice-based Cryptography. ACM Trans. Reconfigurable Technol. Syst. Just Accepted (August 2024). https://doi.org/10.1145/3689437 https://dl.acm.org/doi/10.1145/3689437

REALISE-IoT: RISC-V Based Efficient and Lightweight Public-key System for IoT Applications

Published in IEEE Internet of Things Journal, 2023

This paper is about extending LoRaWAN for IoT applications with a lightweight public-key infrastructure (including SHA-2, ECDH, EdDSA, and TRNG).

Recommended citation: Gaoyu Mao, Yao Liu, Wangchen Dai, Guangyan Li, Zhewen Zhang, Alan H. F. Lam, and Ray C. C. Cheung, "REALISE-IoT: RISC-V Based Efficient and Lightweight Public-Key System for IoT Applications," in IEEE Internet of Things Journal, doi: 10.1109/JIOT.2023.3296135. http://academicpages.github.io/files/2023-07-10-lora-IoT-J-3.pdf

High-performance and Configurable SW/HW Co-design of Post-quantum Signature CRYSTALS-Dilithium

Published in ACM Trans. Reconfigurable Technol. Syst., 2023

This paper is about HW/SW co-design of CRYSTALS-Dilithium.

Recommended citation: Gaoyu Mao, Donglong Chen, Guangyan Li, Wangchen Dai, Abdurrashid Ibrahim Sanka, Çetin Kaya Koç, and Ray C. C. Cheung. 2023. High-performance and Configurable SW/HW Co-design of Post-quantum Signature CRYSTALS-Dilithium. ACM Trans. Reconfigurable Technol. Syst. 16, 3, Article 44 (September 2023), 28 pages. https://doi.org/10.1145/3569456 http://gavinligy.github.io/GavinLI.github.io/files/2023-06-20-Dilithium-ACM-TRTS-2.pdf

Algorithm-Hardware Co-Design of Split-Radix Discrete Galois Transformation for KyberKEM

Published in IEEE Transactions on Emerging Topics in Computing, 2023

This paper is about a split-radix DGT algorithm and a KyberKEM architecture.

Recommended citation: G. Li, D. Chen, G. Mao, W. Dai, A. I. Sanka and R. C. C. Cheung, "Algorithm-Hardware Co-Design of Split-Radix Discrete Galois Transformation for KyberKEM," in IEEE Transactions on Emerging Topics in Computing, doi: 10.1109/TETC.2023.3270971. http://gavinligy.github.io/GavinLI.github.io/files/2023-05-02-Kyber-IEEE-TETC-1.pdf