Evaluations on deep neural networks training using posit number system. J. Lu, C. Fang, M. Xu, J. Lin, Z. Wang. IEEE Transactions on Computers 70 (2), 174-187, 2020. Cited by 67.

E-LSTM: An efficient hardware architecture for long short-term memory. M. Wang, Z. Wang, J. Lu, J. Lin, Z. Wang. IEEE Journal on Emerging and Selected Topics in Circuits and Systems 9 (2 …, 2019. Cited by 58.

Training deep neural networks using posit number system. J. Lu, S. Lu, Z. Wang, C. Fang, J. Lin, Z. Wang, L. Du. 2019 32nd IEEE International System-on-Chip Conference (SOCC), 62-67, 2019. Cited by 24.

ETA: An efficient training accelerator for DNNs based on hardware-algorithm co-optimization. J. Lu, C. Ni, Z. Wang. IEEE Transactions on Neural Networks and Learning Systems 34 (10), 7660-7674, 2022. Cited by 17.

THETA: A high-efficiency training accelerator for DNNs with triple-side sparsity exploration. J. Lu, J. Huang, Z. Wang. IEEE Transactions on Very Large Scale Integration (VLSI) Systems 30 (8 …, 2022. Cited by 8.

A reconfigurable DNN training accelerator on FPGA. J. Lu, J. Lin, Z. Wang. 2020 IEEE Workshop on Signal Processing Systems (SiPS), 1-6, 2020. Cited by 7.

A hardware-oriented and memory-efficient method for CTC decoding. S. Lu, J. Lu, J. Lin, Z. Wang. IEEE Access 7, 120681-120694, 2019. Cited by 6.

LBFP: Logarithmic block floating point arithmetic for deep neural networks. C. Ni, J. Lu, J. Lin, Z. Wang. 2020 IEEE Asia Pacific Conference on Circuits and Systems (APCCAS), 201-204, 2020. Cited by 5.

An FPGA-based reconfigurable accelerator for low-bit DNN training. H. Shao, J. Lu, J. Lin, Z. Wang. 2021 IEEE Computer Society Annual Symposium on VLSI (ISVLSI), 254-259, 2021. Cited by 4.

A reconfigurable accelerator for generative adversarial network training based on FPGA. T. Yin, W. Mao, J. Lu, Z. Wang. 2021 IEEE Computer Society Annual Symposium on VLSI (ISVLSI), 144-149, 2021. Cited by 3.

An efficient CNN training accelerator leveraging transposable block sparsity. M. Xu, J. Lu, Z. Wang, J. Lin. 2022 IEEE 4th International Conference on Artificial Intelligence Circuits …, 2022. Cited by 2.

WinTA: An efficient reconfigurable CNN training accelerator with decomposition Winograd. J. Lu, H. Wang, J. Lin, Z. Wang. IEEE Transactions on Circuits and Systems I: Regular Papers, 2023. Cited by 1.

An efficient training accelerator for Transformers with hardware-algorithm co-optimization. H. Shao, J. Lu, M. Wang, Z. Wang. IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 2023. Cited by 1.

An FPGA-based reconfigurable CNN training accelerator using decomposable Winograd. H. Wang, J. Lu, J. Lin, Z. Wang. 2023 IEEE Computer Society Annual Symposium on VLSI (ISVLSI), 1-6, 2023. Cited by 1.

An efficient hardware architecture for DNN training by exploiting triple sparsity. J. Huang, J. Lu, Z. Wang. 2022 IEEE International Symposium on Circuits and Systems (ISCAS), 2802-2805, 2022. Cited by 1.

A low-latency and low-complexity hardware architecture for CTC beam search decoding. S. Lu, J. Lu, J. Lin, Z. Wang, L. Du. 2019 IEEE International Workshop on Signal Processing Systems (SiPS), 352-357, 2019. Cited by 1.