Accelerating sparse DNN models without hardware-support via tile-wise sparsity C Guo, BY Hsueh, J Leng, Y Qiu, Y Guan, Z Wang, X Jia, X Li, M Guo, ... SC20: International Conference for High Performance Computing, Networking …, 2020 | 82 | 2020 |
Transkimmer: Transformer Learns to Layer-wise Skim Y Guan, Z Li, J Leng, Z Lin, M Guo Proceedings of the 60th Annual Meeting of the Association for Computational …, 2022 | 28 | 2022 |
Block-skim: Efficient question answering for transformer Y Guan, Z Li, Z Lin, Y Zhu, J Leng, M Guo Proceedings of the AAAI Conference on Artificial Intelligence 36 (10), 10710 …, 2022 | 20 | 2022 |
How Far Does BERT Look At: Distance-based Clustering and Analysis of BERT's Attention Y Guan, J Leng, C Li, Q Chen, M Guo Proceedings of the 28th International Conference on Computational …, 2020 | 16 | 2020 |
PAME: precision-aware multi-exit DNN serving for reducing latencies of batched inferences S Zhang, W Cui, Q Chen, Z Zhang, Y Guan, J Leng, C Li, M Guo Proceedings of the 36th ACM International Conference on Supercomputing, 1-12, 2022 | 5 | 2022 |
Co-Design of Binary Processing in Memory ReRAM Array and DNN Model Optimization Algorithm Y Guan, T Ohsawa IEICE Transactions on Electronics 103 (11), 685-692, 2020 | 5 | 2020 |
Co-Design of DNN Model Optimization for Binary ReRAM Array In-Memory Processing Y Guan, T Ohsawa 2019 IEEE 11th International Memory Workshop (IMW), 1-4, 2019 | 4 | 2019 |
Accelerating sparse DNNs based on tiled GEMM C Guo, F Xue, J Leng, Y Qiu, Y Guan, W Cui, Q Chen, M Guo IEEE Transactions on Computers, 2024 | 3 | 2024 |
Fractal: Joint Multi-Level Sparse Pattern Tuning of Accuracy and Performance for DNN Pruning Y Guan, C Yu, Y Zhou, J Leng, C Li, M Guo Proceedings of the 29th ACM International Conference on Architectural …, 2024 | 1 | 2024 |
Amanda: Unified Instrumentation Framework for Deep Neural Networks Y Guan, Y Qiu, J Leng, F Yang, S Yu, Y Liu, Y Feng, Y Zhu, L Zhou, ... Proceedings of the 29th ACM International Conference on Architectural …, 2024 | 1 | 2024 |