Shiyao Li (李师尧)
Ph.D. student, Tsinghua University
Verified email at mails.tsinghua.edu.cn - Homepage
Title
Cited by
Year
A survey on efficient inference for large language models
Z Zhou, X Ning, K Hong, T Fu, J Xu, S Li, Y Lou, L Wang, Z Yuan, X Li, ...
arXiv preprint arXiv:2404.14294, 2024
11
2024
Evaluating quantized large language models
S Li, X Ning, L Wang, T Liu, X Shi, S Yan, G Dai, H Yang, Y Wang
Forty-first International Conference on Machine Learning, 2024
9
2024
LV-Eval: A balanced long-context benchmark with 5 length levels up to 256K
T Yuan, X Ning, D Zhou, Z Yang, S Li, M Zhuang, Z Tan, Z Yao, D Lin, B Li, ...
arXiv preprint arXiv:2402.05136, 2024
8
2024
FlightLLM: Efficient large language model inference with a complete mapping flow on FPGAs
S Zeng, J Liu, G Dai, X Yang, T Fu, H Wang, W Ma, H Sun, S Li, Z Huang, ...
Proceedings of the 2024 ACM/SIGDA International Symposium on Field …, 2024
7
2024
LLM-MQ: Mixed-precision Quantization for Efficient LLM Deployment
S Li, X Ning, K Hong, T Liu, L Wang, X Li, K Zhong, G Dai, H Yang, ...
NeurIPS 2023 Efficient Natural Language and Speech Processing Workshop, 2023, 0
6*
A unified FPGA virtualization framework for general-purpose deep neural networks in the cloud
S Zeng, G Dai, H Sun, J Liu, S Li, G Ge, K Zhong, K Guo, Y Wang, H Yang
ACM Transactions on Reconfigurable Technology and Systems (TRETS) 15 (3), 1-31, 2021
4
2021
Enabling Fast 2-bit LLM on GPUs: Memory Alignment, Sparse Outlier, and Asynchronous Dequantization
J Li, S Li, J Xu, S Huang, Y Lian, J Liu, Y Wang, G Dai
arXiv preprint arXiv:2311.16442, 2023
1
2023
MoA: Mixture of Sparse Attention for Automatic Large Language Model Compression
T Fu, H Huang, X Ning, G Zhang, B Chen, T Wu, H Wang, Z Huang, S Li, ...
arXiv preprint arXiv:2406.14909, 2024
2024
Can LLMs Learn by Teaching? A Preliminary Study
X Ning, Z Wang, S Li, Z Lin, P Yao, T Fu, MB Blaschko, G Dai, H Yang, ...
arXiv preprint arXiv:2406.14629, 2024
2024
ViDiT-Q: Efficient and Accurate Quantization of Diffusion Transformers for Image and Video Generation
T Zhao, T Fang, E Liu, W Rui, W Soedarmadji, S Li, Z Lin, G Dai, S Yan, ...
arXiv preprint arXiv:2406.02540, 2024
2024
Towards High-accuracy and Real-time Two-stage Small Object Detection on FPGA
S Li, Z Zhu, H Sun, X Ning, G Dai, Y Hu, H Yang, Y Wang
IEEE Transactions on Circuits and Systems for Video Technology, 2024
2024
TCP: Triplet Contrastive-relationship Preserving for Class-Incremental Learning
S Li, X Ning, S Zhang, L Guo, T Zhao, H Yang, Y Wang
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer …, 2024
2024
Memory-Efficient and Real-Time SPAD-based dToF Depth Sensor with Spatial and Statistical Correlation
S Li, Z Zhu, Y Zhu, Q Zhu, J Zhang, W Sun, G Dai, F Qiao, H Yang, ...
2023 60th ACM/IEEE Design Automation Conference (DAC), 1-6, 2023
2023