Chunan Shi
Verified email at pku.edu.cn
Title
Cited by
Year
SpecInfer: Accelerating Generative Large Language Model Serving with Tree-based Speculative Inference and Verification
X Miao, G Oliaro, Z Zhang, X Cheng, Z Wang, Z Zhang, RYY Wong, A Zhu, ...
arXiv preprint arXiv:2305.09781, 2023
Cited by: 69, Year: 2023
Galvatron: Efficient transformer training over multiple gpus using automatic parallelism
X Miao, Y Wang, Y Jiang, C Shi, X Nie, H Zhang, B Cui
arXiv preprint arXiv:2211.13878, 2022
Cited by: 36, Year: 2022
Spotserve: Serving generative large language models on preemptible instances
X Miao, C Shi, J Duan, X Xi, D Lin, B Cui, Z Jia
Proceedings of the 29th ACM International Conference on Architectural …, 2024
Cited by: 24, Year: 2024
Specinfer: Accelerating large language model serving with tree-based speculative inference and verification
X Miao, G Oliaro, Z Zhang, X Cheng, Z Wang, Z Zhang, RYY Wong, A Zhu, ...
Proceedings of the 29th ACM International Conference on Architectural …, 2024
Cited by: 20, Year: 2024
Clover: Regressive Lightweight Speculative Decoding with Sequential Knowledge
B Xiao, C Shi, X Nie, F Yang, X Deng, L Su, W Chen, B Cui
arXiv preprint arXiv:2405.00263, 2024
Cited by: 2, Year: 2024
Articles 1–5