Xinhao Cheng
Master's student at Carnegie Mellon University
Verified email at andrew.cmu.edu
Title
Cited by
Year
SpecInfer: Accelerating Generative Large Language Model Serving with Tree-based Speculative Inference and Verification
X Miao, G Oliaro, Z Zhang, X Cheng, Z Wang, Z Zhang, RYY Wong, A Zhu, ...
Proceedings of the 29th ACM International Conference on Architectural …, 2023
Cited by 88*, 2023
Towards Efficient Generative Large Language Model Serving: A Survey from Algorithms to Systems
X Miao, G Oliaro, Z Zhang, X Cheng, H Jin, T Chen, Z Jia
arXiv preprint arXiv:2312.15234, 2023
Cited by 36, 2023
FlexLLM: A System for Co-Serving Large Language Model Inference and Parameter-Efficient Finetuning
X Miao, G Oliaro, X Cheng, M Wu, C Unger, Z Jia
arXiv preprint arXiv:2402.18789, 2024
Cited by 2, 2024
A Multi-Level Superoptimizer for Tensor Programs
M Wu, X Cheng, O Padon, Z Jia
arXiv preprint arXiv:2405.05751, 2024
2024