Guangxuan Xiao
Ph.D. student, MIT
Verified email at mit.edu
Title
Cited by
Year
SmoothQuant: Accurate and efficient post-training quantization for large language models
G Xiao, J Lin, M Seznec, H Wu, J Demouth, S Han
International Conference on Machine Learning, 38087-38099, 2023
Cited by: 386; Year: 2023
AWQ: Activation-aware weight quantization for LLM compression and acceleration
J Lin, J Tang, H Tang, S Yang, WM Chen, WC Wang, G Xiao, X Dang, ...
MLSys 2024, 2023
Cited by: 237*; Year: 2023
Efficient streaming language models with attention sinks
G Xiao, Y Tian, B Chen, S Han, M Lewis
International Conference on Learning Representations (ICLR), 2024
Cited by: 156; Year: 2024
FastComposer: Tuning-free multi-subject image generation with localized attention
G Xiao, T Yin, WT Freeman, F Durand, S Han
arXiv preprint arXiv:2305.10431, 2023
Cited by: 88; Year: 2023
QServe: W4A8KV4 quantization and system co-design for efficient LLM serving
Y Lin, H Tang, S Yang, Z Zhang, G Xiao, C Gan, S Han
arXiv preprint arXiv:2405.04532, 2024
Cited by: 4; Year: 2024
Quest: Query-Aware Sparsity for Efficient Long-Context LLM Inference
J Tang, Y Zhao, K Zhu, G Xiao, B Kasikci, S Han
arXiv preprint arXiv:2406.10774, 2024
Cited by: 1; Year: 2024