Jiangfei Duan
Verified email at ie.cuhk.edu.hk
Title
Cited by
Year
SpotServe: Serving generative large language models on preemptible instances
X Miao, C Shi, J Duan, X Xi, D Lin, B Cui, Z Jia
Proceedings of the 29th ACM International Conference on Architectural …, 2024
Cited by: 31; Year: 2024
SKVQ: Sliding-window Key and Value Cache Quantization for Large Language Models
H Duanmu, Z Yuan, X Li, J Duan, X Zhang, D Lin
arXiv preprint arXiv:2405.06219, 2024
Cited by: 5; Year: 2024
Centauri: Enabling Efficient Scheduling for Communication-Computation Overlap in Large Model Training via Communication Partitioning
C Chen, X Li, Q Zhu, J Duan, P Sun, X Zhang, C Yang
Proceedings of the 29th ACM International Conference on Architectural …, 2024
Cited by: 5; Year: 2024
Parcae: Proactive, Liveput-Optimized DNN Training on Preemptible Instances
J Duan, Z Song, X Miao, X Xi, D Lin, H Xu, M Zhang, Z Jia
21st USENIX Symposium on Networked Systems Design and Implementation (NSDI …, 2024
Cited by: 3; Year: 2024
MuxServe: Flexible Spatial-Temporal Multiplexing for Multiple LLM Serving
J Duan, R Lu, H Duanmu, X Li, X Zhang, D Lin, I Stoica, H Zhang
Forty-first International Conference on Machine Learning, 2024
Cited by: 3*
Proteus: Simulating the Performance of Distributed DNN Training
J Duan, X Li, P Xu, X Zhang, S Yan, Y Liang, D Lin
IEEE Transactions on Parallel and Distributed Systems, 2024
Cited by: 1; Year: 2024
Efficient Training of Large Language Models on Distributed Infrastructures: A Survey
J Duan, S Zhang, Z Wang, L Jiang, W Qu, Q Hu, G Wang, Q Weng, H Yan, ...
arXiv preprint arXiv:2407.20018, 2024
Year: 2024
Near-Lossless Acceleration of Long Context LLM Inference with Adaptive Structured Sparse Attention
Q Zhu, J Duan, C Chen, S Liu, X Li, G Feng, X Lv, H Cao, X Chuanfu, ...
arXiv preprint arXiv:2406.15486, 2024
Year: 2024