Title | Authors | Venue | Cited by | Year |
Adaptive Message Quantization and Parallelization for Distributed Full-graph GNN Training | B Wan, J Zhao, C Wu | Proceedings of Machine Learning and Systems 5, 2023 | 22 | 2023 |
LLM-PQ: Serving LLM on Heterogeneous Clusters with Phase-Aware Partition and Adaptive Quantization | J Zhao, B Wan, Y Peng, H Lin, C Wu | arXiv preprint arXiv:2403.01136, 2024 | 9 | 2024 |
ByteCheckpoint: A Unified Checkpointing System for Large Foundation Model Development | B Wan, M Han, Y Sheng, Y Peng, H Lin, M Zhang, Z Lai, Y Menghan, ... | arXiv preprint arXiv:2407.20143, 2024 | 3* | 2024 |
POSTER: LLM-PQ: Serving LLM on Heterogeneous Clusters with Phase-Aware Partition and Adaptive Quantization | J Zhao, B Wan, C Wu, Y Peng, H Lin | Proceedings of the 29th ACM SIGPLAN Annual Symposium on Principles and …, 2024 | 1 | 2024 |
QSync: Quantization-Minimized Synchronous Distributed Training Across Hybrid Devices | J Zhao, B Wan, Y Peng, H Lin, Y Zhu, C Wu | 2024 IEEE International Parallel and Distributed Processing Symposium (IPDPS …, 2024 | | 2024 |