Peng Sun
PhD in Computer Science, Princeton University
Verified email at cs.princeton.edu
Title
Cited by
Year
AutoSched: An Adaptive Self-configured Framework for Scheduling Deep Learning Training Workloads
W Gao, X Zhang, S Huang, S Guo, P Sun, Y Wen, T Zhang
Proceedings of the 38th ACM International Conference on Supercomputing, 473-484, 2024
Cited by 1, 2024
Ymir: A Scheduler for Foundation Model Fine-tuning Workloads in Datacenters
W Gao, W Zhuang, M Li, P Sun, Y Wen, T Zhang
Proceedings of the 38th ACM International Conference on Supercomputing, 259-271, 2024
2024
FedDSE: Distribution-aware Sub-model Extraction for Federated Learning over Resource-constrained Devices
H Wang, Y Jia, M Zhang, Q Hu, H Ren, P Sun, Y Wen, T Zhang
Proceedings of the ACM on Web Conference 2024, 2902-2913, 2024
2024
Centauri: Enabling Efficient Scheduling for Communication-Computation Overlap in Large Model Training via Communication Partitioning
C Chen, X Li, Q Zhu, J Duan, P Sun, X Zhang, C Yang
Proceedings of the 29th ACM International Conference on Architectural …, 2024
Cited by 1, 2024
LoongServe: Efficiently Serving Long-context Large Language Models with Elastic Sequence Parallelism
B Wu, S Liu, Y Zhong, P Sun, X Liu, X Jin
arXiv preprint arXiv:2404.09526, 2024
Cited by 1, 2024
InternLM2 technical report
Z Cai, M Cao, H Chen, K Chen, K Chen, X Chen, X Chen, Z Chen, Z Chen, ...
arXiv preprint arXiv:2403.17297, 2024
Cited by 22, 2024
UniSched: A Unified Scheduler for Deep Learning Training Jobs with Different User Demands
W Gao, Z Ye, P Sun, T Zhang, Y Wen
IEEE Transactions on Computers, 2024
Cited by 1, 2024
Training Large Language Models for Reasoning through Reverse Curriculum Reinforcement Learning
Z Xi, W Chen, B Hong, S Jin, R Zheng, W He, Y Ding, S Liu, X Guo, ...
arXiv preprint arXiv:2402.05808, 2024
Cited by 2, 2024
Deep Learning Workload Scheduling in GPU Datacenters: A Survey
Z Ye, W Gao, Q Hu, P Sun, X Wang, Y Luo, T Zhang, Y Wen
ACM Computing Surveys 56 (6), 1-38, 2024
Cited by 3, 2024
InternEvo: Efficient Long-sequence Large Language Model Training via Hybrid Parallelism and Redundant Sharding
Q Chen, D Gu, G Wang, X Chen, YT Xiong, T Huang, Q Hu, X Jin, Y Wen, ...
arXiv preprint arXiv:2401.09149, 2024
Cited by 1, 2024
Characterization of large language model development in the datacenter
Q Hu, Z Ye, Z Wang, G Wang, M Zhang, Q Chen, P Sun, D Lin, X Wang, ...
21st USENIX Symposium on Networked Systems Design and Implementation (NSDI …, 2024
Cited by 8, 2024
AMSP: Super-Scaling LLM Training via Advanced Model States Partitioning
Q Chen, Q Hu, Z Ye, G Wang, P Sun, Y Wen, T Zhang
arXiv preprint arXiv:2311.00257, 2023
Cited by 1, 2023
Boosting distributed full-graph GNN training with asynchronous one-bit communication
M Zhang, Q Hu, P Sun, Y Wen, T Zhang
arXiv preprint arXiv:2303.01277, 2023
Cited by 6, 2023
Lucid: A non-intrusive, scalable and interpretable scheduler for deep learning training jobs
Q Hu, M Zhang, P Sun, Y Wen, T Zhang
Proceedings of the 28th ACM International Conference on Architectural …, 2023
Cited by 12, 2023
Hydro: Surrogate-Based Hyperparameter Tuning Service in Datacenters
Q Hu, Z Ye, M Zhang, Q Chen, P Sun, Y Wen, T Zhang
17th USENIX Symposium on Operating Systems Design and Implementation (OSDI …, 2023
Cited by 4, 2023
Titan: a scheduler for foundation model fine-tuning workloads
W Gao, P Sun, Y Wen, T Zhang
Proceedings of the 13th Symposium on Cloud Computing, 348-354, 2022
Cited by 4, 2022
Streaming analytics using a serverless compute system
V Sood, J Liu, M Saroop, TM Ghadge, H Sharma, N Vommi, TC Olychuck, ...
US Patent 11,388,210, 2022
Cited by 1, 2022
Deep learning workload scheduling in GPU datacenters: Taxonomy, challenges and vision
W Gao, Q Hu, Z Ye, P Sun, X Wang, Y Luo, T Zhang, Y Wen
arXiv preprint arXiv:2205.11913, 2022
Cited by 23, 2022
Establishment and characterization of human induced pluripotent stem cell line from a Parkinson’s disease patient harboring VPS13A gene mutation
X Lu, W Wang, Y Liu, N Song, M Li, X Mu, N Zhang, Q Chen, L Jiang, ...
Stem Cell Research 60, 102685, 2022
Cited by 1, 2022
A Simulation Platform for Multi-tenant Machine Learning Services on Thousands of GPUs
R Liang, B He, S Yan, P Sun
arXiv preprint arXiv:2201.03175, 2022
2022