Authors
Kun Yang, Jieyu Lin, Wei Ni, Liang Song
Publication date
2021/11/19
Conference
2021 International Conference on Networking Systems of AI (INSAI)
Pages
176-180
Publisher
IEEE
Description
In recent years, deep learning algorithms have shown a trend towards larger models and larger datasets. Centralized training is unable to keep up with the training requirements due to limited storage and computing resources, so distributed learning is becoming an important area of research for improving learning efficiency. Many studies use the features of deep learning workloads to design a central scheduler for production clusters. While existing work has focused on overall completion time and resource efficiency, little attention has been paid to execution deadlines. To achieve a balance between the goals of deadline and non-deadline jobs, we design a Two-level Information-Agnostic Scheduling strategy (TIAS), which can schedule the two kinds of jobs together without knowing the jobs’ training duration. In the first level, we use different priority calculation methods for the two kinds of jobs; in …
Scholar articles
K Yang, J Lin, W Ni, L Song - 2021 International Conference on Networking Systems …, 2021