Authors
Qinghao Hu, Zhisheng Ye, Meng Zhang, Qiaoling Chen, Peng Sun, Yonggang Wen, Tianwei Zhang
Publication date
2023
Conference
17th USENIX Symposium on Operating Systems Design and Implementation (OSDI 23)
Pages
757-777
Description
Hyperparameter tuning is an essential step in deep learning model development that provides better model performance at the cost of substantial resources. While existing systems can improve tuning efficiency, they still fail to handle large models with billions of parameters and to leverage cluster resources efficiently. Motivated by these deficiencies, we introduce Hydro, a surrogate-based hyperparameter tuning service that optimizes tuning workloads at both job-level and cluster-level granularities. Specifically, it consists of two key components: (1) Hydro Tuner automatically generates and optimizes surrogate models via scaling, parametrization and fusion; (2) Hydro Coordinator improves tuning efficiency and cluster-wide resource utilization by adaptively leveraging ephemeral and heterogeneous resources. Our comprehensive experiments on two tuning algorithms across six models show that Hydro Tuner can dramatically reduce tuning makespan by up to 78.5× compared with Ray Tune, with no reduction in tuning quality. Hydro's source code is publicly available at https://github.com/S-Lab-System-Group/Hydro.