Demystifying parallel and distributed deep learning: An in-depth concurrency analysis

T Ben-Nun, T Hoefler - ACM Computing Surveys (CSUR), 2019 - dl.acm.org
Deep Neural Networks (DNNs) are becoming an important tool in modern computing
applications. Accelerating their training is a major challenge and techniques range from …

Nonconvex optimization meets low-rank matrix factorization: An overview

Y Chi, YM Lu, Y Chen - IEEE Transactions on Signal …, 2019 - ieeexplore.ieee.org
Substantial progress has been made recently on developing provably accurate and efficient
algorithms for low-rank matrix factorization via nonconvex optimization. While conventional …

DoReMi: Optimizing data mixtures speeds up language model pretraining

SM Xie, H Pham, X Dong, N Du, H Liu… - Advances in …, 2024 - proceedings.neurips.cc
The mixture proportions of pretraining data domains (e.g., Wikipedia, books, web text) greatly
affect language model (LM) performance. In this paper, we propose Domain Reweighting …

Personalized cross-silo federated learning on non-IID data

Y Huang, L Chu, Z Zhou, L Wang, J Liu, J Pei… - Proceedings of the …, 2021 - ojs.aaai.org
Non-IID data present a tough challenge for federated learning. In this paper, we explore a
novel idea of facilitating pairwise collaborations between clients with similar data. We …

Distributionally robust neural networks for group shifts: On the importance of regularization for worst-case generalization

S Sagawa, PW Koh, TB Hashimoto, P Liang - arXiv preprint arXiv …, 2019 - arxiv.org
Overparameterized neural networks can be highly accurate on average on an iid test set yet
consistently fail on atypical groups of the data (e.g., by learning spurious correlations that …

The limitations of federated learning in sybil settings

C Fung, CJM Yoon, I Beschastnikh - 23rd International Symposium on …, 2020 - usenix.org
Federated learning over distributed multi-party data is an emerging paradigm that iteratively
aggregates updates from a group of devices to train a globally shared model. Relying on a …

A survey of optimization methods from a machine learning perspective

S Sun, Z Cao, H Zhu, J Zhao - IEEE transactions on cybernetics, 2019 - ieeexplore.ieee.org
Machine learning develops rapidly, which has made many theoretical breakthroughs and is
widely applied in various fields. Optimization, as an important part of machine learning, has …

[BOOK][B] Control systems and reinforcement learning

S Meyn - 2022 - books.google.com
A high school student can create deep Q-learning code to control her robot, without any
understanding of the meaning of 'deep' or 'Q', or why the code sometimes fails. This book is …

Position-transitional particle swarm optimization-incorporated latent factor analysis

X Luo, Y Yuan, S Chen, N Zeng… - IEEE Transactions on …, 2020 - ieeexplore.ieee.org
High-dimensional and sparse (HiDS) matrices are frequently found in various industrial
applications. A latent factor analysis (LFA) model is commonly adopted to extract useful …

Decentralized federated averaging

T Sun, D Li, B Wang - IEEE Transactions on Pattern Analysis …, 2022 - ieeexplore.ieee.org
Federated averaging (FedAvg) is a communication-efficient algorithm for distributed training
with an enormous number of clients. In FedAvg, clients keep their data locally for privacy …