Robust stochastic approximation approach to stochastic programming

T Ben-Nun, T Hoefler - ACM Computing Surveys (CSUR), 2019 - dl.acm.org

Deep Neural Networks (DNNs) are becoming an important tool in modern computing
applications. Accelerating their training is a major challenge and techniques range from …

Cited by 825 · Related articles · All 28 versions

[PDF] ieee.org

Nonconvex optimization meets low-rank matrix factorization: An overview

Y Chi, YM Lu, Y Chen - IEEE Transactions on Signal …, 2019 - ieeexplore.ieee.org

Substantial progress has been made recently on developing provably accurate and efficient
algorithms for low-rank matrix factorization via nonconvex optimization. While conventional …

Cited by 485 · Related articles · All 13 versions

[PDF] neurips.cc

Doremi: Optimizing data mixtures speeds up language model pretraining

SM Xie, H Pham, X Dong, N Du, H Liu… - Advances in …, 2024 - proceedings.neurips.cc

The mixture proportions of pretraining data domains (e.g., Wikipedia, books, web text) greatly
affect language model (LM) performance. In this paper, we propose Domain Reweighting …

Cited by 85 · Related articles · All 6 versions

[PDF] aaai.org

Personalized cross-silo federated learning on non-iid data

Y Huang, L Chu, Z Zhou, L Wang, J Liu, J Pei… - Proceedings of the …, 2021 - ojs.aaai.org

Non-IID data present a tough challenge for federated learning. In this paper, we explore a
novel idea of facilitating pairwise collaborations between clients with similar data. We …

Cited by 573 · Related articles · All 10 versions

[PDF] openreview.net

Distributionally robust neural networks for group shifts: On the importance of regularization for worst-case generalization

S Sagawa, PW Koh, TB Hashimoto, P Liang - arXiv preprint arXiv …, 2019 - arxiv.org

Overparameterized neural networks can be highly accurate on average on an i.i.d. test set yet
consistently fail on atypical groups of the data (e.g., by learning spurious correlations that …

Cited by 1619 · Related articles · All 4 versions

[PDF] usenix.org

The limitations of federated learning in sybil settings

C Fung, CJM Yoon, I Beschastnikh - 23rd International Symposium on …, 2020 - usenix.org

Federated learning over distributed multi-party data is an emerging paradigm that iteratively
aggregates updates from a group of devices to train a globally shared model. Relying on a …

Cited by 358 · Related articles · All 7 versions

[PDF] arxiv.org

A survey of optimization methods from a machine learning perspective

S Sun, Z Cao, H Zhu, J Zhao - IEEE transactions on cybernetics, 2019 - ieeexplore.ieee.org

Machine learning develops rapidly, which has made many theoretical breakthroughs and is
widely applied in various fields. Optimization, as an important part of machine learning, has …

Cited by 802 · Related articles · All 9 versions

[BOOK][B] Control systems and reinforcement learning

S Meyn - 2022 - books.google.com

A high school student can create deep Q-learning code to control her robot, without any
understanding of the meaning of 'deep' or 'Q', or why the code sometimes fails. This book is …

Cited by 133 · Related articles · All 3 versions

[PDF] github.io

Position-transitional particle swarm optimization-incorporated latent factor analysis

X Luo, Y Yuan, S Chen, N Zeng… - IEEE Transactions on …, 2020 - ieeexplore.ieee.org

High-dimensional and sparse (HiDS) matrices are frequently found in various industrial
applications. A latent factor analysis (LFA) model is commonly adopted to extract useful …

Cited by 249 · Related articles · All 3 versions

[PDF] arxiv.org

Decentralized federated averaging

T Sun, D Li, B Wang - IEEE Transactions on Pattern Analysis …, 2022 - ieeexplore.ieee.org

Federated averaging (FedAvg) is a communication-efficient algorithm for distributed training
with an enormous number of clients. In FedAvg, clients keep their data locally for privacy …

Cited by 204 · Related articles · All 10 versions
