Finch: Enhancing federated learning with hierarchical neural architecture search

J Liu, J Yan, H Xu, Z Wang, J Huang… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Federated learning (FL) has been widely adopted to train machine learning models over
massive data in edge computing. Most FL works employ pre-defined model architectures …

Federated learning over images: vertical decompositions and pre-trained backbones are difficult to beat

E Hu, Y Tang, A Kyrillidis… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
We carefully evaluate a number of algorithms for learning in a federated environment, and
test their utility for a variety of image classification tasks. We consider many issues that have …

Efficient and light-weight federated learning via asynchronous distributed dropout

C Dun, M Hipolito, C Jermaine… - International …, 2023 - proceedings.mlr.press
Asynchronous learning protocols have regained attention lately, especially in the Federated
Learning (FL) setup, where slower clients can severely impede the learning process. Herein …
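
The snippet above mentions serving slower clients via distributed dropout. As a generic, hypothetical sketch (not this paper's protocol), dropout-style sub-model extraction can be pictured as sampling a random subset of each layer's output units for a client, then scattering the client's update back into the full model; all function names here are illustrative:

```python
import numpy as np

def extract_submodel(weights, keep_frac=0.5, rng=None):
    """Sample a dropout-style sub-model by keeping a random fraction of
    each layer's output units (columns). A full implementation would also
    subsample the matching input rows of the next layer; this sketch
    treats each weight matrix independently."""
    if rng is None:
        rng = np.random.default_rng()
    sub_weights, masks = [], []
    for W in weights:
        n_out = W.shape[1]
        keep = rng.choice(n_out, size=max(1, int(keep_frac * n_out)), replace=False)
        keep.sort()
        masks.append(keep)
        sub_weights.append(W[:, keep])
    return sub_weights, masks

def merge_update(W, W_sub, keep):
    """Scatter a client's sub-model update back into the full matrix."""
    W = W.copy()
    W[:, keep] = W_sub
    return W
```

In an asynchronous setting, each client would receive its own independently sampled sub-model and merge back whenever it finishes, so stragglers never block a synchronization round.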

Masked training of neural networks with partial gradients

A Mohtashami, M Jaggi, S Stich - … Conference on Artificial …, 2022 - proceedings.mlr.press
State-of-the-art training algorithms for deep learning models are based on stochastic
gradient descent (SGD). Recently, many variations have been explored: perturbing …
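
In its simplest generic form (not necessarily the variant studied in this paper), training with partial gradients means applying an SGD step to only a random subset of coordinates each iteration; a minimal sketch:

```python
import numpy as np

def masked_sgd_step(params, grad_fn, lr=0.1, update_frac=0.5, rng=None):
    """One SGD step that updates only a random subset of coordinates,
    drawing a fresh mask each step."""
    if rng is None:
        rng = np.random.default_rng()
    g = grad_fn(params)                             # full gradient
    mask = rng.random(params.shape) < update_frac   # coordinates to touch
    return params - lr * mask * g

# Example: minimize f(x) = ||x||^2 (gradient 2x) while updating only
# about half the coordinates per step.
rng = np.random.default_rng(0)
x = np.ones(8)
for _ in range(200):
    x = masked_sgd_step(x, lambda p: 2 * p, lr=0.1, update_frac=0.5, rng=rng)
```

Since every coordinate is still updated in expectation, the iterate converges to the minimizer, just at a rate scaled by the update fraction.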

FedP3: Federated Personalized and Privacy-friendly Network Pruning under Model Heterogeneity

K Yi, N Gazagnadou, P Richtárik, L Lyu - arXiv preprint arXiv:2404.09816, 2024 - arxiv.org
The interest in federated learning has surged in recent research due to its unique ability to
train a global model using privacy-secured information held locally on each client. This …

Towards a better theoretical understanding of independent subnetwork training

E Shulgin, P Richtárik - arXiv preprint arXiv:2306.16484, 2023 - arxiv.org
Modern advancements in large-scale machine learning would be impossible without the
paradigm of data-parallel distributed computing. Since distributed computing with large …

Neural Networks with Sparse Activation Induced by Large Bias: Tighter Analysis with Bias-Generalized NTK

H Yang, Z Jiang, R Zhang, Y Liang, Z Wang - Journal of Machine Learning …, 2024 - jmlr.org
We study training one-hidden-layer ReLU networks in the neural tangent kernel (NTK)
regime, where the networks' biases are initialized to some constant rather than zero. We …
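
The phenomenon named in the title is easy to observe numerically (this is an illustration of the setting, not the paper's analysis): with NTK-style initialization, a large negative constant bias leaves only a small fraction of ReLU units active, whereas zero bias leaves about half active.

```python
import numpy as np

def relu_activation_fraction(bias, x_dim=50, width=2000, n_samples=500, seed=0):
    """Fraction of hidden ReLU units that fire at initialization in a
    one-hidden-layer network whose biases are all set to the constant
    `bias` rather than zero."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((width, x_dim)) / np.sqrt(x_dim)  # NTK-style scaling
    X = rng.standard_normal((n_samples, x_dim))
    pre = X @ W.T + bias          # pre-activations, shape (n_samples, width)
    return float((pre > 0).mean())
```

Because each pre-activation is approximately standard normal under this scaling, `bias = 0` gives roughly 50% active units, while `bias = -2` gives roughly the Gaussian tail mass beyond 2, i.e. a few percent.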

LOFT: Finding lottery tickets through filter-wise training

Q Wang, C Dun, F Liao, C Jermaine… - International …, 2023 - proceedings.mlr.press
Recent work on the Lottery Ticket Hypothesis (LTH) shows that there exist “winning tickets”
in large neural networks. These tickets represent “sparse” versions of the full model that can …
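
LOFT itself finds tickets through filter-wise training, but the classic way winning-ticket masks are identified in LTH experiments is global magnitude pruning followed by rewinding to the original initialization; a minimal sketch of that baseline (all names hypothetical):

```python
import numpy as np

def magnitude_mask(weights, sparsity=0.8):
    """Global magnitude pruning: zero out the `sparsity` fraction of
    smallest-magnitude weights, pooled across all layers."""
    flat = np.concatenate([np.abs(W).ravel() for W in weights])
    threshold = np.quantile(flat, sparsity)
    return [np.abs(W) >= threshold for W in weights]

def winning_ticket(init_weights, masks):
    """An LTH 'ticket': the original initialization restricted to the
    surviving weights, which is then retrained from scratch."""
    return [W0 * m for W0, m in zip(init_weights, masks)]
```

In the iterative variant, this prune-and-rewind step is repeated several times, removing a modest fraction of the remaining weights per round.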

Xtreme margin: A tunable loss function for binary classification problems

R Wali - arXiv preprint arXiv:2211.00176, 2022 - arxiv.org
Loss functions drive the optimization of machine learning algorithms. The choice of a loss
function can have a significant impact on the training of a model, and how the model learns …

Leveraging Sparse Input and Sparse Models: Efficient Distributed Learning in Resource-Constrained Environments

E Kariotakis, G Tsagkatakis… - … on Parsimony and …, 2024 - proceedings.mlr.press
Optimizing for reduced computational and bandwidth resources enables model training in
less-than-ideal environments and paves the way for practical and accessible AI solutions …