Can forward gradient match backpropagation?

L Fournier, S Rivaud, E Belilovsky… - International …, 2023 - proceedings.mlr.press
Forward Gradients, the idea of using directional derivatives in forward differentiation mode,
have recently been shown to be utilizable for neural network training while avoiding …

DADAO: Decoupled accelerated decentralized asynchronous optimization

A Nabli, E Oyallon - International Conference on Machine …, 2023 - proceedings.mlr.press
This work introduces DADAO: the first decentralized, accelerated, asynchronous, primal, first-
order algorithm to minimize a sum of $L$-smooth and $\mu$-strongly convex functions …

A2CiD2: Accelerating Asynchronous Communication in Decentralized Deep Learning

A Nabli, E Belilovsky, E Oyallon - Advances in Neural …, 2024 - proceedings.neurips.cc
Distributed training of Deep Learning models has been critical to many recent
successes in the field. Current standard methods primarily rely on synchronous centralized …

Brain-Inspired Machine Intelligence: A Survey of Neurobiologically-Plausible Credit Assignment

AG Ororbia - arXiv preprint arXiv:2312.09257, 2023 - arxiv.org
In this survey, we examine algorithms for conducting credit assignment in artificial neural
networks that are inspired or motivated by neurobiology, unifying these various processes …

PETRA: Parallel End-to-end Training with Reversible Architectures

S Rivaud, L Fournier, T Pumir, E Belilovsky… - arXiv preprint arXiv …, 2024 - arxiv.org
Reversible architectures have been shown to be capable of performing on par with their non-
reversible counterparts, being applied in deep learning for memory savings and generative …

Preventing dimensional collapse in contrastive local learning with subsampling

L Fournier, A Patel, M Eickenberg, E Oyallon… - ICML 2023 Workshop …, 2023 - hal.science
This paper presents an investigation of the challenges of training Deep Neural Networks
(DNNs) via self-supervised objectives, using local learning as a parallelizable alternative to …

Layer-parallel training of residual networks with auxiliary variable networks

Q Sun, H Dong, Z Chen, J Sun, Z Li… - Numerical Methods for …, 2024 - Wiley Online Library
Gradient-based methods for training residual networks (ResNets) typically require a forward
pass of input data, followed by back-propagating the error gradient to update model …

Contributions to Local, Asynchronous and Decentralized Learning, and to Geometric Deep Learning

E Oyallon - 2023 - hal.science
This document is a summary of some of the research I conducted from 2018 to 2023 to
obtain the Habilitation à Diriger des Recherches. All the results mentioned are discussed in …

Fully-Decentralized Training of GNNs using Layer-wise Self-Supervision

L Giaretta, S Girdzijauskas - 2023 - diva-portal.org
In existing literature, GNN training has been performed mostly in centralized, and sometimes
federated, settings. In this work, we consider a fully-decentralized data-private scenario …