Can forward gradient match backpropagation?

L Fournier, S Rivaud, E Belilovsky… - International …, 2023 - proceedings.mlr.press
Forward Gradients, the idea of using directional derivatives in forward differentiation mode,
have recently been shown to be utilizable for neural network training while avoiding …

DADAO: Decoupled accelerated decentralized asynchronous optimization

A Nabli, E Oyallon - International Conference on Machine …, 2023 - proceedings.mlr.press
This work introduces DADAO: the first decentralized, accelerated, asynchronous, primal, first-
order algorithm to minimize a sum of $L$-smooth and $\mu$-strongly convex functions …

A2CiD2: Accelerating Asynchronous Communication in Decentralized Deep Learning

A Nabli, E Belilovsky, E Oyallon - Advances in Neural …, 2024 - proceedings.neurips.cc
Distributed training of Deep Learning models has been critical to many recent
successes in the field. Current standard methods primarily rely on synchronous centralized …

Brain-Inspired Machine Intelligence: A Survey of Neurobiologically-Plausible Credit Assignment

AG Ororbia - arXiv preprint arXiv:2312.09257, 2023 - arxiv.org
In this survey, we examine algorithms for conducting credit assignment in artificial neural
networks that are inspired or motivated by neurobiology, unifying these various processes …

PETRA: Parallel End-to-end Training with Reversible Architectures

S Rivaud, L Fournier, T Pumir, E Belilovsky… - arXiv preprint arXiv …, 2024 - arxiv.org
Reversible architectures have been shown to be capable of performing on par with their non-
reversible counterparts, being applied in deep learning for memory savings and generative …

Preventing dimensional collapse in contrastive local learning with subsampling

L Fournier, A Patel, M Eickenberg, E Oyallon… - ICML 2023 Workshop …, 2023 - hal.science
This paper presents an investigation of the challenges of training Deep Neural Networks
(DNNs) via self-supervised objectives, using local learning as a parallelizable alternative to …

Layer-parallel training of residual networks with auxiliary variable networks

Q Sun, H Dong, Z Chen, J Sun, Z Li… - Numerical Methods for …, 2024 - Wiley Online Library
Gradient-based methods for training residual networks (ResNets) typically require a forward
pass of input data, followed by back-propagating the error gradient to update model …

Contributions to Local, Asynchronous and Decentralized Learning, and to Geometric Deep Learning

E Oyallon - 2023 - hal.science
This document is a summary of some of the research I conducted from 2018 to 2023 to
obtain the Habilitation à Diriger des Recherches. All the results mentioned are discussed in …

Fully-Decentralized Training of GNNs using Layer-wise Self-Supervision

L Giaretta, S Girdzijauskas - 2023 - diva-portal.org
In existing literature, GNN training has been performed mostly in centralized, and sometimes
federated, settings. In this work, we consider a fully-decentralized data-private scenario …