Average gradient outer product as a mechanism for deep neural collapse

D Beaglehole, P Súkeník, M Mondelli… - arXiv preprint arXiv …, 2024 - arxiv.org
Deep Neural Collapse (DNC) refers to the surprisingly rigid structure of the data
representations in the final layers of Deep Neural Networks (DNNs). Though the …

Linguistic Collapse: Neural Collapse in (Large) Language Models

R Wu, V Papyan - arXiv preprint arXiv:2405.17767, 2024 - arxiv.org
Neural collapse ($\mathcal{NC}$) is a phenomenon observed in classification tasks where
top-layer representations collapse into their class means, which become equinorm …

Wide neural networks trained with weight decay provably exhibit neural collapse

A Jacot, P Súkeník, Z Wang, M Mondelli - arXiv preprint arXiv:2410.04887, 2024 - arxiv.org
Deep neural networks (DNNs) at convergence consistently represent the training data in the
last layer via a highly symmetric geometric structure referred to as neural collapse. This …

Neural collapse for unconstrained feature model under cross-entropy loss with imbalanced data

W Hong, S Ling - arXiv preprint arXiv:2309.09725, 2023 - arxiv.org
Recent years have witnessed the huge success of deep neural networks (DNNs) in various
computer vision and text processing tasks. Interestingly, these DNNs with massive …

Neural collapse for unconstrained feature model under cross-entropy loss with imbalanced data

W Hong, S Ling - Journal of Machine Learning Research, 2024 - jmlr.org
Neural Collapse (NC) is a fascinating phenomenon that arises during the terminal phase of
training (TPT) of deep neural networks (DNNs). Specifically, for balanced training datasets …

An Adaptive Tangent Feature Perspective of Neural Networks

D LeJeune, S Alemohammad - Conference on Parsimony …, 2024 - proceedings.mlr.press
In order to better understand feature learning in neural networks, we propose and study
linear models in tangent feature space where the features are allowed to be transformed …

Neural Collapse versus Low-rank Bias: Is Deep Neural Collapse Really Optimal?

P Súkeník, M Mondelli, C Lampert - arXiv preprint arXiv:2405.14468, 2024 - arxiv.org
Deep neural networks (DNNs) exhibit a surprising structure in their final layer known as
neural collapse (NC), and a growing body of work has recently investigated the …

Visualising Feature Learning in Deep Neural Networks by Diagonalizing the Forward Feature Map

Y Nam, C Mingard, SH Lee, S Hayou… - arXiv preprint arXiv …, 2024 - arxiv.org
Deep neural networks (DNNs) exhibit a remarkable ability to automatically learn data
representations, finding appropriate features without human input. Here we present a …

Engineering flexible machine learning systems by traversing functionally invariant paths

G Raghavan, B Tharwat, SN Hari, D Satani… - Nature Machine …, 2024 - nature.com
Contemporary machine learning algorithms train artificial neural networks by setting network
weights to a single optimized configuration through gradient descent on task-specific …

Analyzing training dynamics of deep neural networks: insights and limitations of the neural tangent kernel regime

M Seleznova - 2024 - edoc.ub.uni-muenchen.de
The widespread use of Deep Neural Networks (DNNs) in various applications has
underscored their effectiveness, yet the fundamental principles behind their success largely …