A neural collapse perspective on feature evolution in graph neural networks

V Kothapalli, T Tirer, J Bruna - Advances in Neural Information Processing Systems, 2024 - proceedings.neurips.cc
Graph neural networks (GNNs) have become increasingly popular for classification tasks on
graph-structured data. Yet, the interplay between graph topology and feature evolution in …
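
The snippet cuts off, but the topology–feature interplay it refers to runs through the graph-convolution update itself. A minimal sketch (standard GCN propagation in the style of Kipf & Welling, not code from this paper): each layer averages features over normalized neighborhoods before the learned transform, so the adjacency structure directly shapes how class features evolve toward or away from collapse.

```python
import numpy as np

# one graph-convolution step: features are mixed over neighborhoods,
# so graph topology directly shapes feature evolution across layers
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)   # toy adjacency matrix
A_hat = A + np.eye(4)                       # add self-loops
D_inv_sqrt = np.diag(1 / np.sqrt(A_hat.sum(1)))
S = D_inv_sqrt @ A_hat @ D_inv_sqrt         # symmetric normalization
H = np.random.randn(4, 8)                   # node features
W = np.random.randn(8, 8)                   # learned weight (random here)
H_next = np.maximum(S @ H @ W, 0)           # GCN layer with ReLU
```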

Deep neural collapse is provably optimal for the deep unconstrained features model

P Súkeník, M Mondelli… - Advances in Neural Information Processing Systems, 2024 - proceedings.neurips.cc
Neural collapse (NC) refers to the surprising structure of the last layer of deep neural
networks in the terminal phase of gradient descent training. Recently, an increasing amount …
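
For readers new to the setup: in the unconstrained features model (UFM), the last-layer features are free optimization variables rather than network outputs. A minimal one-layer sketch with MSE loss and regularization (the paper's deep variant stacks further layers; dimensions and hyperparameters here are illustrative, not the paper's):

```python
import torch

torch.manual_seed(0)
K, n, d = 4, 64, 16                              # classes, samples/class, feature dim
N = K * n
Y = torch.eye(K).repeat_interleave(n, dim=1)     # one-hot targets, K x N
H = (0.5 * torch.randn(d, N)).requires_grad_()   # "unconstrained" features
W = (0.5 * torch.randn(K, d)).requires_grad_()   # linear classifier
opt = torch.optim.Adam([H, W], lr=1e-2)
lam = 5e-3                                       # weight decay on both blocks
for _ in range(5000):
    loss = ((W @ H - Y) ** 2).sum() / (2 * N) \
        + lam / 2 * ((W ** 2).sum() + (H ** 2).sum())
    opt.zero_grad(); loss.backward(); opt.step()

# under neural collapse, features concentrate on their class means
labels = torch.arange(K).repeat_interleave(n)
feats = H.detach().T
mus = torch.stack([feats[labels == c].mean(0) for c in range(K)])
within = sum(((feats[labels == c] - mus[c]) ** 2).sum() for c in range(K))
print((within / (mus - mus.mean(0)).pow(2).sum()).item())   # -> near 0
```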

Understanding deep representation learning via layerwise feature compression and discrimination

P Wang, X Li, C Yaras, Z Zhu, L Balzano, W Hu… - arXiv preprint, 2023 - arxiv.org
Over the past decade, deep learning has proven to be a highly effective tool for learning
meaningful features from raw data. However, it remains an open question how deep …
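
The layerwise measurement itself is easy to state: track a within-/between-class variability ratio at every depth. A minimal sketch of the bookkeeping (untrained random layers on Gaussian class data, only to show the measurement; the paper's claim concerns how this ratio decays across the layers of a trained network):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
K, n, d = 3, 100, 20
means = 4 * torch.randn(K, d)
X = means.repeat_interleave(n, dim=0) + torch.randn(K * n, d)  # Gaussian classes
y = torch.arange(K).repeat_interleave(n)

def variability(Z, y, K):
    # within-class scatter measured against between-class scatter
    mus = torch.stack([Z[y == c].mean(0) for c in range(K)])
    within = sum(((Z[y == c] - mus[c]) ** 2).sum() for c in range(K))
    between = ((mus - mus.mean(0)) ** 2).sum()
    return (within / between).item()

layers = [nn.Sequential(nn.Linear(d, d), nn.ReLU()) for _ in range(6)]
Z = X
for depth, layer in enumerate(layers):
    Z = layer(Z)
    print(f"layer {depth}: within/between = {variability(Z, y, K):.3f}")
```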

Neural (tangent kernel) collapse

M Seleznova, D Weitzner, R Giryes… - Advances in Neural Information Processing Systems, 2024 - proceedings.neurips.cc
This work bridges two important concepts: the Neural Tangent Kernel (NTK), which captures
the evolution of deep neural networks (DNNs) during training, and the Neural Collapse (NC) …
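
The bridge starts from the empirical NTK, the Gram matrix of parameter gradients; collapse then shows up as block structure of this kernel aligned with class membership. A minimal sketch of the kernel computation for a scalar-output toy network (not the paper's architecture):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
net = nn.Sequential(nn.Linear(5, 64), nn.ReLU(), nn.Linear(64, 1))
params = list(net.parameters())

def grad_vec(x):
    # flattened parameter gradient of the scalar output f(x; theta)
    g = torch.autograd.grad(net(x).squeeze(), params)
    return torch.cat([gi.reshape(-1) for gi in g])

X = torch.randn(6, 5)
G = torch.stack([grad_vec(x.unsqueeze(0)) for x in X])
ntk = G @ G.T   # empirical NTK: Theta(x_i, x_j) = <grad f(x_i), grad f(x_j)>
print(ntk)
```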

Generalized neural collapse for a large number of classes

J Jiang, J Zhou, P Wang, Q Qu, D Mixon, C You… - arXiv preprint, 2023 - arxiv.org
Neural collapse provides an elegant mathematical characterization of learned last layer
representations (aka features) and classifier weights in deep classification models. Such …
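
Classical NC says the K class means settle into a simplex equiangular tight frame (ETF), which requires feature dimension d ≥ K − 1; the paper asks what replaces this geometry when the number of classes is large relative to d. For reference, a minimal check of the standard ETF construction:

```python
import numpy as np

K = 5
# simplex ETF: K equinorm vectors with maximal pairwise separation
M = np.sqrt(K / (K - 1)) * (np.eye(K) - np.ones((K, K)) / K)
G = M.T @ M
print(np.diag(G))   # all 1: equinorm
print(G[0, 1:])     # all -1/(K-1): equiangular, maximally separated
```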

Average gradient outer product as a mechanism for deep neural collapse

D Beaglehole, P Súkeník, M Mondelli… - arXiv preprint, 2024 - arxiv.org
Deep Neural Collapse (DNC) refers to the surprisingly rigid structure of the data
representations in the final layers of Deep Neural Networks (DNNs). Though the …
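
The object the title names is simple to compute: average the outer products of input gradients over the data. A minimal sketch on a toy network (the paper's point is that this matrix acts as a mechanism driving collapse across layers):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
net = nn.Sequential(nn.Linear(10, 32), nn.Tanh(), nn.Linear(32, 1))
X = torch.randn(200, 10)

def agop(model, X):
    # average gradient outer product: E_x[ grad_x f(x) grad_x f(x)^T ]
    G = torch.zeros(X.shape[1], X.shape[1])
    for x in X:
        x = x.clone().requires_grad_(True)
        (g,) = torch.autograd.grad(model(x).squeeze(), x)
        G += torch.outer(g, g)
    return G / len(X)

print(torch.linalg.eigvalsh(agop(net, X))[-3:])   # top AGOP eigenvalues
```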

The law of parsimony in gradient descent for learning deep linear networks

C Yaras, P Wang, W Hu, Z Zhu, L Balzano… - arXiv preprint, 2023 - arxiv.org
Over the past few years, an extensively studied phenomenon in training deep networks is
the implicit bias of gradient descent towards parsimonious solutions. In this work, we …
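
A concrete instance of the parsimony the abstract mentions: a deep linear network with small balanced initialization, trained to fit a rank-r map, keeps its end-to-end product approximately rank r. A minimal sketch under those assumptions (dimensions, init scale, and step counts are illustrative):

```python
import torch

torch.manual_seed(0)
d, L, r = 30, 3, 2
U = torch.linalg.qr(torch.randn(d, r))[0]
V = torch.linalg.qr(torch.randn(d, r))[0]
A = U @ V.T                                                     # rank-r target map
Ws = [(0.3 * torch.eye(d)).requires_grad_() for _ in range(L)]  # balanced init
opt = torch.optim.SGD(Ws, lr=0.1)
for _ in range(5000):
    P = Ws[0]
    for W in Ws[1:]:
        P = W @ P                                               # end-to-end product
    loss = ((P - A) ** 2).sum() / 2
    opt.zero_grad(); loss.backward(); opt.step()

with torch.no_grad():
    P = Ws[0]
    for W in Ws[1:]:
        P = W @ P
# roughly r singular values near 1; the rest stay near the init scale 0.3**L
print(torch.linalg.svdvals(P)[: r + 3])
```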

Linguistic Collapse: Neural Collapse in (Large) Language Models

R Wu, V Papyan - arXiv preprint arXiv:2405.17767, 2024 - arxiv.org
Neural collapse ($\mathcal{NC}$) is a phenomenon observed in classification tasks where
top-layer representations collapse into their class means, which become equinorm …
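
The properties the abstract lists (collapse to class means that become equinorm and equiangular) reduce to simple statistics of the class-mean geometry. A minimal sketch of those diagnostics (generic; the paper applies such measurements to the token-prediction features of language models):

```python
import numpy as np

def nc2_stats(means):
    """Equinorm / equiangularity diagnostics for class means (NC2-style)."""
    Mc = means - means.mean(0)                 # center at the global mean
    norms = np.linalg.norm(Mc, axis=1)
    U = Mc / norms[:, None]
    cos = U @ U.T
    off = cos[~np.eye(len(Mc), dtype=bool)]
    # both spreads -> 0 under collapse; cosines concentrate at -1/(K-1)
    return norms.std() / norms.mean(), off.std()

print(nc2_stats(np.random.randn(10, 64)))      # far from 0 for random means
```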

Quantifying the variability collapse of neural networks

J Xu, H Liu - International Conference on Machine Learning, 2023 - proceedings.mlr.press
Recent studies empirically demonstrate the positive relationship between the transferability
of neural networks and the in-class variation of the last layer features. The recently …
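
The common baseline statistic behind such studies is NC1 = tr(Σ_W Σ_B†)/K, within-class covariance measured against the between-class one (the paper proposes a refined metric; this is only the standard starting point). A minimal sketch:

```python
import numpy as np

def nc1(features, labels):
    """Baseline variability-collapse metric tr(Sigma_W Sigma_B^+) / K."""
    classes = np.unique(labels)
    mu_g = features.mean(0)
    d = features.shape[1]
    Sw, Sb = np.zeros((d, d)), np.zeros((d, d))
    for c in classes:
        Xc = features[labels == c]
        mu_c = Xc.mean(0)
        Sw += (Xc - mu_c).T @ (Xc - mu_c) / len(features)
        Sb += np.outer(mu_c - mu_g, mu_c - mu_g) / len(classes)
    return np.trace(Sw @ np.linalg.pinv(Sb)) / len(classes)

X, y = np.random.randn(300, 16), np.repeat(np.arange(3), 100)
print(nc1(X, y))   # large for random features, -> 0 under collapse
```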

Cross entropy versus label smoothing: A neural collapse perspective

L Guo, K Ross, Z Zhao, G Andriopoulos, S Ling… - arXiv preprint, 2024 - arxiv.org
Label smoothing loss is a widely adopted technique to mitigate overfitting in deep neural
networks. This paper studies label smoothing from the perspective of Neural Collapse (NC) …
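
For reference, label smoothing replaces the one-hot target with (1 − ε)·one-hot + (ε/K)·uniform mass, which is the loss the NC analysis is run against. A minimal sketch, checked against PyTorch's built-in option:

```python
import torch
import torch.nn.functional as F

def smoothed_ce(logits, target, eps=0.1):
    # cross entropy against (1 - eps) * one_hot + (eps / K) * uniform targets
    K = logits.shape[-1]
    logp = F.log_softmax(logits, dim=-1)
    nll = -logp.gather(-1, target.unsqueeze(-1)).squeeze(-1)
    return ((1 - eps) * nll - (eps / K) * logp.sum(-1)).mean()

logits = torch.randn(8, 10)
target = torch.randint(0, 10, (8,))
print(smoothed_ce(logits, target))
print(F.cross_entropy(logits, target, label_smoothing=0.1))   # matches
```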