Deep neural collapse is provably optimal for the deep unconstrained features model

P Súkeník, M Mondelli… - Advances in Neural …, 2024 - proceedings.neurips.cc
Neural collapse (NC) refers to the surprising structure of the last layer of deep neural
networks in the terminal phase of gradient descent training. Recently, an increasing amount …
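
For orientation: the structure this abstract alludes to is usually summarized by four conditions, NC1–NC4 (Papyan, Han & Donoho, 2020). Below is a minimal LaTeX sketch of the first two, in notation we introduce here rather than take from the snippet ($h_{i,c}$ the last-layer feature of sample $i$ in class $c$, $\mu_c$ the class means, $\mu_G$ the global mean, $C$ the number of classes):

  \begin{align*}
    \text{(NC1)}\quad & \Sigma_W := \operatorname{Avg}_{i,c}\,(h_{i,c}-\mu_c)(h_{i,c}-\mu_c)^\top \;\to\; 0, \\
    \text{(NC2)}\quad & \frac{\langle \mu_c-\mu_G,\; \mu_{c'}-\mu_G\rangle}{\lVert \mu_c-\mu_G\rVert\,\lVert \mu_{c'}-\mu_G\rVert} \;\to\; -\frac{1}{C-1} \qquad (c\neq c'),
  \end{align*}

i.e., within-class variability vanishes and the centered class means become an equinorm, equiangular (simplex ETF) configuration; NC3 and NC4 add classifier self-duality and nearest-class-mean decisions.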

Principled and efficient transfer learning of deep models via neural collapse

X Li, S Liu, J Zhou, X Lu, C Fernandez-Granda… - arXiv preprint arXiv …, 2022 - arxiv.org
As model size continues to grow and access to labeled training data remains limited,
transfer learning has become a popular approach in many scientific and engineering fields …

Understanding deep representation learning via layerwise feature compression and discrimination

P Wang, X Li, C Yaras, Z Zhu, L Balzano, W Hu… - arXiv preprint arXiv …, 2023 - arxiv.org
Over the past decade, deep learning has proven to be a highly effective tool for learning
meaningful features from raw data. However, it remains an open question how deep …

Neural collapse in multi-label learning with pick-all-label loss

P Li, X Li, Y Wang, Q Qu - arXiv preprint arXiv:2310.15903, 2023 - arxiv.org
We study deep neural networks for the multi-label classification (MLab) task through the lens
of neural collapse (NC). Previous works have been restricted to the multi-class classification …

Average gradient outer product as a mechanism for deep neural collapse

D Beaglehole, P Súkeník, M Mondelli… - arXiv preprint arXiv …, 2024 - arxiv.org
Deep Neural Collapse (DNC) refers to the surprisingly rigid structure of the data
representations in the final layers of Deep Neural Networks (DNNs). Though the …
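
The abstract names the average gradient outer product (AGOP) as the proposed mechanism. As a rough illustration only (not the paper's code; jacobian_fn is a hypothetical helper supplied by the caller), the AGOP of a map f averages the gradient outer products J(x)^T J(x) over inputs:

  import numpy as np

  def agop(jacobian_fn, X):
      # Average Gradient Outer Product of a map f: R^d -> R^m,
      # G = (1/n) * sum_i J(x_i)^T J(x_i), where J(x) is the Jacobian of f at x.
      d = X.shape[1]
      G = np.zeros((d, d))
      for x in X:
          J = jacobian_fn(x)   # shape (m, d)
          G += J.T @ J         # accumulate gradient outer products
      return G / X.shape[0]

  # Sanity check on a linear map f(x) = W x, whose Jacobian is W everywhere,
  # so the AGOP equals W^T W regardless of the inputs.
  W = np.random.randn(3, 5)
  X = np.random.randn(10, 5)
  assert np.allclose(agop(lambda x: W, X), W.T @ W)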

The law of parsimony in gradient descent for learning deep linear networks

C Yaras, P Wang, W Hu, Z Zhu, L Balzano… - arXiv preprint arXiv …, 2023 - arxiv.org
Over the past few years, an extensively studied phenomenon in training deep networks is
the implicit bias of gradient descent towards parsimonious solutions. In this work, we …

Linguistic Collapse: Neural Collapse in (Large) Language Models

R Wu, V Papyan - arXiv preprint arXiv:2405.17767, 2024 - arxiv.org
Neural collapse ($\mathcal{NC}$) is a phenomenon observed in classification tasks where
top-layer representations collapse into their class means, which become equinorm …
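
A common way to quantify the collapse this abstract describes is the within- over between-class variability ratio; here is a minimal NumPy sketch of that metric under the standard definitions (our illustration, not code from the paper):

  import numpy as np

  def nc1_metric(H, y):
      # tr(Sigma_W @ pinv(Sigma_B)): small values indicate that features
      # have collapsed onto their class means.
      # H: (n, d) feature matrix, y: (n,) integer class labels.
      classes = np.unique(y)
      mu_G = H.mean(axis=0)
      d = H.shape[1]
      Sw, Sb = np.zeros((d, d)), np.zeros((d, d))
      for c in classes:
          Hc = H[y == c]
          mu_c = Hc.mean(axis=0)
          Sw += (Hc - mu_c).T @ (Hc - mu_c) / H.shape[0]          # within-class
          Sb += np.outer(mu_c - mu_G, mu_c - mu_G) / len(classes)  # between-class
      return np.trace(Sw @ np.linalg.pinv(Sb))

  # Random features show no collapse (large value); class-constant features would give ~0.
  H = np.random.randn(200, 16)
  y = np.repeat(np.arange(4), 50)
  print(nc1_metric(H, y))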

Residual alignment: uncovering the mechanisms of residual networks

J Li, V Papyan - Advances in Neural Information Processing …, 2024 - proceedings.neurips.cc
The ResNet architecture has been widely adopted in deep learning due to its significant
boost to performance through the use of simple skip connections, yet the underlying …

A Global Geometric Analysis of Maximal Coding Rate Reduction

P Wang, H Liu, D Pai, Y Yu, Z Zhu, Q Qu… - arXiv preprint arXiv …, 2024 - arxiv.org
The maximal coding rate reduction (MCR$^2$) objective for learning structured and
compact deep representations is drawing increasing attention, especially after its recent …
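
The snippet only names the objective; for reference, the MCR$^2$ objective as stated in the original coding-rate paper (Yu et al., 2020), with $Z\in\mathbb{R}^{d\times n}$ the feature matrix, $\Pi_j$ the diagonal membership matrix of class $j$, and $\epsilon$ a distortion parameter:

  \begin{align*}
    \Delta R(Z,\Pi,\epsilon)
      = \frac{1}{2}\log\det\!\Big(I + \frac{d}{n\epsilon^2}\,ZZ^\top\Big)
      - \sum_{j=1}^{k}\frac{\operatorname{tr}(\Pi_j)}{2n}
        \log\det\!\Big(I + \frac{d}{\operatorname{tr}(\Pi_j)\,\epsilon^2}\,Z\Pi_j Z^\top\Big),
  \end{align*}

maximized over $Z$ so that the representation as a whole expands while each class is compressed.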

Wide neural networks trained with weight decay provably exhibit neural collapse

A Jacot, P Súkeník, Z Wang, M Mondelli - arXiv preprint arXiv:2410.04887, 2024 - arxiv.org
Deep neural networks (DNNs) at convergence consistently represent the training data in the
last layer via a highly symmetric geometric structure referred to as neural collapse. This …