Generalization bounds: Perspectives from information theory and PAC-Bayes

F Hellström, G Durisi, B Guedj… - Foundations and Trends® in Machine Learning, 2025 - nowpublishers.com
A fundamental question in theoretical machine learning is generalization. Over the past
decades, the PAC-Bayesian approach has been established as a flexible framework to …
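
As context for this entry: one classical PAC-Bayesian result is McAllester's bound; exact constants vary across statements in the literature, so the following is a representative form rather than the monograph's own. For a prior P fixed before seeing an i.i.d. sample S of size n, with probability at least 1 - \delta, simultaneously for all posteriors Q,

    E_{h \sim Q}[R(h)] \;\le\; E_{h \sim Q}[\hat{R}_S(h)] \;+\; \sqrt{\frac{\mathrm{KL}(Q \,\|\, P) + \ln(2\sqrt{n}/\delta)}{2n}},

where R and \hat{R}_S denote the population and empirical risk, respectively.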

Going deeper, generalizing better: An information-theoretic view for deep learning

J Zhang, T Liu, D Tao - IEEE Transactions on Neural Networks …, 2023 - ieeexplore.ieee.org
Deep learning has transformed computer vision, natural language processing, and speech
recognition. However, two critical questions remain open: 1) why do deep neural …

Slicing Mutual Information Generalization Bounds for Neural Networks

K Nadjahi, K Greenewald, RB Gabrielsson… - arXiv preprint arXiv …, 2024 - arxiv.org
The ability of machine learning (ML) algorithms to generalize well to unseen data has been
studied through the lens of information theory, by bounding the generalization error with the …
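
The truncated sentence above refers to bounds in the style of Xu and Raginsky (2017): if the loss is \sigma-sub-Gaussian, the expected generalization error of an algorithm that outputs weights W from a training sample S of size n satisfies

    \big| \mathbb{E}[\mathrm{gen}(W, S)] \big| \;\le\; \sqrt{\frac{2\sigma^2}{n} \, I(W; S)}.

The slicing idea summarized in the abstract replaces the high-dimensional (and often hard-to-estimate) mutual information I(W; S) with mutual information between random low-dimensional projections; this gloss is a reading of the abstract, not a statement of the paper's exact theorem.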

SoK: Memorisation in machine learning

D Usynin, M Knolle, G Kaissis - arXiv preprint arXiv:2311.03075, 2023 - arxiv.org
Quantifying the impact of individual data samples on machine learning models is an open
research problem. This is particularly relevant when complex and high-dimensional …
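
One widely used formalization of a single sample's impact, due to Feldman (2020) and presumably among those the survey covers, is the leave-one-out memorization score of example i under learning algorithm A and dataset S:

    \mathrm{mem}(A, S, i) \;=\; \Pr_{h \sim A(S)}\big[h(x_i) = y_i\big] \;-\; \Pr_{h \sim A(S^{\setminus i})}\big[h(x_i) = y_i\big],

where S^{\setminus i} is S with the i-th example removed. Large values indicate that the label y_i is predicted chiefly because the example was present during training.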

Predicting and analyzing memorization within fine-tuned Large Language Models

J Dentan, D Buscaldi, A Shabou, S Vanier - arXiv preprint arXiv …, 2024 - arxiv.org
Large Language Models have received significant attention due to their ability to solve a
wide range of complex tasks. However, these models memorize a significant proportion of …

On Neural Networks Fitting, Compression, and Generalization Behavior via Information-Bottleneck-like Approaches

Z Lyu, G Aminian, MRD Rodrigues - Entropy, 2023 - mdpi.com
It is widely acknowledged that the neural network learning process, along with its connections
to fitting, compression, and generalization, is not yet well understood. In this paper, we propose a …
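
For reference, information-bottleneck-like analyses of this kind are built around the IB Lagrangian for an internal representation T of input X with target Y (the standard objective, not necessarily the paper's proposed variant):

    \min_{p(t \mid x)} \; I(X; T) \;-\; \beta \, I(T; Y),

where \beta > 0 trades off compression of the input against preservation of label-relevant information, under the Markov chain Y \leftrightarrow X \leftrightarrow T.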

Low-dimensional intrinsic dimension reveals a phase transition in gradient-based learning of deep neural networks

C Tan, J Zhang, J Liu, Z Zhao - International Journal of Machine Learning …, 2024 - Springer
Deep neural networks perform feature extraction by propagating inputs through multiple
modules. However, how the representations evolve with the gradient-based …
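
Studies of how representation geometry evolves during training commonly rely on intrinsic-dimension estimators. Below is a minimal Python sketch of the TwoNN estimator of Facco et al. (2017), a frequent choice in such analyses; that this specific estimator matches the paper's tooling is an assumption.

    import numpy as np
    from scipy.spatial import cKDTree

    def twonn_intrinsic_dimension(X):
        """TwoNN: infer intrinsic dimension from the ratio of each point's
        second- and first-nearest-neighbor distances (assumes no duplicate rows)."""
        tree = cKDTree(X)
        # k=3 returns the distance to self (zero) plus the two nearest neighbors
        dists, _ = tree.query(X, k=3)
        mu = dists[:, 2] / dists[:, 1]
        # maximum-likelihood estimate: d = N / sum_i log(mu_i)
        return len(X) / np.log(mu).sum()

Applied to hidden activations collected at successive checkpoints, this traces how the dimensionality of the learned representations changes over training.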

Understanding activation patterns in artificial neural networks by exploring stochastic processes: Discriminating generalization from memorization

SJ Lehmler, M Saif-ur-Rehman, T Glasmachers… - Neurocomputing, 2024 - Elsevier
To gain a deeper understanding of the behavior and learning dynamics of artificial neural
networks, mathematical abstractions and models are valuable. They provide a simplified …

Error Bounds of Supervised Classification from Information-Theoretic Perspective

B Qi, W Gong, L Li - arXiv preprint arXiv:2406.04567, 2024 - arxiv.org
Several research questions on deep learning (DL) remain unanswered, including the
remarkable generalization power of overparametrized neural networks, the efficient …

Pointwise Sliced Mutual Information for Neural Network Explainability

S Wongso, R Ghosh, M Motani - 2023 IEEE International …, 2023 - ieeexplore.ieee.org
When deploying deep learning models such as convolutional neural networks (CNNs) in
safety-critical domains, it is important to understand the predictions made by these black-box …
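
To make the slicing idea concrete, here is a minimal sketch of a Monte Carlo sliced mutual information estimator between two batches of vectors (e.g., inputs and activations), built on scikit-learn's k-NN mutual-information estimator. The function name is an illustrative choice, and the paper's pointwise variant would further decompose this quantity per sample.

    import numpy as np
    from sklearn.feature_selection import mutual_info_regression

    def sliced_mutual_information(X, Y, n_projections=128, seed=0):
        """Monte Carlo estimate of SI(X; Y) = E[I(theta^T X; phi^T Y)]
        over theta, phi drawn uniformly from the unit spheres."""
        rng = np.random.default_rng(seed)
        total = 0.0
        for _ in range(n_projections):
            theta = rng.standard_normal(X.shape[1])
            phi = rng.standard_normal(Y.shape[1])
            u = X @ (theta / np.linalg.norm(theta))  # 1-D slice of X
            v = Y @ (phi / np.linalg.norm(phi))      # 1-D slice of Y
            # Kraskov-style k-NN estimate of I(u; v) between two scalars
            total += mutual_info_regression(u.reshape(-1, 1), v)[0]
        return total / n_projections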