Towards tracing trustworthiness dynamics: Revisiting pre-training period of large language models

C Qian, J Zhang, W Yao, D Liu, Z Yin, Y Qiao… - arXiv preprint arXiv …, 2024 - arxiv.org
Ensuring the trustworthiness of large language models (LLMs) is crucial. Most studies
concentrate on fully pre-trained LLMs to better understand and improve LLMs' …

Learning in PINNs: Phase transition, total diffusion, and generalization

SJ Anagnostopoulos, JD Toscano… - arXiv preprint arXiv …, 2024 - arxiv.org
We investigate the learning dynamics of fully-connected neural networks through the lens of
gradient signal-to-noise ratio (SNR), examining the behavior of first-order optimizers like …
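
The SNR in question compares the gradient's mean ("signal") to its spread ("noise") across mini-batches. Below is a minimal sketch of one common per-parameter formulation, computed on a toy linear-regression problem invented here for illustration; the paper's models and exact definition may differ.

import numpy as np

rng = np.random.default_rng(0)

# Toy regression problem: y = X @ w_true + noise.
X = rng.normal(size=(512, 8))
w_true = rng.normal(size=8)
y = X @ w_true + 0.1 * rng.normal(size=512)

w = np.zeros(8)  # current parameters of a linear model

def batch_gradient(Xb, yb, w):
    """Gradient of the mean-squared-error loss on one mini-batch."""
    residual = Xb @ w - yb
    return 2.0 * Xb.T @ residual / len(yb)

# Collect per-batch gradients at the current point in parameter space.
batch_size = 32
grads = np.array([
    batch_gradient(X[i:i + batch_size], y[i:i + batch_size], w)
    for i in range(0, len(X), batch_size)
])

# Gradient SNR: |mean| / std of the gradient across mini-batches, per parameter.
snr = np.abs(grads.mean(axis=0)) / (grads.std(axis=0) + 1e-12)
print("per-parameter gradient SNR:", np.round(snr, 2))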

Importance-aware information bottleneck learning paradigm for lip reading

C Sheng, L Liu, W Deng, L Bai, Z Liu… - IEEE Transactions …, 2022 - ieeexplore.ieee.org
Lip reading is the task of decoding text from speakers' mouth movements. Numerous deep
learning-based methods have been proposed to address this task. However, these existing …

Exact and soft successive refinement of the information bottleneck

H Charvin, N Catenacci Volpi, D Polani - Entropy, 2023 - mdpi.com
The information bottleneck (IB) framework formalises the essential requirement for efficient
information processing systems to achieve an optimal balance between the complexity of …
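
The balance the abstract refers to is conventionally written as the IB Lagrangian, minimising I(X;T) - beta * I(T;Y) over encoders p(t|x). Below is a minimal sketch of the classical self-consistent (Blahut-Arimoto-style) iteration for discrete variables; the joint distribution and cardinalities are made up for illustration, and this is the textbook iterative-IB scheme, not the successive-refinement method of this paper.

import numpy as np

rng = np.random.default_rng(1)
eps = 1e-12

# Made-up discrete joint p(x, y) over 6 inputs and 3 relevance labels.
p_xy = rng.random((6, 3))
p_xy /= p_xy.sum()
p_x = p_xy.sum(axis=1)
p_y = p_xy.sum(axis=0)
p_y_given_x = p_xy / p_x[:, None]

n_t, beta = 4, 5.0                   # bottleneck size, trade-off weight
p_t_given_x = rng.random((6, n_t))   # random initial encoder
p_t_given_x /= p_t_given_x.sum(axis=1, keepdims=True)

for _ in range(300):
    p_t = p_x @ p_t_given_x                                  # marginal p(t)
    p_x_given_t = p_t_given_x * p_x[:, None] / (p_t + eps)   # Bayes: p(x|t)
    p_y_given_t = p_x_given_t.T @ p_y_given_x                # decoder p(y|t)
    # Self-consistent update: p(t|x) proportional to
    # p(t) * exp(-beta * KL(p(y|x) || p(y|t))).
    kl = np.sum(p_y_given_x[:, None, :]
                * np.log((p_y_given_x[:, None, :] + eps)
                         / (p_y_given_t[None, :, :] + eps)), axis=-1)
    p_t_given_x = p_t[None, :] * np.exp(-beta * kl)
    p_t_given_x /= p_t_given_x.sum(axis=1, keepdims=True)

p_t = p_x @ p_t_given_x
p_xt = p_t_given_x * p_x[:, None]
p_ty = p_xt.T @ p_y_given_x
i_xt = np.sum(p_xt * np.log((p_xt + eps) / (np.outer(p_x, p_t) + eps)))
i_ty = np.sum(p_ty * np.log((p_ty + eps) / (np.outer(p_t, p_y) + eps)))
print(f"complexity I(X;T) = {i_xt:.3f} nats, relevance I(T;Y) = {i_ty:.3f} nats")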

Information Bottleneck Analysis of Deep Neural Networks via Lossy Compression

I Butakov, A Tolmachev, S Malanchuk… - arXiv preprint arXiv …, 2023 - arxiv.org
The Information Bottleneck (IB) principle offers an information-theoretic framework for
analyzing the training process of deep neural networks (DNNs). Its essence lies in tracking …
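
Tracking these quantities means estimating mutual information between high-dimensional activations and inputs or labels. Below is a minimal sketch of the crude binning estimator used in early IB analyses of DNNs, run on synthetic "activations" and labels; it is a baseline for context, not the lossy-compression estimator proposed in the paper.

import numpy as np

def mi_binned(t, y, n_bins=30):
    """Plug-in MI estimate between binned activations t and discrete labels y."""
    # Discretise every activation dimension, then hash each row to one symbol.
    edges = np.linspace(t.min(), t.max(), n_bins + 1)
    codes = np.digitize(t, edges)
    _, t_ids = np.unique(codes, axis=0, return_inverse=True)
    t_ids = t_ids.ravel()

    # Joint histogram over (binned activation pattern, label).
    joint = np.zeros((t_ids.max() + 1, y.max() + 1))
    np.add.at(joint, (t_ids, y), 1.0)
    joint /= joint.sum()
    pt, py = joint.sum(1, keepdims=True), joint.sum(0, keepdims=True)
    nz = joint > 0
    return np.sum(joint[nz] * np.log(joint[nz] / (pt @ py)[nz]))

rng = np.random.default_rng(2)
y = rng.integers(0, 2, size=2000)                  # binary labels
t = y[:, None] + 0.8 * rng.normal(size=(2000, 3))  # fake "layer activations"
print(f"I(T;Y) ~ {mi_binned(t, y):.3f} nats")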

Understanding and Leveraging the Learning Phases of Neural Networks

J Schneider, M Prabhushankar - … of the AAAI Conference on Artificial …, 2024 - ojs.aaai.org
The learning dynamics of deep neural networks are not well understood. The information
bottleneck (IB) theory posits separate fitting and compression phases, but these have …

Splitting of Composite Neural Networks via Proximal Operator With Information Bottleneck

SI Han, K Nakamura, BW Hong - IEEE Access, 2023 - ieeexplore.ieee.org
Deep learning has achieved remarkable success in the field of machine learning, made
possible by the emergence of efficient optimization methods such as Stochastic Gradient …

End-to-End Training Induces Information Bottleneck through Layer-Role Differentiation: A Comparative Analysis with Layer-wise Training

K Sakamoto, I Sato - arXiv preprint arXiv:2402.09050, 2024 - arxiv.org
End-to-end (E2E) training, optimizing the entire model through error backpropagation,
fundamentally supports the advancements of deep learning. Despite its high performance …
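
The comparison at stake is between backpropagating one global loss through every layer and training each layer greedily against its own local head before freezing it. Below is a minimal PyTorch sketch of both regimes, with a placeholder architecture and random data standing in for the paper's setup.

import torch
import torch.nn as nn

torch.manual_seed(0)
X = torch.randn(256, 20)
y = torch.randint(0, 4, (256,))

def make_layers():
    return nn.ModuleList([
        nn.Sequential(nn.Linear(20, 32), nn.ReLU()),
        nn.Sequential(nn.Linear(32, 32), nn.ReLU()),
    ])

# End-to-end: a single loss, gradients flow through all layers at once.
layers, head = make_layers(), nn.Linear(32, 4)
opt = torch.optim.SGD(list(layers.parameters()) + list(head.parameters()), lr=0.1)
for _ in range(100):
    h = X
    for layer in layers:
        h = layer(h)
    loss = nn.functional.cross_entropy(head(h), y)
    opt.zero_grad()
    loss.backward()
    opt.step()

# Layer-wise: each layer trains with its own auxiliary classifier, then is
# frozen; no error signal ever crosses a layer boundary.
layers = make_layers()
h = X
for layer in layers:
    aux = nn.Linear(layer[0].out_features, 4)
    opt = torch.optim.SGD(list(layer.parameters()) + list(aux.parameters()), lr=0.1)
    for _ in range(100):
        loss = nn.functional.cross_entropy(aux(layer(h)), y)
        opt.zero_grad()
        loss.backward()
        opt.step()
    h = layer(h).detach()  # downstream layers see fixed features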

On Neural Networks Fitting, Compression, and Generalization Behavior via Information-Bottleneck-like Approaches

Z Lyu, G Aminian, MRD Rodrigues - Entropy, 2023 - mdpi.com
It is well-known that a neural network learning process—along with its connections to fitting,
compression, and generalization—is not yet well understood. In this paper, we propose a …

One Flip Away from Chaos: Unraveling Single Points of Failure in Quantized DNNs

C Gongye, Y Fei - … on Hardware Oriented Security and Trust …, 2024 - ieeexplore.ieee.org
Deep Neural Networks (DNNs) have become integral to security-sensitive and mission-
critical tasks due to their remarkable performance. However, their deployment faces various …
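
The failure mode hinges on how far a single flipped storage bit can move a quantized weight. Below is a minimal sketch with an 8-bit two's-complement code under an assumed symmetric per-tensor scale; the fault model and attack surface in the paper are more involved.

SCALE = 0.02                               # assumed per-tensor quantization scale
w = 0.5                                    # original floating-point weight
q = max(-128, min(127, round(w / SCALE)))  # int8 code: 25

def flip_bit(code: int, bit: int) -> int:
    """Flip one bit in the 8-bit two's-complement storage of an int8 code."""
    raw = (code & 0xFF) ^ (1 << bit)
    return raw - 256 if raw >= 128 else raw

# Flipping the sign bit (bit 7) turns 25 into -103: the dequantized weight
# jumps from +0.500 to -2.060, a swing across most of the representable range.
for bit in range(8):
    f = flip_bit(q, bit)
    print(f"bit {bit}: code {q:4d} -> {f:4d}  "
          f"(weight {q * SCALE:+.3f} -> {f * SCALE:+.3f})")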