Training nonlinear transformers for efficient in-context learning: A theoretical learning and generalization analysis

H Li, M Wang, S Lu, X Cui, PY Chen - arXiv preprint arXiv …, 2024 - researchgate.net
Transformer-based large language models have displayed impressive in-context learning
capabilities, where a pre-trained model can handle new tasks without fine-tuning by simply …
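
The in-context learning setting the snippet describes can be made concrete with a minimal sketch: the pre-trained model is never fine-tuned, and the new task is specified entirely by demonstrations in the prompt. The prompt format and the `query_model` call are illustrative assumptions, standing in for any frozen language model's completion interface.

```python
# Minimal sketch of in-context learning: the task is defined only by
# in-prompt demonstrations; no gradient updates touch the model.

def build_icl_prompt(demos, query):
    """Concatenate (input, label) demonstrations followed by the test query."""
    lines = [f"Input: {x}\nOutput: {y}" for x, y in demos]
    lines.append(f"Input: {query}\nOutput:")
    return "\n\n".join(lines)

demos = [("the movie was great", "positive"),
         ("I hated every minute", "negative")]
prompt = build_icl_prompt(demos, "an unexpectedly moving film")
print(prompt)
# response = query_model(prompt)  # hypothetical call to a frozen pre-trained model
```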

Learning on transformers is provable low-rank and sparse: A one-layer analysis

H Li, M Wang, S Zhang, S Liu… - 2024 IEEE 13th Sensor …, 2024 - ieeexplore.ieee.org
Efficient training and inference algorithms, such as low-rank adaptation and model pruning,
have shown impressive performance for learning Transformer-based large foundation …
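
The low-rank adaptation the snippet names can be summarized in a short NumPy sketch: the pre-trained weight stays frozen and only a rank-r correction B @ A is trained, cutting the trainable parameter count to r*(d_in + d_out). The dimensions, zero-initialization of B, and scaling are illustrative assumptions, not the paper's exact setup.

```python
# Sketch of low-rank adaptation: frozen weight W plus a trainable
# low-rank update B @ A, with r much smaller than the layer width.
import numpy as np

d_out, d_in, r = 64, 64, 4
rng = np.random.default_rng(0)

W = rng.standard_normal((d_out, d_in))      # pre-trained weight, kept frozen
A = rng.standard_normal((r, d_in)) * 0.01   # trainable down-projection
B = np.zeros((d_out, r))                    # trainable up-projection, zero init

def adapted_forward(x):
    """Forward pass adding the low-rank correction to the frozen path."""
    return W @ x + B @ (A @ x)

x = rng.standard_normal(d_in)
print(adapted_forward(x).shape)  # (64,)
```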

Quantifying lottery tickets under label noise: accuracy, calibration, and complexity

V Arora, D Irto, S Goldt… - Uncertainty in Artificial …, 2023 - proceedings.mlr.press
Pruning deep neural networks is a widely used strategy to alleviate the computational
burden in machine learning. Overwhelming empirical evidence suggests that pruned …
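
The pruning studied in lottery-ticket work is typically magnitude pruning: the smallest-magnitude weights are zeroed and a binary mask records the surviving subnetwork. A minimal sketch follows; the 90% sparsity level is an arbitrary assumption for illustration.

```python
# Sketch of one-shot magnitude pruning: keep only the largest-magnitude
# weights and record the binary mask of the surviving subnetwork.
import numpy as np

def magnitude_prune(W, sparsity=0.9):
    """Return the pruned weight matrix and the mask of surviving entries."""
    threshold = np.quantile(np.abs(W), sparsity)
    mask = np.abs(W) >= threshold
    return W * mask, mask

rng = np.random.default_rng(0)
W = rng.standard_normal((128, 128))
W_pruned, mask = magnitude_prune(W)
print(f"surviving weights: {mask.mean():.1%}")  # ~10.0%
```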

How Do Nonlinear Transformers Learn and Generalize in In-Context Learning?

H Li, M Wang, S Lu, X Cui, PY Chen - Forty-first International Conference … - openreview.net
Transformer-based large language models have displayed impressive in-context learning
capabilities, where a pre-trained model can handle new tasks without fine-tuning by simply …

How Sparse Can We Prune A Deep Network: A Fundamental Limit Viewpoint

Q Zhang, R Zhang, J Sun, Y Liu - arXiv preprint arXiv:2306.05857, 2023 - arxiv.org
Network pruning is an effective measure to alleviate the storage and computational burden
of deep neural networks arising from their high overparameterization. This raises a …

How Sparse Can We Prune A Deep Network: A Geometric Viewpoint

Q Zhang, R Zhang, J Sun, Y Liu - 2023 - openreview.net
Network pruning constitutes an effective measure to alleviate the storage and computational
burden of deep neural networks, which arises from their overparameterization. A fundamental …
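
Both versions of this paper ask how sparse a network can become before it breaks down. A pruning-ratio sweep makes the question concrete; the retained-norm proxy below is an illustrative assumption and not the papers' actual limit criterion.

```python
# Sketch of a sparsity sweep: prune at increasing ratios and track how
# much of the weight matrix (by Frobenius norm) survives at each level.
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((256, 256))
total = np.linalg.norm(W)

for sparsity in (0.5, 0.9, 0.99):
    threshold = np.quantile(np.abs(W), sparsity)
    retained = np.linalg.norm(W * (np.abs(W) >= threshold))
    print(f"sparsity {sparsity:.0%}: retained norm {retained / total:.2f}")
```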