Selecting suitable architecture parameters and training hyperparameters is essential for enhancing machine learning (ML) model performance. Several recent empirical studies …
We investigate the spectral properties of linear-width feed-forward neural networks, where the sample size is asymptotically proportional to network width. Empirically, we show that the …
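A convenient entry point to the kind of spectral analysis this line of work performs is the empirical spectral distribution (ESD) of a weight or Gram matrix in the proportional, linear-width regime. The sketch below is illustrative only; the dimensions, Gaussian initialization, and n/d ratio are assumptions, not the paper's setup:

```python
import numpy as np

# Linear-width regime: sample size n grows proportionally to width d (ratio assumed).
n, d = 1000, 500
rng = np.random.default_rng(0)

# Random Gaussian weights, scaled so the spectrum stays O(1).
W = rng.normal(0.0, 1.0, size=(n, d)) / np.sqrt(n)

# Empirical spectral distribution (ESD): eigenvalues of the Gram matrix W^T W.
esd = np.linalg.eigvalsh(W.T @ W)
print(f"lambda_min={esd.min():.3f}, lambda_max={esd.max():.3f}")
```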
Y Zhou, Y Yang, A Chang… - … on Machine Learning, 2023 - proceedings.mlr.press
Recent work has highlighted the complex influence training hyperparameters, e.g., the number of training epochs, can have on the prunability of machine learning models …
J Xu, W Chen, Y Zhao, Y Wei - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
The recent success of pre-trained foundation vision-language models makes Open-Vocabulary Segmentation (OVS) possible. Despite the promising performance, this approach introduces …
To better understand the good generalization performance of state-of-the-art neural network (NN) models, and in particular the success of the ALPHAHAT metric based on Heavy-Tailed …
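In the Heavy-Tailed Self-Regularization literature, ALPHAHAT combines the power-law exponent alpha fitted to a layer's eigenvalue spectrum with the log of its largest eigenvalue, averaged over layers. The sketch below uses the standard continuous power-law (Hill-type) MLE; the function name and quantile-based xmin are hypothetical, and the published metric's xmin selection and layer weighting differ:

```python
import numpy as np

def alpha_hat(weight_matrices, xmin_quantile=0.5):
    """Rough ALPHAHAT-style score: mean over layers of alpha * log10(lambda_max).

    alpha is fit to the upper tail of each layer's eigenvalue spectrum with the
    continuous power-law MLE. Illustrative sketch, not the published estimator.
    """
    scores = []
    for W in weight_matrices:
        # Eigenvalues of the layer's correlation matrix W^T W / N.
        N = max(W.shape)
        evals = np.linalg.eigvalsh(W.T @ W / N)
        evals = evals[evals > 0]
        # Fit the tail above a quantile-based xmin (assumed heuristic).
        xmin = np.quantile(evals, xmin_quantile)
        tail = evals[evals >= xmin]
        alpha = 1.0 + len(tail) / np.sum(np.log(tail / xmin))
        scores.append(alpha * np.log10(evals.max()))
    return float(np.mean(scores))

rng = np.random.default_rng(0)
layers = [rng.normal(size=(256, 128)) for _ in range(4)]
print(alpha_hat(layers))
```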
X Meng, J Yao - The Journal of Machine Learning Research, 2023 - dl.acm.org
Much recent research effort has been devoted to explaining the success of deep learning. Random Matrix Theory (RMT) provides an emerging way to this end by analyzing the …
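A common RMT baseline in these analyses is the Marchenko-Pastur (MP) law, which describes the ESD of a pure-noise weight matrix; deviations from MP, such as heavy tails or outlier spikes, are read as signatures of learning. A minimal sketch of the MP density, with matrix sizes and variance chosen for illustration:

```python
import numpy as np

def marchenko_pastur_pdf(lam, c, sigma2=1.0):
    """MP density for eigenvalues of W^T W / n, with W an n x d noise matrix, c = d/n <= 1."""
    lam_minus = sigma2 * (1 - np.sqrt(c)) ** 2
    lam_plus = sigma2 * (1 + np.sqrt(c)) ** 2
    inside = (lam > lam_minus) & (lam < lam_plus)
    pdf = np.zeros_like(lam, dtype=float)
    pdf[inside] = np.sqrt((lam_plus - lam[inside]) * (lam[inside] - lam_minus)) / (
        2 * np.pi * sigma2 * c * lam[inside]
    )
    return pdf

# Compare a noise matrix's ESD against the MP curve.
n, d = 2000, 500
rng = np.random.default_rng(1)
W = rng.normal(size=(n, d))
esd = np.linalg.eigvalsh(W.T @ W / n)
grid = np.linspace(esd.min(), esd.max(), 200)
print(marchenko_pastur_pdf(grid, c=d / n)[:5])
```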
Learning representations of well-trained neural network models holds the promise to provide an understanding of the inner workings of those models. However, previous work …
K Schürholt - arXiv preprint arXiv:2410.05107, 2024 - arxiv.org
This thesis addresses the challenge of understanding Neural Networks through the lens of their most fundamental component: the weights, which encapsulate the learned information …
ML Wolff, S Yang, K Torkkola, MW Mahoney - arXiv preprint arXiv …, 2025 - arxiv.org
Pre-trained Large Language Models (LLMs) encapsulate large amounts of knowledge and take enormous amounts of compute to train. We make use of this resource, together with the …