Unleashing the power of data tsunami: A comprehensive survey on data assessment and selection for instruction tuning of language models

Y Qin, Y Yang, P Guo, G Li, H Shao, Y Shi, Z Xu… - arXiv preprint arXiv …, 2024 - arxiv.org
Instruction tuning plays a critical role in aligning large language models (LLMs) with human
preferences. Despite the vast number of open instruction datasets, naively training an LLM on …

Test accuracy vs. generalization gap: Model selection in nlp without accessing training or testing data

Y Yang, R Theisen, L Hodgkinson… - Proceedings of the 29th …, 2023 - dl.acm.org
Selecting suitable architecture parameters and training hyperparameters is essential for
enhancing machine learning (ML) model performance. Several recent empirical studies …

Spectral evolution and invariance in linear-width neural networks

Z Wang, A Engel, AD Sarwate… - Advances in neural …, 2023 - proceedings.neurips.cc
We investigate the spectral properties of linear-width feed-forward neural networks, where
the sample size is asymptotically proportional to network width. Empirically, we show that the …
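
As background for the linear-width regime this entry studies, a minimal sketch assuming only standard Random Matrix Theory facts (not the paper's experiments): at Gaussian initialization, the eigenvalue spectrum of W^T W concentrates on the Marchenko-Pastur support set by the sample-to-width ratio, and eigenvalues escaping that support are one signal of training-induced structure. The shapes, seed, and variance scaling below are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 4000, 1000            # sample-to-width ratio q = n/d = 4 (illustrative)
q = n / d

# Random weight matrix with entry variance 1/n, so W^T W is a sample-covariance
# style Gram matrix whose limiting spectrum is Marchenko-Pastur.
W = rng.normal(0.0, 1.0 / np.sqrt(n), size=(n, d))
eigs = np.linalg.eigvalsh(W.T @ W)

# Marchenko-Pastur support edges for aspect ratio d/n = 1/q:
lam_minus = (1 - 1 / np.sqrt(q)) ** 2
lam_plus = (1 + 1 / np.sqrt(q)) ** 2
print(f"empirical range: [{eigs.min():.3f}, {eigs.max():.3f}]")
print(f"MP support:      [{lam_minus:.3f}, {lam_plus:.3f}]")
```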

A three-regime model of network pruning

Y Zhou, Y Yang, A Chang… - … on Machine Learning, 2023 - proceedings.mlr.press
Recent work has highlighted the complex influence that training hyperparameters, e.g., the
number of training epochs, can have on the prunability of machine learning models …

Transferable and Principled Efficiency for Open-Vocabulary Segmentation

J Xu, W Chen, Y Zhao, Y Wei - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
The recent success of pre-trained foundation vision-language models makes Open-Vocabulary
Segmentation (OVS) possible. Despite the promising performance, this approach introduces …

Post-mortem on a deep learning contest: a Simpson's paradox and the complementary roles of scale metrics versus shape metrics

CH Martin, MW Mahoney - arXiv preprint arXiv:2106.00734, 2021 - arxiv.org
To better understand good generalization performance in state-of-the-art neural network
(NN) models, and in particular the success of the ALPHAHAT metric based on Heavy-Tailed …
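
Since the snippet names the ALPHAHAT metric from the Heavy-Tailed Self-Regularization line of work, a hedged sketch of its general shape may help: fit a power-law exponent alpha to the tail of a layer's eigenvalue spectrum, then scale by log10 of the largest eigenvalue (alpha-hat = alpha * log10(lambda_max)). The fixed tail fraction and the closed-form MLE below are simplifications, not the published fitting procedure, which selects the tail cutoff by a goodness-of-fit search.

```python
import numpy as np

def alpha_hat(W: np.ndarray, tail_frac: float = 0.1) -> float:
    """Illustrative HT-SR-style metric: power-law exponent of the
    eigenvalue tail of W^T W, scaled by log10(lambda_max)."""
    eigs = np.sort(np.linalg.eigvalsh(W.T @ W))[::-1]   # descending
    k = max(int(tail_frac * len(eigs)), 2)
    tail = eigs[:k]
    x_min = tail[-1]
    # Continuous-MLE power-law exponent (Clauset-Shalizi-Newman form)
    alpha = 1.0 + k / np.sum(np.log(tail / x_min))
    return alpha * np.log10(eigs[0])

# Example on a random layer; note heavy tails emerge in *trained* networks,
# so a Gaussian-init matrix is only a placeholder input here.
rng = np.random.default_rng(0)
W = rng.normal(size=(512, 256)) / np.sqrt(512)
print(f"alpha-hat = {alpha_hat(W):.3f}")
```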

Impact of classification difficulty on the weight matrices spectra in deep learning and application to early-stopping

X Meng, J Yao - The Journal of Machine Learning Research, 2023 - dl.acm.org
Much recent research effort has been devoted to explaining the success of deep learning.
Random Matrix Theory (RMT) offers an emerging route to this end by analyzing the …

Towards Scalable and Versatile Weight Space Learning

K Schürholt, MW Mahoney, D Borth - arXiv preprint arXiv:2406.09997, 2024 - arxiv.org
Learning representations of well-trained neural network models holds the promise of
providing insight into the inner workings of those models. However, previous work …

Hyper-Representations: Learning from Populations of Neural Networks

K Schürholt - arXiv preprint arXiv:2410.05107, 2024 - arxiv.org
This thesis addresses the challenge of understanding Neural Networks through the lens of
their most fundamental component: the weights, which encapsulate the learned information …

Using Pre-trained LLMs for Multivariate Time Series Forecasting

ML Wolff, S Yang, K Torkkola, MW Mahoney - arXiv preprint arXiv …, 2025 - arxiv.org
Pre-trained Large Language Models (LLMs) encapsulate large amounts of knowledge and
require enormous compute to train. We make use of this resource, together with the …
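
To make the "reuse a frozen pre-trained model" idea concrete, here is a generic, hypothetical sketch of the pattern, not this paper's architecture: trainable input and output layers wrap a frozen backbone, with a small TransformerEncoder standing in for the pre-trained LLM. All names, shapes, and dimensions are invented for illustration.

```python
import torch
import torch.nn as nn

class FrozenBackboneForecaster(nn.Module):
    """Hypothetical pattern: patch a multivariate series, embed into the
    backbone's space, run the frozen backbone, decode with a linear head."""
    def __init__(self, n_vars: int, patch_len: int, d_model: int = 256,
                 horizon: int = 24):
        super().__init__()
        self.patch_len, self.n_vars, self.horizon = patch_len, n_vars, horizon
        self.embed = nn.Linear(n_vars * patch_len, d_model)       # trainable
        self.backbone = nn.TransformerEncoder(                    # stand-in for
            nn.TransformerEncoderLayer(d_model, nhead=4,          # a pre-trained
                                       batch_first=True),         # LLM
            num_layers=2)
        for p in self.backbone.parameters():                      # keep frozen
            p.requires_grad = False
        self.head = nn.Linear(d_model, n_vars * horizon)          # trainable

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, time, n_vars); time must divide evenly into patches
        b, t, v = x.shape
        patches = x.reshape(b, t // self.patch_len, self.patch_len * v)
        h = self.backbone(self.embed(patches))
        out = self.head(h[:, -1])              # forecast from the last token
        return out.reshape(b, self.horizon, v)

model = FrozenBackboneForecaster(n_vars=3, patch_len=16)
y = model(torch.randn(8, 96, 3))               # -> torch.Size([8, 24, 3])
print(y.shape)
```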