Mechanistic mode connectivity

ES Lubana, EJ Bigelow, RP Dick… - International …, 2023 - proceedings.mlr.press
We study neural network loss landscapes through the lens of mode connectivity, the
observation that minimizers of neural networks retrieved via training on a dataset are …

Unleashing the power of data tsunami: A comprehensive survey on data assessment and selection for instruction tuning of language models

Y Qin, Y Yang, P Guo, G Li, H Shao, Y Shi, Z Xu… - arXiv preprint arXiv …, 2024 - arxiv.org
Instruction tuning plays a critical role in aligning large language models (LLMs) with human
preferences. Despite the vast number of open instruction datasets, naively training an LLM on …

T-MARS: Improving visual representations by circumventing text feature learning

P Maini, S Goyal, ZC Lipton, JZ Kolter… - arXiv preprint arXiv …, 2023 - arxiv.org
Large web-sourced multimodal datasets have powered a slew of new methods for learning
general-purpose visual representations, advancing the state of the art in computer vision …

Learning and forgetting unsafe examples in large language models

J Zhao, Z Deng, D Madras, J Zou, M Ren - arXiv preprint arXiv:2312.12736, 2023 - arxiv.org
As the number of large language models (LLMs) released to the public grows, there is a
pressing need to understand the safety implications associated with these models learning …

Beyond confidence: Reliable models should also consider atypicality

M Yuksekgonul, L Zhang, JY Zou… - Advances in Neural …, 2024 - proceedings.neurips.cc
While most machine learning models can provide confidence in their predictions, confidence
is insufficient to understand a prediction's reliability. For instance, the model may have a low …

Late stopping: Avoiding confidently learning from mislabeled examples

S Yuan, L Feng, T Liu - Proceedings of the IEEE/CVF …, 2023 - openaccess.thecvf.com
Sample selection is a prevalent method in learning with noisy labels, where small-loss data
are typically considered correctly labeled. However, this method may not effectively …

Can neural network memorization be localized?

P Maini, MC Mozer, H Sedghi, ZC Lipton… - arXiv preprint arXiv …, 2023 - arxiv.org
Recent efforts at explaining the interplay of memorization and generalization in deep
overparametrized networks have posited that neural networks memorize "hard" …

TRIAGE: Characterizing and auditing training data for improved regression

N Seedat, J Crabbé, Z Qian… - Advances in Neural …, 2023 - proceedings.neurips.cc
Data quality is crucial for robust machine learning algorithms, with the recent interest in data-
centric AI emphasizing the importance of training data characterization. However, current …

Early stopping against label noise without validation data

S Yuan, L Feng, T Liu - The Twelfth International Conference on …, 2024 - openreview.net
Early stopping methods in deep learning face the challenge of balancing the volume of
training and validation data, especially in the presence of label noise. Concretely, sparing …

Memorization through the lens of curvature of loss function around samples

I Garg, D Ravikumar, K Roy - arXiv preprint arXiv:2307.05831, 2023 - arxiv.org
Deep neural networks are over-parameterized and easily overfit the datasets they train on.
In the extreme case, it has been shown that these networks can memorize a training set with …