Methods for carefully selecting or generating a small set of training data to learn from, i.e., data pruning, coreset selection, and data distillation, have been shown to be effective in …
Neural networks are powerful machine learning models, but their reliability and trustworthiness are often questioned because of the opaque nature of their internally learned relationships. We explored …
We introduce MODEL SELECTOR, a framework for label-efficient selection of pretrained classifiers. Given a pool of unlabeled target data, MODEL SELECTOR samples a small …
In the era of exceptionally data-hungry models, careful selection of the training data is essential to mitigate the extensive costs of deep learning. Data pruning offers a solution by …
The amount of data available to train modern machine learning systems has been increasing rapidly, so much so that we are using, e.g., the entirety of the publicly available text data …
Deep learning has shown great success in several areas, including speech recognition, natural language processing, and computer vision, but its effectiveness significantly …