Temperature balancing, layer-wise weight analysis, and neural network training

Y Zhou, T Pang, K Liu… - Advances in Neural …, 2024 - proceedings.neurips.cc
Regularization in modern machine learning is crucial, and it can take various forms in
algorithmic design: training set, model family, error function, regularization terms, and …
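The title points at layer-wise spectral analysis of weights steering training. As a rough illustration of the ingredients (not the paper's exact algorithm: the Hill-style exponent estimate and the linear learning-rate rescaling below are assumptions), here is a sketch that estimates a power-law exponent per weight matrix and rebalances per-layer learning rates accordingly:

```python
import numpy as np
import torch.nn as nn

def hill_alpha(eigs: np.ndarray, k_frac: float = 0.1) -> float:
    """Hill estimator of the power-law tail exponent of an eigenspectrum.
    k_frac (how much of the upper tail to use) is an illustrative choice."""
    eigs = np.sort(eigs[eigs > 0])[::-1]
    k = max(int(len(eigs) * k_frac), 2)
    tail = eigs[:k]
    return 1.0 + k / max(np.sum(np.log(tail / tail[-1])), 1e-12)

def layer_alphas(model: nn.Module) -> dict:
    """Tail exponent of the empirical spectral density of W^T W, per layer."""
    alphas = {}
    for name, p in model.named_parameters():
        if p.ndim == 2:  # weight matrices only; skip biases, norms, etc.
            svals = np.linalg.svd(p.detach().cpu().numpy(), compute_uv=False)
            alphas[name] = hill_alpha(svals ** 2)  # eigenvalues of W^T W
    return alphas

def balanced_lrs(alphas: dict, base_lr: float = 1e-3) -> dict:
    """Heavier-tailed (smaller-alpha) layers get smaller learning rates,
    lighter-tailed layers larger ones; the mean stays near base_lr."""
    mean_a = float(np.mean(list(alphas.values())))
    return {name: base_lr * a / mean_a for name, a in alphas.items()}
```

The resulting dictionary maps naturally onto per-parameter-group learning rates via the optimizer's param-group mechanism.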

AI integration in construction safety: Current state, challenges, and future opportunities in text, vision, and audio based applications

ABK Rabbi, I Jeelani - Automation in Construction, 2024 - Elsevier
The high occupational injury and fatality rates in the construction industry are a serious global
concern. Recognizing AI as a solution to enhance safety performance, this study reviews …

AlphaPruning: Using heavy-tailed self-regularization theory for improved layer-wise pruning of large language models

H Lu, Y Zhou, S Liu, Z Wang, MW Mahoney… - arXiv preprint arXiv …, 2024 - arxiv.org
Recent work on pruning large language models (LLMs) has shown that one can eliminate a
large number of parameters without compromising performance, making pruning a …
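The title suggests replacing a uniform pruning ratio with per-layer ratios driven by heavy-tailed self-regularization metrics. A minimal sketch of one such allocation rule, assuming per-layer tail exponents `alphas` (e.g., estimated as in the earlier sketch) and a global sparsity budget; the linear mapping and its direction are illustrative assumptions, not necessarily the paper's rule:

```python
import numpy as np

def allocate_sparsity(alphas: dict, global_sparsity: float = 0.7,
                      lo: float = 0.5, hi: float = 0.9) -> dict:
    """Map per-layer tail exponents to per-layer sparsity in [lo, hi],
    then rescale so the mean matches the global budget (ignoring layer
    sizes for simplicity). Direction assumed here: heavier-tailed layers
    (small alpha, 'better trained') are pruned less."""
    names = list(alphas)
    a = np.array([alphas[n] for n in names])
    t = (a - a.min()) / (a.max() - a.min() + 1e-12)  # 0 = heaviest tail
    s = lo + t * (hi - lo)                           # light tails pruned more
    s *= global_sparsity / s.mean()
    return {n: float(np.clip(v, 0.0, 0.99)) for n, v in zip(names, s)}
```

Each layer's ratio would then feed any standard unstructured pruning criterion, such as magnitude pruning, in place of a single uniform ratio.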

Evaluating natural language processing models with generalization metrics that do not need access to any training or testing data

Y Yang, R Theisen, L Hodgkinson, JE Gonzalez… - arXiv preprint arXiv …, 2022 - arxiv.org
Selecting suitable architecture parameters and training hyperparameters is essential for
enhancing machine learning (ML) model performance. Several recent empirical studies …
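Such metrics are computed from the trained weights alone. As one concrete member of this family (a common norm-based choice in that literature, not necessarily the metric this paper recommends), here is a sketch that scores a checkpoint by the average log spectral norm of its weight matrices, touching no data:

```python
import torch
import torch.nn as nn

@torch.no_grad()
def avg_log_spectral_norm(model: nn.Module) -> float:
    """Average log of the largest singular value over all weight matrices.
    Needs only the checkpoint: no training or test data is touched."""
    logs = []
    for p in model.parameters():
        if p.ndim == 2:
            smax = torch.linalg.matrix_norm(p, ord=2)  # largest singular value
            logs.append(torch.log(smax))
    return torch.stack(logs).mean().item()

# Usage: rank candidate checkpoints by the metric alone, e.g.
# scores = {name: avg_log_spectral_norm(m) for name, m in checkpoints.items()}
```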

AlphaLoRA: Assigning LoRA Experts Based on Layer Training Quality

P Qing, C Gao, Y Zhou, X Diao, Y Yang… - arXiv preprint arXiv …, 2024 - arxiv.org
Parameter-efficient fine-tuning methods, such as Low-Rank Adaptation (LoRA), are known
to enhance training efficiency in Large Language Models (LLMs). Due to the limited …

Model balancing helps low-data training and fine-tuning

Z Liu, Y Hu, T Pang, Y Zhou, P Ren, Y Yang - arXiv preprint arXiv …, 2024 - arxiv.org
Recent advances in foundation models have emphasized the need to align pre-trained
models with specialized domains using small, curated datasets. Studies on these foundation …

How many validation labels do you need? Exploring the design space of label-efficient model ranking

Z Hu, J Zhang, Y Yu, Y Zhuang, H Xiong - arXiv preprint arXiv:2312.01619, 2023 - arxiv.org
This paper introduces LEMR, a framework that reduces annotation costs for model selection
tasks. The approach leverages ensemble methods to generate pseudo-labels, employs …
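The snippet names the core mechanism: ensemble pseudo-labels stand in for human annotations when ranking candidate models. A minimal sketch of that first pass (the majority-vote rule and the agreement score are illustrative; the full framework also decides which points to send for real annotation):

```python
import numpy as np

def majority_vote(preds: np.ndarray) -> np.ndarray:
    """preds: (n_models, n_examples) integer class predictions.
    Returns one pseudo-label per example by majority vote."""
    n_classes = preds.max() + 1
    counts = np.apply_along_axis(np.bincount, 0, preds, minlength=n_classes)
    return counts.argmax(axis=0)

def rank_models(preds: np.ndarray) -> np.ndarray:
    """Rank candidate models by agreement with the ensemble pseudo-labels;
    no human validation labels are needed for this first pass."""
    pseudo = majority_vote(preds)
    agreement = (preds == pseudo).mean(axis=1)  # per-model pseudo-accuracy
    return np.argsort(-agreement)               # best model first

# Example: 5 candidate models, 100 unlabeled validation points, 4 classes.
rng = np.random.default_rng(0)
preds = rng.integers(0, 4, size=(5, 100))
print(rank_models(preds))
```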

[PDF] Benchmarking with a Language Model Initial Selection for Text Classification Tasks

A Riyadi, M Kovacs, U Serdült… - Machine Learning and …, 2025 - researchgate.net
Globally recognized concerns about AI's environmental implications have led to a
growing awareness of the need to reduce AI carbon footprints, as well as to carry out AI …

OwLore: Outlier-weighed Layerwise Sampled Low-Rank Projection for Memory-Efficient LLM Fine-tuning

P Li, L Yin, X Gao, S Liu - arXiv preprint arXiv:2405.18380, 2024 - arxiv.org
The rapid advancements in Large Language Models (LLMs) have revolutionized various
natural language processing tasks. However, the substantial size of LLMs presents …

Crafting Heavy-Tails in Weight Matrix Spectrum without Gradient Noise

V Kothapalli, T Pang, S Deng, Z Liu, Y Yang - arXiv preprint arXiv …, 2024 - arxiv.org
Modern training strategies for deep neural networks (NNs) tend to induce heavy-tailed (HT)
spectra in layer weights. Extensive efforts to study this phenomenon have found that NNs …
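Here the HT spectrum refers to the empirical spectral density (ESD) of W^T W developing a power-law tail over training. A short sketch of how that spectrum is computed, evaluated on a Gaussian-initialized matrix as a baseline (the normalization and shapes are illustrative):

```python
import numpy as np

def esd(W: np.ndarray) -> np.ndarray:
    """Eigenvalues of the correlation matrix W^T W, sorted descending."""
    return np.sort(np.linalg.svd(W, compute_uv=False) ** 2)[::-1]

# At Gaussian initialization the ESD follows the Marchenko-Pastur law,
# a bulk with a sharp edge; after training, the largest eigenvalues
# typically decay like a power law (a straight line on log-log axes).
rng = np.random.default_rng(0)
W = rng.standard_normal((512, 512)) / np.sqrt(512)
eigs = esd(W)
print(eigs[:5])  # clustered near the bulk edge (~4 here); no heavy tail yet
```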