Temperature balancing, layer-wise weight analysis, and neural network training

Y Zhou, T Pang, K Liu… - Advances in Neural …, 2024 - proceedings.neurips.cc
Regularization in modern machine learning is crucial, and it can take various forms in
algorithmic design: training set, model family, error function, regularization terms, and …
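The title points at layer-wise spectral analysis of weights steering training. As a rough illustration of the ingredients (not the paper's exact algorithm: the Hill-style exponent estimate and the linear learning-rate rescaling below are assumptions), here is a sketch that estimates a power-law exponent per weight matrix and rebalances per-layer learning rates accordingly:

```python
import numpy as np
import torch.nn as nn

def hill_alpha(eigs: np.ndarray, k_frac: float = 0.1) -> float:
    """Hill estimator of the power-law tail exponent of an eigenspectrum.
    k_frac (how much of the upper tail to use) is an illustrative choice."""
    eigs = np.sort(eigs[eigs > 0])[::-1]
    k = max(int(len(eigs) * k_frac), 2)
    tail = eigs[:k]
    return 1.0 + k / max(np.sum(np.log(tail / tail[-1])), 1e-12)

def layer_alphas(model: nn.Module) -> dict:
    """Tail exponent of the empirical spectral density of W^T W, per layer."""
    alphas = {}
    for name, p in model.named_parameters():
        if p.ndim == 2:  # weight matrices only; skip biases, norms, etc.
            svals = np.linalg.svd(p.detach().cpu().numpy(), compute_uv=False)
            alphas[name] = hill_alpha(svals ** 2)  # eigenvalues of W^T W
    return alphas

def balanced_lrs(alphas: dict, base_lr: float = 1e-3) -> dict:
    """Heavier-tailed (smaller-alpha) layers get smaller learning rates,
    lighter-tailed layers larger ones; the mean stays near base_lr."""
    mean_a = float(np.mean(list(alphas.values())))
    return {name: base_lr * a / mean_a for name, a in alphas.items()}
```

The resulting dictionary maps naturally onto per-parameter-group learning rates via the optimizer's param-group mechanism.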

AI integration in construction safety: Current state, challenges, and future opportunities in text, vision, and audio based applications

ABK Rabbi, I Jeelani - Automation in Construction, 2024 - Elsevier
The high occupational injury and fatality rates in the construction industry are a serious global
concern. Recognizing AI as a solution to enhance safety performance, this study reviews …

AlphaPruning: Using heavy-tailed self-regularization theory for improved layer-wise pruning of large language models

H Lu, Y Zhou, S Liu, Z Wang, MW Mahoney… - arXiv preprint arXiv …, 2024 - arxiv.org
Recent work on pruning large language models (LLMs) has shown that one can eliminate a
large number of parameters without compromising performance, making pruning a …
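The title suggests replacing a uniform pruning ratio with per-layer ratios driven by heavy-tailed self-regularization metrics. A minimal sketch of one such allocation rule, assuming per-layer tail exponents `alphas` (e.g., estimated as in the earlier sketch) and a global sparsity budget; the linear mapping and its direction are illustrative assumptions, not necessarily the paper's rule:

```python
import numpy as np

def allocate_sparsity(alphas: dict, global_sparsity: float = 0.7,
                      lo: float = 0.5, hi: float = 0.9) -> dict:
    """Map per-layer tail exponents to per-layer sparsity in [lo, hi],
    then rescale so the mean matches the global budget (ignoring layer
    sizes for simplicity). Direction assumed here: heavier-tailed layers
    (small alpha, 'better trained') are pruned less."""
    names = list(alphas)
    a = np.array([alphas[n] for n in names])
    t = (a - a.min()) / (a.max() - a.min() + 1e-12)  # 0 = heaviest tail
    s = lo + t * (hi - lo)                           # light tails pruned more
    s *= global_sparsity / s.mean()
    return {n: float(np.clip(v, 0.0, 0.99)) for n, v in zip(names, s)}
```

Each layer's ratio would then feed any standard unstructured pruning criterion, such as magnitude pruning, in place of a single uniform ratio.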

Evaluating natural language processing models with generalization metrics that do not need access to any training or testing data

Y Yang, R Theisen, L Hodgkinson, JE Gonzalez… - arXiv preprint arXiv …, 2022 - arxiv.org
Selecting suitable architecture parameters and training hyperparameters is essential for
enhancing machine learning (ML) model performance. Several recent empirical studies …
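Such metrics are computed from the trained weights alone. As one concrete member of this family (a common norm-based choice in that literature, not necessarily the metric this paper recommends), here is a sketch that scores a checkpoint by the average log spectral norm of its weight matrices, touching no data:

```python
import torch
import torch.nn as nn

@torch.no_grad()
def avg_log_spectral_norm(model: nn.Module) -> float:
    """Average log of the largest singular value over all weight matrices.
    Needs only the checkpoint: no training or test data is touched."""
    logs = []
    for p in model.parameters():
        if p.ndim == 2:
            smax = torch.linalg.matrix_norm(p, ord=2)  # largest singular value
            logs.append(torch.log(smax))
    return torch.stack(logs).mean().item()

# Usage: rank candidate checkpoints by the metric alone, e.g.
# scores = {name: avg_log_spectral_norm(m) for name, m in checkpoints.items()}
```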

AlphaLoRA: Assigning LoRA Experts Based on Layer Training Quality

P Qing, C Gao, Y Zhou, X Diao, Y Yang… - arXiv preprint arXiv …, 2024 - arxiv.org
Parameter-efficient fine-tuning methods, such as Low-Rank Adaptation (LoRA), are known
to enhance training efficiency in Large Language Models (LLMs). Due to the limited …

Model balancing helps low-data training and fine-tuning

Z Liu, Y Hu, T Pang, Y Zhou, P Ren, Y Yang - arXiv preprint arXiv …, 2024 - arxiv.org
Recent advances in foundation models have emphasized the need to align pre-trained
models with specialized domains using small, curated datasets. Studies on these foundation …

How many validation labels do you need? Exploring the design space of label-efficient model ranking

Z Hu, J Zhang, Y Yu, Y Zhuang, H Xiong - arXiv preprint arXiv:2312.01619, 2023 - arxiv.org
This paper introduces LEMR, a framework that reduces annotation costs for model selection
tasks. The approach leverages ensemble methods to generate pseudo-labels, employs …
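The snippet names the core mechanism: ensemble pseudo-labels stand in for human annotations when ranking candidate models. A minimal sketch of that first pass (the majority-vote rule and the agreement score are illustrative; the full framework also decides which points to send for real annotation):

```python
import numpy as np

def majority_vote(preds: np.ndarray) -> np.ndarray:
    """preds: (n_models, n_examples) integer class predictions.
    Returns one pseudo-label per example by majority vote."""
    n_classes = preds.max() + 1
    counts = np.apply_along_axis(np.bincount, 0, preds, minlength=n_classes)
    return counts.argmax(axis=0)

def rank_models(preds: np.ndarray) -> np.ndarray:
    """Rank candidate models by agreement with the ensemble pseudo-labels;
    no human validation labels are needed for this first pass."""
    pseudo = majority_vote(preds)
    agreement = (preds == pseudo).mean(axis=1)  # per-model pseudo-accuracy
    return np.argsort(-agreement)               # best model first

# Example: 5 candidate models, 100 unlabeled validation points, 4 classes.
rng = np.random.default_rng(0)
preds = rng.integers(0, 4, size=(5, 100))
print(rank_models(preds))
```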

[PDF] Benchmarking with a Language Model Initial Selection for Text Classification Tasks

A Riyadi, M Kovacs, U Serdült… - Machine Learning and …, 2025 - researchgate.net
Globally recognized concerns about AI's environmental implications have led to a
growing awareness of the need to reduce AI carbon footprints, as well as to carry out AI …

OwLore: Outlier-weighed Layerwise Sampled Low-Rank Projection for Memory-Efficient LLM Fine-tuning

P Li, L Yin, X Gao, S Liu - arXiv preprint arXiv:2405.18380, 2024 - arxiv.org
The rapid advancements in Large Language Models (LLMs) have revolutionized various
natural language processing tasks. However, the substantial size of LLMs presents …

Crafting Heavy-Tails in Weight Matrix Spectrum without Gradient Noise

V Kothapalli, T Pang, S Deng, Z Liu, Y Yang - arXiv preprint arXiv …, 2024 - arxiv.org
Modern training strategies for deep neural networks (NNs) tend to induce heavy-tailed (HT)
spectra in layer weights. Extensive efforts to study this phenomenon have found that NNs …
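Here the HT spectrum refers to the empirical spectral density (ESD) of W^T W developing a power-law tail over training. A short sketch of how that spectrum is computed, evaluated on a Gaussian-initialized matrix as a baseline (the normalization and shapes are illustrative):

```python
import numpy as np

def esd(W: np.ndarray) -> np.ndarray:
    """Eigenvalues of the correlation matrix W^T W, sorted descending."""
    return np.sort(np.linalg.svd(W, compute_uv=False) ** 2)[::-1]

# At Gaussian initialization the ESD follows the Marchenko-Pastur law,
# a bulk with a sharp edge; after training, the largest eigenvalues
# typically decay like a power law (a straight line on log-log axes).
rng = np.random.default_rng(0)
W = rng.standard_normal((512, 512)) / np.sqrt(512)
eigs = esd(W)
print(eigs[:5])  # clustered near the bulk edge (~4 here); no heavy tail yet
```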