Approaching deep learning through the spectral dynamics of weights
We propose an empirical approach centered on the spectral dynamics of weights--the
behavior of singular values and vectors during optimization--to unify and clarify several …
Enhancing Fine-Tuning Efficiency of LLMs Through Gradient Subspace Tracking
S Rajabi, S Rambhatla - Adaptive Foundation Models: Evolving AI for … - openreview.net
Training and fine-tuning Large Language Models (LLMs) require substantial computational
resources and time due to their large model sizes and optimizer states. To address these …
Optimizing Fine-Tuning Efficiency: Gradient Subspace Tracking on Grassmann Manifolds for Large Language Models
S Rajabi, S Rambhatla - NeurIPS 2024 Workshop on Mathematics of … - openreview.net
Training and fine-tuning Large Language Models (LLMs) demand significant computational
resources and time due to their large model sizes and optimizer states. To mitigate these …
Memory-Efficient Large Language Model (LLM) Training and Fine-Tuning via Gradient Subspace Tracking
S Rajabi, S Rambhatla - OPT 2024: Optimization for Machine Learning - openreview.net
Training and fine-tuning Large Language Models (LLMs) is often highly resource- and time-intensive due to their large model sizes. To address this issue and improve accessibility …
Rank Minimization, Alignment and Weight Decay in Neural Networks
We empirically study the evolution of the singular values and vectors of neural network
weights across a wide variety of practical architectures and domains, including CNNs for …
Accelerating Memory-Efficient LLM Training and Fine-Tuning via Tracking the Gradient Subspace
S Rajabi, S Rambhatla - Workshop on Machine Learning and Compression … - openreview.net
Training and fine-tuning Large Language Models (LLMs) is often highly resource- and time-intensive due to their large model sizes. To address this issue and improve accessibility …