Approaching deep learning through the spectral dynamics of weights

D Yunis, KK Patel, S Wheeler, P Savarese… - arXiv preprint arXiv …, 2024 - arxiv.org
We propose an empirical approach centered on the spectral dynamics of weights--the
behavior of singular values and vectors during optimization--to unify and clarify several …

Enhancing Fine-Tuning Efficiency of LLMs Through Gradient Subspace Tracking

S Rajabi, S Rambhatla - Adaptive Foundation Models: Evolving AI for … - openreview.net
Training and fine-tuning Large Language Models (LLMs) require substantial computational
resources and time due to their large model sizes and optimizer states. To address these …

Optimizing Fine-Tuning Efficiency: Gradient Subspace Tracking on Grassmann Manifolds for Large Language Models

S Rajabi, S Rambhatla - NeurIPS 2024 Workshop on Mathematics of … - openreview.net
Training and fine-tuning Large Language Models (LLMs) demand significant computational
resources and time due to their large model sizes and optimizer states. To mitigate these …

Memory-Efficient Large Language Model (LLM) Training and Fine-Tuning via Gradient Subspace Tracking

S Rajabi, S Rambhatla - OPT 2024: Optimization for Machine Learning - openreview.net
Training and fine-tuning Large Language Models (LLMs) is often highly resource- and
time-intensive due to their large model sizes. To address this issue and improve accessibility …

Rank Minimization, Alignment and Weight Decay in Neural Networks

D Yunis, KK Patel, S Wheeler, PHP Savarese… - … Dynamics 2024: The … - openreview.net
We empirically study the evolution of the singular values and vectors of neural network
weights across a wide variety of practical architectures and domains, including CNNs for …

Accelerating Memory-Efficient LLM Training and Fine-Tuning via Tracking the Gradient Subspace

S Rajabi, S Rambhatla - Workshop on Machine Learning and Compression … - openreview.net
Training and fine-tuning Large Language Models (LLMs) is often highly resource- and
time-intensive due to their large model sizes. To address this issue and improve accessibility …