Approaching deep learning through the spectral dynamics of weights
We propose an empirical approach centered on the spectral dynamics of weights--the
behavior of singular values and vectors during optimization--to unify and clarify several …
Enhancing Fine-Tuning Efficiency of LLMs Through Gradient Subspace Tracking
S Rajabi, S Rambhatla - Adaptive Foundation Models: Evolving AI for … - openreview.net
Training and fine-tuning Large Language Models (LLMs) require substantial computational
resources and time due to their large model sizes and optimizer states. To address these …
Optimizing Fine-Tuning Efficiency: Gradient Subspace Tracking on Grassmann Manifolds for Large Language Models
S Rajabi, S Rambhatla - NeurIPS 2024 Workshop on Mathematics of … - openreview.net
Training and fine-tuning Large Language Models (LLMs) demand significant computational
resources and time due to their large model sizes and optimizer states. To mitigate these …
Memory-Efficient Large Language Model (LLM) Training and Fine-Tuning via Gradient Subspace Tracking
S Rajabi, S Rambhatla - OPT 2024: Optimization for Machine Learning - openreview.net
Training and fine-tuning Large Language Models (LLMs) is often highly resource- and time-intensive due to their large model sizes. To address this issue and improve accessibility …
Rank Minimization, Alignment and Weight Decay in Neural Networks
We empirically study the evolution of the singular values and vectors of neural network
weights across a wide variety of practical architectures and domains, including CNNs for …
Accelerating Memory-Efficient LLM Training and Fine-Tuning via Tracking the Gradient Subspace
S Rajabi, S Rambhatla - Workshop on Machine Learning and Compression … - openreview.net
Training and fine-tuning Large Language Models (LLMs) is often highly resource- and time-intensive due to their large model sizes. To address this issue and improve accessibility …