SK Kumar - arXiv preprint arXiv:2401.08426, 2024 - arxiv.org
This paper investigates how non-differentiability affects three different aspects of the neural network training process. We first analyze fully connected neural networks with ReLU …
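Since this snippet concerns ReLU networks, a minimal sketch of the non-differentiability at issue may help: ReLU has no derivative at zero, so any training procedure must pick a subgradient there. The `at_zero` parameter below is an illustrative assumption, not a detail taken from the paper.

```python
import numpy as np

def relu(x):
    """ReLU activation: max(x, 0); non-differentiable at x = 0."""
    return np.maximum(x, 0.0)

def relu_subgradient(x, at_zero=0.0):
    """Any value in [0, 1] is a valid subgradient of ReLU at 0;
    at_zero selects which one is used (a hypothetical knob)."""
    x = np.asarray(x, dtype=float)
    g = (x > 0).astype(float)
    g[x == 0.0] = at_zero
    return g
```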
We introduce a new algorithm, extended regularized dual averaging (XRDA), for solving regularized stochastic optimization problems, which generalizes the regularized dual …
In this dissertation, we first propose the xRDA algorithm with an adaptively weighted $\ell^1$-regularization scheme and momentum for training sparse neural networks. Then we …
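For orientation, the base method these two snippets extend is regularized dual averaging, where each step soft-thresholds a running average of past gradients. The sketch below, with an illustrative step-size schedule and quadratic proximal term, is a minimal rendering of plain RDA with $\ell^1$ regularization under those assumptions, not the xRDA algorithm from these works.

```python
import numpy as np

def rda_l1(grad_fn, dim, steps=1000, lam=0.1, gamma=1.0):
    """Regularized dual averaging (RDA) with l1 regularization.

    grad_fn(w, t): returns a stochastic gradient at iterate w.
    lam: l1 penalty strength; gamma: proximal term scaling.
    """
    w = np.zeros(dim)
    g_bar = np.zeros(dim)              # running mean of all gradients
    for t in range(1, steps + 1):
        g_bar += (grad_fn(w, t) - g_bar) / t
        # Closed-form minimizer of
        #   <g_bar, w> + lam*||w||_1 + (gamma / (2*sqrt(t))) * ||w||^2,
        # i.e. coordinate-wise soft-thresholding of the averaged gradient:
        shrunk = np.maximum(np.abs(g_bar) - lam, 0.0)
        w = -(np.sqrt(t) / gamma) * np.sign(g_bar) * shrunk
    return w

# Illustrative use: gradient of 0.5 * ||w - target||^2; the l1 term
# drives the coordinates of target that are zero exactly to zero.
target = np.array([2.0, 0.0, -3.0, 0.0])
w_hat = rda_l1(lambda w, t: w - target, dim=4, lam=0.5)
```

The closed-form update is what makes RDA attractive for sparsity: coordinates whose averaged gradient stays below the threshold `lam` are set exactly to zero rather than merely shrunk.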
Network pruning is a widely used technique for effectively compressing Deep Neural Networks with little to no degradation in performance during inference. Iterative Magnitude …
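To make the pruning step concrete, here is a minimal sketch of one round of global magnitude pruning; the global quantile threshold and in-place masking are illustrative choices, not this paper's specific procedure.

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude fraction of weights globally.

    weights: dict mapping layer name -> np.ndarray (modified in place).
    sparsity: fraction of weights to remove, in [0, 1).
    Returns boolean masks marking which weights survive.
    """
    all_mags = np.concatenate([np.abs(w).ravel() for w in weights.values()])
    threshold = np.quantile(all_mags, sparsity)   # global magnitude cutoff
    masks = {}
    for name, w in weights.items():
        masks[name] = np.abs(w) >= threshold
        w *= masks[name]                          # prune in place
    return masks
```

In iterative magnitude pruning, a prune step like this would alternate with retraining of the surviving weights, repeating until the target sparsity is reached.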