presented in~\cite{Bol18} are equivalent in terms of the convergence rates associated with
Stochastic Gradient Descent (SGD) methods if $\epsilon^2=\theta^2+\nu^2$ with specific
choices of $\theta$ and $\nu$. Here, $\epsilon$ controls the relative statistical error of the
norm of the gradient while $\theta$ and $\nu$ control the relative statistical error of the
gradient in the direction of the gradient and in the direction orthogonal to the gradient …