With a goal of understanding what drives generalization in deep networks, we consider several recently suggested explanations, including norm-based control, sharpness and …
Current methods for training robust networks lead to a drop in test accuracy, which has led prior works to posit that a robustness-accuracy tradeoff may be inevitable in deep learning …
A Virmaux, K Scaman - Advances in Neural Information …, 2018 - proceedings.neurips.cc
Deep neural networks are notorious for being sensitive to small well-chosen perturbations, and estimating the regularity of such architectures is of utmost importance for safe and …
F Bach - Journal of Machine Learning Research, 2017 - jmlr.org
We consider neural networks with a single hidden layer and non-decreasing positively homogeneous activation functions like the rectified linear units. By letting the number of …
M Zinkevich, M Weimer, L Li… - Advances in neural …, 2010 - proceedings.neurips.cc
With the increase in available data, parallel machine learning has become an increasingly pressing problem. In this paper we present the first parallel stochastic gradient descent …
O Chapelle, B Schölkopf, A Zien - IEEE Transactions on Neural …, 2009 - ieeexplore.ieee.org
This book addresses some theoretical aspects of semisupervised learning (SSL). The book is organized as a collection of different contributions of authors who are experts on this topic …
O Pele, M Werman - 2009 IEEE 12th international conference …, 2009 - ieeexplore.ieee.org
We present a new algorithm for a robust family of Earth Mover's Distances (EMDs) with thresholded ground distances. The algorithm transforms the flow-network of the EMD so that …
Given two probability measures, P and Q, defined on a measurable space, S, the integral probability metric (IPM) is defined as \gamma_{\mathcal{F}}(P, Q) = \sup\{ | \int_S f\, dP - \int_S f\, dQ | : f \in …
R Gao - Operations Research, 2023 - pubsonline.informs.org
Wasserstein distributionally robust optimization (DRO) aims to find robust and generalizable solutions by hedging against data perturbations in Wasserstein distance. Despite its recent …