Benign overfitting in linear classifiers and leaky ReLU networks from KKT conditions for margin maximization

S Frei, G Vardi, P Bartlett… - The Thirty Sixth Annual …, 2023 - proceedings.mlr.press
Linear classifiers and leaky ReLU networks trained by gradient flow on the logistic loss have
an implicit bias towards solutions which satisfy the Karush–Kuhn–Tucker (KKT) conditions …
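The implicit bias described in this abstract can be illustrated with a minimal sketch (my own toy example, not the paper's construction): gradient descent on the logistic loss over linearly separable data drives the weight norm to grow without bound while the direction stabilizes, consistent with convergence toward a margin-maximizing (KKT) direction.

```python
import numpy as np

# Toy sketch (illustrative, not the paper's setup): gradient descent on the
# logistic loss for a linear classifier over separable data. The iterate's
# norm keeps growing while its direction stabilizes, hinting at the implicit
# bias toward margin maximization discussed in the abstract.
X = np.array([[2.0, 1.0], [1.5, 2.0], [-2.0, -1.0], [-1.0, -2.5]])
y = np.array([1.0, 1.0, -1.0, -1.0])

w = np.zeros(2)
lr = 0.5
for _ in range(5000):
    margins = y * (X @ w)
    # gradient of the mean logistic loss log(1 + exp(-margin))
    grad = -(X.T @ (y / (1.0 + np.exp(margins)))) / len(y)
    w -= lr * grad

direction = w / np.linalg.norm(w)
print(np.linalg.norm(w), direction)
```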

The implicit bias of benign overfitting

O Shamir - Conference on Learning Theory, 2022 - proceedings.mlr.press
The phenomenon of benign overfitting, where a predictor perfectly fits noisy training data
while attaining low expected loss, has received much attention in recent years, but still …

What distributions are robust to indiscriminate poisoning attacks for linear learners?

F Suya, X Zhang, Y Tian… - Advances in neural …, 2024 - proceedings.neurips.cc
We study indiscriminate poisoning for linear learners where an adversary injects a few
crafted examples into the training data with the goal of forcing the induced model to incur …
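The attack model in this abstract can be sketched in a few lines (an illustrative construction of mine, not the paper's): inject a small number of high-leverage, label-flipped points into the training set of a simple linear learner (least squares here) and compare accuracy on the clean data before and after.

```python
import numpy as np

# Hedged sketch of indiscriminate poisoning against a linear learner.
# Least-squares classification is used as one simple linear learner; the
# poison placement and sizes are illustrative, not the paper's construction.
rng = np.random.default_rng(0)
n = 200
y = rng.choice([-1.0, 1.0], size=n)
X = np.column_stack([2.0 * y + rng.normal(size=n), rng.normal(size=n)])

def fit_and_score(Xtr, ytr):
    w, *_ = np.linalg.lstsq(Xtr, ytr, rcond=None)
    return np.mean(np.sign(X @ w) == y)   # always evaluate on clean data

acc_clean = fit_and_score(X, y)

# inject a few high-leverage, label-flipped points along the signal direction
k = 20
X_poisoned = np.vstack([X, np.tile([50.0, 0.0], (k, 1))])
y_poisoned = np.concatenate([y, -np.ones(k)])
acc_poisoned = fit_and_score(X_poisoned, y_poisoned)

print(acc_clean, acc_poisoned)
```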

Proxy convexity: A unified framework for the analysis of neural networks trained by gradient descent

S Frei, Q Gu - Advances in Neural Information Processing …, 2021 - proceedings.neurips.cc
Although the optimization objectives for learning neural networks are highly non-convex,
gradient-based methods have been wildly successful at learning neural networks in …

Learning a single neuron with adversarial label noise via gradient descent

I Diakonikolas, V Kontonis… - … on Learning Theory, 2022 - proceedings.mlr.press
We study the fundamental problem of learning a single neuron, i.e., a function of the form
$\mathbf{x}\mapsto\sigma(\mathbf{w}\cdot\mathbf{x})$ for monotone activations $\sigma:\mathbb{R}\to\mathbb{R}$, with …
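The single-neuron setting can be made concrete with a small sketch. Here a GLM-tron-style update (a classical approach for monotone activations, used for illustration and not necessarily the paper's algorithm) fits $\sigma(\mathbf{w}\cdot\mathbf{x})$ with a sigmoid as the monotone $\sigma$; the teacher, sizes, and step size are all illustrative choices.

```python
import numpy as np

# Hedged sketch: learning a single neuron x -> sigma(w . x) with a sigmoid
# as one example of a monotone activation. The update is GLM-tron-style
# (residual times input, ignoring sigma'), an illustrative classical method,
# not the paper's. Noiseless labels for simplicity.
rng = np.random.default_rng(1)
sigma = lambda z: 1.0 / (1.0 + np.exp(-z))

d, n = 5, 200
w_star = rng.normal(size=d)        # hypothetical target neuron
X = rng.normal(size=(n, d))
y = sigma(X @ w_star)

w = np.zeros(d)
lr = 1.0
for _ in range(2000):
    residual = sigma(X @ w) - y
    w -= lr * (X.T @ residual) / n

mse = np.mean((sigma(X @ w) - y) ** 2)
print(mse)
```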

Self-training converts weak learners to strong learners in mixture models

S Frei, D Zou, Z Chen, Q Gu - International Conference on …, 2022 - proceedings.mlr.press
We consider a binary classification problem when the data comes from a mixture of two
rotationally symmetric distributions satisfying concentration and anti-concentration …
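A minimal self-training loop in this mixture setting might look as follows (my own illustrative instance with a Gaussian mixture as one rotationally symmetric example; all sizes and noise levels are made up): a weak linear classifier pseudo-labels the data and is refit on its own predictions, and its accuracy improves toward that of the true direction.

```python
import numpy as np

# Hedged sketch of self-training in a two-component Gaussian mixture.
# The "weak learner" is the true direction plus a sizable perturbation;
# each round pseudo-labels the pool and refits via a class-mean estimator.
# All parameters are illustrative, not the paper's.
rng = np.random.default_rng(3)
d, n = 10, 2000
mu = np.zeros(d)
mu[0] = 1.5
y = rng.choice([-1.0, 1.0], size=n)
X = y[:, None] * mu + rng.normal(size=(n, d))

# weak initial classifier: true direction plus perturbation
w = mu / np.linalg.norm(mu) + 0.3 * rng.normal(size=d)

for _ in range(10):
    pseudo = np.sign(X @ w)                   # pseudo-label the pool
    w = (X * pseudo[:, None]).mean(axis=0)    # refit on pseudo-labels
    w /= np.linalg.norm(w)

acc = np.mean(np.sign(X @ w) == y)
print(acc)
```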

Provable generalization of SGD-trained neural networks of any width in the presence of adversarial label noise

S Frei, Y Cao, Q Gu - International Conference on Machine …, 2021 - proceedings.mlr.press
We consider a one-hidden-layer leaky ReLU network of arbitrary width trained by stochastic
gradient descent (SGD) following an arbitrary initialization. We prove that SGD produces …
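The training setup named in this abstract can be sketched directly (width, step size, and data distribution below are my illustrative choices, not the paper's): a one-hidden-layer leaky ReLU network with a fixed second layer, trained by one-sample-per-step SGD on the logistic loss over noisy two-cluster data.

```python
import numpy as np

# Hedged sketch of the abstract's setting: a one-hidden-layer leaky ReLU
# network trained by SGD on the logistic loss for binary classification.
# All hyperparameters and the data model are illustrative.
rng = np.random.default_rng(2)

def leaky_relu(z, alpha=0.1):
    return np.where(z > 0, z, alpha * z)

def leaky_relu_grad(z, alpha=0.1):
    return np.where(z > 0, 1.0, alpha)

d, m, n = 4, 16, 400                     # input dim, width, samples
mu = np.ones(d)
y = rng.choice([-1.0, 1.0], size=n)
X = y[:, None] * mu + 0.5 * rng.normal(size=(n, d))

W = 0.1 * rng.normal(size=(m, d))        # trained first layer
a = rng.choice([-1.0, 1.0], size=m) / np.sqrt(m)  # fixed second layer

lr = 0.1
for _ in range(3000):
    i = rng.integers(n)                  # SGD: one sample per step
    z = W @ X[i]
    f = a @ leaky_relu(z)
    g = -y[i] / (1.0 + np.exp(y[i] * f))         # logistic loss derivative
    W -= lr * g * (a * leaky_relu_grad(z))[:, None] * X[i][None, :]

acc = np.mean(np.sign(leaky_relu(X @ W.T) @ a) == y)
print(acc)
```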

Agnostic learnability of halfspaces via logistic loss

Z Ji, K Ahn, P Awasthi, S Kale… - … Conference on Machine …, 2022 - proceedings.mlr.press
We investigate approximation guarantees provided by logistic regression for the
fundamental problem of agnostic learning of homogeneous halfspaces. Previously, for a …

When Can Linear Learners be Robust to Indiscriminate Poisoning Attacks?

F Suya, X Zhang, Y Tian, D Evans - arXiv preprint arXiv:2307.01073, 2023 - arxiv.org
We study indiscriminate poisoning for linear learners where an adversary injects a few
crafted examples into the training data with the goal of forcing the induced model to incur …

The implicit bias of benign overfitting

O Shamir - Journal of Machine Learning Research, 2023 - jmlr.org
The phenomenon of benign overfitting, where a predictor perfectly fits noisy training data
while attaining near-optimal expected loss, has received much attention in recent years, but …