Learning ReLU networks on linearly separable data: Algorithm, optimality, and generalization

G Wang, GB Giannakis, J Chen - IEEE Transactions on Signal …, 2019 - ieeexplore.ieee.org
Neural networks with rectified linear unit (ReLU) activation functions (aka ReLU networks)
have achieved great empirical success in various domains. Nonetheless, existing results for …

Transformers from an optimization perspective

Y Yang, DP Wipf - Advances in Neural Information …, 2022 - proceedings.neurips.cc
Deep learning models such as the Transformer are often constructed by heuristics and
experience. To provide a complementary foundation, in this work we study the following …

Alternating minimizations converge to second-order optimal solutions

Q Li, Z Zhu, G Tang - International Conference on Machine …, 2019 - proceedings.mlr.press
This work studies the second-order convergence for both standard alternating minimization
and proximal alternating minimization. We show that under mild assumptions on the …

Private alternating least squares: Practical private matrix completion with tighter rates

S Chien, P Jain, W Krichene, S Rendle… - International …, 2021 - proceedings.mlr.press
We study the problem of differentially private (DP) matrix completion under user-level
privacy. We design a joint differentially private variant of the popular Alternating-Least …

Learning with logical constraints but without shortcut satisfaction

Z Li, Z Liu, Y Yao, J Xu, T Chen, X Ma, J Lü - arXiv preprint arXiv …, 2024 - arxiv.org
Recent studies in neuro-symbolic learning have explored the integration of logical
knowledge into deep learning via encoding logical constraints as an additional loss function …

L2T-DLN: learning to teach with dynamic loss network

Z Hai, L Pan, X Liu, Z Liu… - Advances in Neural …, 2024 - proceedings.neurips.cc
With the concept of teaching being introduced to the machine learning community, a teacher
model starts using dynamic loss functions to teach the training of a student model. The …

Hierarchical fuzzy neural networks with privacy preservation for heterogeneous big data

L Zhang, Y Shi, YC Chang… - IEEE Transactions on …, 2020 - ieeexplore.ieee.org
Heterogeneous big data poses many challenges in machine learning. Its enormous scale,
high dimensionality, and inherent uncertainty make almost every aspect of machine learning …

Finding second-order stationary points efficiently in smooth nonconvex linearly constrained optimization problems

S Lu, M Razaviyayn, B Yang… - Advances in Neural …, 2020 - proceedings.neurips.cc
This paper proposes two efficient algorithms for computing approximate second-order
stationary points (SOSPs) of problems with generic smooth non-convex objective functions …

MA2QL: A minimalist approach to fully decentralized multi-agent reinforcement learning

K Su, S Zhou, J Jiang, C Gan, X Wang, Z Lu - arXiv preprint arXiv …, 2022 - arxiv.org
Decentralized learning has shown great promise for cooperative multi-agent reinforcement
learning (MARL). However, non-stationarity remains a significant challenge in fully …

Linearized ADMM converges to second-order stationary points for non-convex problems

S Lu, JD Lee, M Razaviyayn… - IEEE Transactions on …, 2021 - ieeexplore.ieee.org
In this work, a gradient-based primal-dual method of multipliers is proposed for solving a
class of linearly constrained non-convex problems. We show that with random initialization …