Eighty years of the finite element method: Birth, evolution, and future

WK Liu, S Li, HS Park - Archives of Computational Methods in …, 2022 - Springer
This document presents a comprehensive historical account of the development of finite
element methods (FEM) since 1941, with a specific emphasis on developments related to …

Recent advances in deep learning theory

F He, D Tao - arXiv preprint arXiv:2012.10931, 2020 - arxiv.org
Deep learning is usually described as an experiment-driven field under continuous criticism
for lacking theoretical foundations. This problem has been partially addressed by a large volume of …

Modeling image composition for complex scene generation

Z Yang, D Liu, C Wang, J Yang… - Proceedings of the IEEE …, 2022 - openaccess.thecvf.com
We present a method that achieves state-of-the-art results on challenging (few-shot) layout-
to-image generation tasks by accurately modeling textures, structures and relationships …

Exact solutions of a deep linear network

L Ziyin, B Li, X Meng - Advances in Neural Information …, 2022 - proceedings.neurips.cc
This work finds the analytical expression of the global minima of a deep linear network with
weight decay and stochastic neurons, a fundamental model for understanding the …
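For context, a deep linear network with weight decay is conventionally the objective below, written in generic notation; the paper's exact formulation, including its stochastic neurons, may differ:

\[
  \min_{W_1,\dots,W_D}\; \mathbb{E}_{(x,y)}\,\big\| W_D W_{D-1} \cdots W_1\, x - y \big\|^2 \;+\; \lambda \sum_{i=1}^{D} \| W_i \|_F^2
\]

The work's contribution, per the snippet, is a closed-form characterization of the global minima of this kind of regularized product-of-matrices loss.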

Auto learning attention

B Ma, J Zhang, Y Xia, D Tao - Advances in neural …, 2020 - proceedings.neurips.cc
Attention modules have been shown to be effective in strengthening the representation
ability of a neural network by reweighting spatial or channel features or stacking both …
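As a concrete picture of channel reweighting, here is a minimal squeeze-and-excitation-style block in PyTorch; this is a standard illustrative module, not the automatically learned attention the paper proposes:

import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Squeeze-and-excitation-style channel reweighting (illustrative only;
    not the searched attention module from the paper)."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x has shape (batch, channels, height, width)
        b, c, _, _ = x.shape
        w = self.fc(x.mean(dim=(2, 3)))   # squeeze: global average pool per channel
        return x * w.view(b, c, 1, 1)     # excite: rescale each channel by its weight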

Understanding deep learning via decision boundary

S Lei, F He, Y Yuan, D Tao - IEEE Transactions on Neural …, 2023 - ieeexplore.ieee.org
This article finds that neural networks (NNs) with lower decision boundary (DB)
variability have better generalizability. Two new notions, algorithm DB variability and data DB …
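One hedged way to picture decision-boundary variability is the disagreement rate between classifiers trained from different random seeds; this is an illustrative proxy, not the paper's formal definitions of algorithm or data DB variability:

import numpy as np

def disagreement_rate(preds_a: np.ndarray, preds_b: np.ndarray) -> float:
    # Fraction of inputs on which two trained models predict different labels;
    # an informal proxy for decision-boundary variability, not the paper's metric.
    return float(np.mean(preds_a != preds_b))

# Hypothetical label predictions from two training runs with different seeds
run_a = np.array([0, 1, 1, 0, 1])
run_b = np.array([0, 1, 0, 0, 1])
print(disagreement_rate(run_a, run_b))  # 0.2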

Suboptimal local minima exist for wide neural networks with smooth activations

T Ding, D Li, R Sun - Mathematics of Operations Research, 2022 - pubsonline.informs.org
Does a large width eliminate all suboptimal local minima for neural nets? An affirmative
answer was given by a classic result published in 1995 for one-hidden-layer wide neural …
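The setting is the training loss of a one-hidden-layer network as its width m grows. The snippet does not spell out the model, so the standard form is assumed here:

\[
  f(x;\theta) \;=\; \sum_{j=1}^{m} a_j\, \sigma\!\big( w_j^{\top} x + b_j \big), \qquad \theta = (a_j, w_j, b_j)_{j=1}^{m},
\]

with \(\sigma\) a smooth activation; the question is whether taking m large removes every suboptimal local minimum of the training loss.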

Non-differentiable saddle points and sub-optimal local minima exist for deep ReLU networks

B Liu, Z Liu, T Zhang, T Yuan - Neural Networks, 2021 - Elsevier
Whether sub-optimal local minima and saddle points exist in the highly non-convex loss
landscape of deep neural networks has a great impact on the performance of optimization …

Non-convergence to global minimizers for Adam and stochastic gradient descent optimization and constructions of local minimizers in the training of artificial neural …

A Jentzen, A Riekert - arXiv preprint arXiv:2402.05155, 2024 - arxiv.org
Stochastic gradient descent (SGD) optimization methods such as the plain vanilla SGD
method and the popular Adam optimizer are nowadays the methods of choice in the training …
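The two optimizers named in the abstract are, in their textbook forms (with learning rate \(\eta\), stochastic gradient \(g_t\), and Adam's usual moment estimates):

\begin{align*}
  \text{SGD:}\quad  & \theta_{t+1} = \theta_t - \eta\, g_t, \qquad g_t \approx \nabla_\theta L(\theta_t),\\
  \text{Adam:}\quad & m_t = \beta_1 m_{t-1} + (1-\beta_1)\, g_t, \qquad v_t = \beta_2 v_{t-1} + (1-\beta_2)\, g_t^{2},\\
                    & \theta_{t+1} = \theta_t - \eta\, \frac{m_t/(1-\beta_1^{t})}{\sqrt{v_t/(1-\beta_2^{t})} + \epsilon}.
\end{align*}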

Understanding the loss landscape of one-hidden-layer ReLU networks

B Liu - Knowledge-Based Systems, 2021 - Elsevier
In this paper, it is proved that for one-hidden-layer ReLU networks all differentiable local
minima are global inside each differentiable region. Necessary and sufficient conditions for …
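The "differentiable regions" are the pieces of parameter space on which the ReLU activation pattern is fixed. With squared loss over data \((x_i, y_i)\) and a fixed pattern \(s_{ij}\), the loss is smooth on each such region (generic notation, assumed here for illustration):

\[
  L(\theta) \;=\; \sum_{i=1}^{n} \Big( \sum_{j=1}^{m} a_j\, s_{ij}\, w_j^{\top} x_i \;-\; y_i \Big)^{2}, \qquad s_{ij} = \mathbb{1}\big[ w_j^{\top} x_i > 0 \big] \ \text{constant on the region}.
\]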