Overview frequency principle/spectral bias in deep learning

ZQJ Xu, Y Zhang, T Luo - Communications on Applied Mathematics and …, 2024 - Springer
Understanding deep learning is increasingly emergent as it penetrates more and more into
industry and science. In recent years, a research line from Fourier analysis sheds light on …

Embedding principle: a hierarchical structure of loss landscape of deep neural networks

Y Zhang, Y Li, Z Zhang, T Luo, ZQJ Xu - arXiv preprint arXiv:2111.15527, 2021 - arxiv.org
We prove a general Embedding Principle of loss landscape of deep neural networks (NNs)
that unravels a hierarchical structure of the loss landscape of NNs, i.e., loss landscape of an …

Adaptive multi-scale neural network with resnet blocks for solving partial differential equations

M Chen, R Niu, W Zheng - Nonlinear Dynamics, 2023 - Springer
In this paper, an adaptive multi-scale neural network with Resnet blocks
(adaptive-MS-Resnet) architecture is constructed for solving the Poisson equation, Helmholtz equation …

Implicit regularization of dropout

Z Zhang, ZQJ Xu - IEEE Transactions on Pattern Analysis and …, 2024 - ieeexplore.ieee.org
It is important to understand how dropout, a popular regularization method, aids in achieving
a good generalization solution during neural network training. In this work, we present a …

Machine-learning-based identification for initial clustering structure in relativistic heavy-ion collisions

J He, WB He, YG Ma, S Zhang - Physical Review C, 2021 - APS
α-clustering structure is a significant topic in light nuclei. A Bayesian convolutional neural
network (BCNN) is applied to classify initial nonclustered and clustered configurations …

Subspace decomposition based DNN algorithm for elliptic type multi-scale PDEs

XA Li, ZQJ Xu, L Zhang - Journal of Computational Physics, 2023 - Elsevier
While deep learning algorithms demonstrate a great potential in scientific computing, its
application to multi-scale problems remains to be a big challenge. This is manifested by the …

Loss spike in training neural networks

X Li, ZQJ Xu, Z Zhang - arXiv preprint arXiv:2305.12133, 2023 - arxiv.org
In this work, we investigate the mechanism underlying loss spikes observed during neural
network training. When the training enters a region with a lower-loss-as-sharper (LLAS) …

Anchor function: a type of benchmark functions for studying language models

Z Zhang, Z Wang, J Yao, Z Zhou, X Li, ZQJ Xu - arXiv preprint arXiv …, 2024 - arxiv.org
Understanding transformer-based language models is becoming increasingly crucial,
particularly as they play pivotal roles in advancing towards artificial general intelligence …

Effective dynamics of generative adversarial networks

S Durr, Y Mroueh, Y Tu, S Wang - Physical Review X, 2023 - APS
Generative adversarial networks (GANs) are a class of machine-learning models that use
adversarial training to generate new samples with the same (potentially very complex) …

A non-gradient method for solving elliptic partial differential equations with deep neural networks

Y Peng, D Hu, ZQJ Xu - Journal of Computational Physics, 2023 - Elsevier
Deep learning has achieved wide success in solving Partial Differential Equations (PDEs),
with particular strength in handling high dimensional problems and parametric problems …