A linear frequency principle model to understand the absence of overfitting in neural networks

ZQJ Xu, Y Zhang, T Luo - Communications on Applied Mathematics and …, 2024 - Springer

Understanding deep learning is increasingly emergent as it penetrates more and more into
industry and science. In recent years, a research line from Fourier analysis sheds light on …

Cited by 67 · Related articles · All 3 versions

Embedding principle: a hierarchical structure of loss landscape of deep neural networks

Y Zhang, Y Li, Z Zhang, T Luo, ZQJ Xu - arXiv preprint arXiv:2111.15527, 2021 - arxiv.org

We prove a general Embedding Principle of loss landscape of deep neural networks (NNs)
that unravels a hierarchical structure of the loss landscape of NNs, i.e., loss landscape of an …

Cited by 30 · Related articles · All 7 versions

Adaptive multi-scale neural network with resnet blocks for solving partial differential equations

M Chen, R Niu, W Zheng - Nonlinear Dynamics, 2023 - Springer

In this paper, an adaptive multi-scale neural network with Resnet blocks (adaptive-MS-
Resnet) architecture is constructed for solving the Poisson equation, Helmholtz equation …

Cited by 23 · Related articles · All 3 versions

Implicit regularization of dropout

Z Zhang, ZQJ Xu - IEEE Transactions on Pattern Analysis and …, 2024 - ieeexplore.ieee.org

It is important to understand how dropout, a popular regularization method, aids in achieving
a good generalization solution during neural network training. In this work, we present a …

Cited by 21 · Related articles · All 11 versions

Machine-learning-based identification for initial clustering structure in relativistic heavy-ion collisions

J He, WB He, YG Ma, S Zhang - Physical Review C, 2021 - APS

α-clustering structure is a significant topic in light nuclei. A Bayesian convolutional neural
network (BCNN) is applied to classify initial nonclustered and clustered configurations …

Cited by 29 · Related articles · All 5 versions

Subspace decomposition based DNN algorithm for elliptic type multi-scale PDEs

XA Li, ZQJ Xu, L Zhang - Journal of Computational Physics, 2023 - Elsevier

While deep learning algorithms demonstrate a great potential in scientific computing, its
application to multi-scale problems remains to be a big challenge. This is manifested by the …

Cited by 25 · Related articles · All 7 versions

Loss spike in training neural networks

X Li, ZQJ Xu, Z Zhang - arXiv preprint arXiv:2305.12133, 2023 - arxiv.org

In this work, we investigate the mechanism underlying loss spikes observed during neural
network training. When the training enters a region with a lower-loss-as-sharper (LLAS) …

Cited by 7 · Related articles · All 5 versions

Anchor function: a type of benchmark functions for studying language models

Z Zhang, Z Wang, J Yao, Z Zhou, X Li, ZQJ Xu - arXiv preprint arXiv …, 2024 - arxiv.org

Understanding transformer-based language models is becoming increasingly crucial,
particularly as they play pivotal roles in advancing towards artificial general intelligence …

Cited by 4 · Related articles · All 3 versions

Effective dynamics of generative adversarial networks

S Durr, Y Mroueh, Y Tu, S Wang - Physical Review X, 2023 - APS

Generative adversarial networks (GANs) are a class of machine-learning models that use
adversarial training to generate new samples with the same (potentially very complex) …

Cited by 8 · Related articles · All 4 versions

A non-gradient method for solving elliptic partial differential equations with deep neural networks

Y Peng, D Hu, ZQJ Xu - Journal of Computational Physics, 2023 - Elsevier

Deep learning has achieved wide success in solving Partial Differential Equations (PDEs),
with particular strength in handling high dimensional problems and parametric problems …

Cited by 8 · Related articles · All 5 versions
