Zhongwang Zhang
Verified email at sjtu.edu.cn
Title
Cited by
Year
Embedding principle of loss landscape of deep neural networks
Y Zhang, Z Zhang, T Luo, ZJ Xu
Advances in Neural Information Processing Systems 34, 14848-14859, 2021
Cited by: 32, Year: 2021
Embedding principle: a hierarchical structure of loss landscape of deep neural networks
Y Zhang, Y Li, Z Zhang, T Luo, ZQJ Xu
Journal of Machine Learning, 2021
Cited by: 27, Year: 2021
Implicit regularization of dropout
Z Zhang, ZQJ Xu
IEEE Transactions on Pattern Analysis and Machine Intelligence, 2024
Cited by: 16*, Year: 2024
Linear stability hypothesis and rank stratification for nonlinear models
Y Zhang, Z Zhang, L Zhang, Z Bai, T Luo, ZQJ Xu
arXiv preprint arXiv:2211.11623, 2022
Cited by: 5, Year: 2022
Optimistic estimate uncovers the potential of nonlinear models
Y Zhang, Z Zhang, L Zhang, Z Bai, T Luo, ZQJ Xu
arXiv preprint arXiv:2307.08921, 2023
Cited by: 4, Year: 2023
Anchor function: a type of benchmark functions for studying language models
Z Zhang, Z Wang, J Yao, Z Zhou, X Li, ZQJ Xu
arXiv preprint arXiv:2401.08309, 2024
Cited by: 3, Year: 2024
Stochastic modified equations and dynamics of dropout algorithm
Z Zhang, Y Li, T Luo, ZQJ Xu
ICLR 2024, 2023
Cited by: 2, Year: 2023
Towards understanding how transformer perform multi-step reasoning with matching operation
Z Wang, Y Wang, Z Zhang, Z Zhou, H Jin, T Hu, J Sun, Z Li, Y Zhang, ...
arXiv preprint arXiv:2405.15302, 2024
Cited by: 1, Year: 2024
Initialization is Critical to Whether Transformers Fit Composite Functions by Inference or Memorizing
Z Zhang, P Lin, Z Wang, Y Zhang, ZQJ Xu
arXiv preprint arXiv:2405.05409, 2024
Cited by: 1, Year: 2024
Loss Spike in Training Neural Networks
Z Zhang, ZQJ Xu
arXiv preprint arXiv:2305.12133, 2023
Cited by: 1, Year: 2023
Loss Jump During Loss Switch in Solving PDEs with Neural Networks
Z Wang, L Zhang, Z Zhang, ZQJ Xu
arXiv preprint arXiv:2405.03095, 2024
Year: 2024