Zhihua Wu
Verified email at baidu.com
Title · Cited by · Year
Ernie 3.0: Large-scale knowledge enhanced pre-training for language understanding and generation
Y Sun, S Wang, S Feng, S Ding, C Pang, J Shang, J Liu, X Chen, Y Zhao, ...
arXiv preprint arXiv:2107.02137, 2021
382 · 2021
Ernie 3.0 titan: Exploring larger-scale knowledge enhanced pre-training for language understanding and generation
S Wang, Y Sun, Y Xiang, Z Wu, S Ding, W Gong, S Feng, J Shang, Y Zhao, ...
arXiv preprint arXiv:2112.12731, 2021
71 · 2021
Plato-xl: Exploring the large-scale pre-training of dialogue generation
S Bao, H He, F Wang, H Wu, H Wang, W Wu, Z Wu, Z Guo, H Lu, X Huang, ...
arXiv preprint arXiv:2109.09519, 2021
58 · 2021
Ascnet: Self-supervised video representation learning with appearance-speed consistency
D Huang, W Wu, W Hu, X Liu, D He, Z Wu, X Wu, M Tan, E Ding
Proceedings of the IEEE/CVF international conference on computer vision …, 2021
47 · 2021
Ernie-vilg: Unified generative pre-training for bidirectional vision-language generation
H Zhang, W Yin, Y Fang, L Li, B Duan, Z Wu, Y Sun, H Tian, H Wu, ...
arXiv preprint arXiv:2112.15283, 2021
46 · 2021
Heterps: Distributed deep learning with reinforcement learning based scheduling in heterogeneous environments
J Liu, Z Wu, D Feng, M Zhang, X Wu, X Yao, D Yu, Y Ma, F Zhao, D Dou
Future Generation Computer Systems 148, 106-117, 2023
32 · 2023
Ernie 3.0: Large-scale knowledge enhanced pre-training for language understanding and generation. arXiv 2021
Y Sun, S Wang, S Feng, S Ding, C Pang, J Shang, J Liu, X Chen, Y Zhao, ...
arXiv preprint arXiv:2107.02137, 2021
26 · 2021
Se-moe: A scalable and efficient mixture-of-experts distributed training and inference system
L Shen, Z Wu, WB Gong, H Hao, Y Bai, HC Wu, X Wu, J Bian, H Xiong, ...
arXiv preprint arXiv:2205.10034, 2022
24 · 2022
Helixfold: An efficient implementation of alphafold2 using paddlepaddle
G Wang, X Fang, Z Wu, Y Liu, Y Xue, Y Xiang, D Yu, F Wang, Y Ma
arXiv preprint arXiv:2207.05477, 2022
23 · 2022
Ta-moe: Topology-aware large scale mixture-of-expert training
C Chen, M Li, Z Wu, D Yu, C Yang
Advances in Neural Information Processing Systems 35, 22173-22186, 2022
9 · 2022
Boosting distributed training performance of the unpadded bert model
J Zeng, M Li, Z Wu, J Liu, Y Liu, D Yu, Y Ma
arXiv preprint arXiv:2208.08124, 2022
9 · 2022
PipePar: Enabling fast DNN pipeline parallel training in heterogeneous GPU clusters
J Zhang, G Niu, Q Dai, H Li, Z Wu, F Dong, Z Wu
Neurocomputing 555, 126661, 2023
7 · 2023
Nebula-I: A general framework for collaboratively training deep learning models on low-bandwidth cloud clusters
Y Xiang, Z Wu, W Gong, S Ding, X Mo, Y Liu, S Wang, P Liu, Y Hou, L Li, ...
arXiv preprint arXiv:2205.09470, 2022
6 · 2022
End-to-end adaptive distributed training on paddlepaddle
Y Ao, Z Wu, D Yu, W Gong, Z Kui, M Zhang, Z Ye, L Shen, Y Ma, T Wu, ...
arXiv preprint arXiv:2112.02752, 2021
6 · 2021
Addressing Heterogeneity in Federated Learning with Client Selection via Submodular Optimization
J Zhang, J Wang, Y Li, F Xin, F Dong, J Luo, Z Wu
ACM Transactions on Sensor Networks 20 (2), 1-32, 2024
2 · 2024
Recyclegpt: An autoregressive language model with recyclable module
Y Jiang, Q He, X Zhuang, Z Wu, K Wang, W Zhao, G Yang
arXiv preprint arXiv:2308.03421, 2023
2 · 2023
Graph4Rec: a universal toolkit with graph neural networks for recommender systems
W Li, M He, Z Huang, X Wang, S Feng, W Su, Y Sun
arXiv preprint arXiv:2112.01035, 2021
2 · 2021
Efficient AlphaFold2 Training using Parallel Evoformer and Branch Parallelism
G Wang, Z Wu, X Fang, Y Xiang, Y Liu, D Yu, Y Ma
arXiv preprint arXiv:2211.00235, 2022
1 · 2022
Method and apparatus of processing information, method and apparatus of recommending information, electronic device, and storage medium
M Cheng, YU Dianhai, L Ma, Z Wu, D Daxiang, W Tang
US Patent App. 17/517,703, 2022
1 · 2022
Method for distributed training model, relevant apparatus, and computer readable storage medium
X Wu, X Yao, YU Dianhai, Z Wu, Y Ma, T Wu, H Wang
US Patent App. 17/362,674, 2021
1 · 2021
Articles 1–20