ODE transformer: An ordinary differential equation-inspired model for sequence generation | B Li, Q Du, T Zhou, Y Jing, S Zhou, X Zeng, T Xiao, JB Zhu, X Liu, ... | arXiv preprint arXiv:2203.09176, 2022 | 22 | 2022 |
Learning multiscale transformer models for sequence generation | B Li, T Zheng, Y Jing, C Jiao, T Xiao, J Zhu | International Conference on Machine Learning, 13225-13241, 2022 | 15 | 2022 |
The NiuTrans machine translation systems for WMT21 | S Zhou, T Zhou, B Wei, Y Luo, Y Mu, Z Zhou, C Wang, X Zhou, C Lv, ... | arXiv preprint arXiv:2109.10485, 2021 | 5 | 2021 |