Title | Authors | Venue | Cited by | Year |
FastSpeech 2: Fast and high-quality end-to-end text to speech | Y Ren, C Hu, X Tan, T Qin, S Zhao, Z Zhao, TY Liu | arXiv preprint arXiv:2006.04558, 2020 | 1260 | 2020 |
FastSpeech: Fast, robust and controllable text to speech | Y Ren, Y Ruan, X Tan, T Qin, S Zhao, Z Zhao, TY Liu | Advances in neural information processing systems 32, 2019 | 1079 | 2019 |
Self-supervised spatiotemporal learning via video clip order prediction | D Xu, J Xiao, Z Zhao, J Shao, D Xie, Y Zhuang | Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2019 | 506 | 2019 |
Video question answering via gradually refined attention over appearance and motion | D Xu, Z Zhao, J Xiao, F Wu, H Zhang, X He, Y Zhuang | Proceedings of the 25th ACM international conference on Multimedia, 1645-1653, 2017 | 485 | 2017 |
Investigating capsule networks with dynamic routing for text classification | W Zhao, J Ye, M Yang, Z Lei, S Zhang, Z Zhao | arXiv preprint arXiv:1804.00538, 2018 | 484 | 2018 |
Improving automatic source code summarization via deep reinforcement learning | Y Wan, Z Zhao, M Yang, G Xu, H Ying, J Wu, PS Yu | Proceedings of the 33rd ACM/IEEE international conference on automated …, 2018 | 418 | 2018 |
Pseudo numerical methods for diffusion models on manifolds | L Liu, Y Ren, Z Lin, Z Zhao | arXiv preprint arXiv:2202.09778, 2022 | 398 | 2022 |
ActivityNet-QA: A dataset for understanding complex web videos via question answering | Z Yu, D Xu, J Yu, T Yu, Z Zhao, Y Zhuang, D Tao | Proceedings of the AAAI Conference on Artificial Intelligence 33 (01), 9127-9134, 2019 | 276 | 2019 |
Multilingual neural machine translation with knowledge distillation | X Tan, Y Ren, D He, T Qin, Z Zhao, TY Liu | arXiv preprint arXiv:1902.10461, 2019 | 253 | 2019 |
Cross-modal interaction networks for query-based moment retrieval in videos | Z Zhang, Z Lin, Z Zhao, Z Xiao | Proceedings of the 42nd International ACM SIGIR Conference on Research and …, 2019 | 218 | 2019 |
DiffSinger: Singing voice synthesis via shallow diffusion mechanism | J Liu, C Li, Y Ren, F Chen, Z Zhao | Proceedings of the AAAI conference on artificial intelligence 36 (10), 11020 …, 2022 | 212 | 2022 |
Expert finding for question answering via graph regularized matrix completion | Z Zhao, L Zhang, X He, W Ng | IEEE Transactions on Knowledge and Data Engineering 27 (4), 993-1004, 2014 | 194 | 2014 |
Multi-modal attention network learning for semantic source code retrieval | Y Wan, J Shu, Y Sui, G Xu, Z Zhao, J Wu, P Yu | 2019 34th IEEE/ACM International Conference on Automated Software …, 2019 | 176 | 2019 |
Make-An-Audio: Text-to-audio generation with prompt-enhanced diffusion models | R Huang, J Huang, D Yang, Y Ren, L Liu, M Li, Z Ye, J Liu, X Yin, Z Zhao | International Conference on Machine Learning, 13916-13932, 2023 | 159 | 2023 |
Hierarchical multi-label text classification: An attention-based recurrent network approach | W Huang, E Chen, Q Liu, Y Chen, Z Huang, Y Liu, Z Zhao, D Zhang, ... | Proceedings of the 28th ACM international conference on information and …, 2019 | 157 | 2019 |
Weakly-supervised video moment retrieval via semantic completion network | Z Lin, Z Zhao, Z Zhang, Q Wang, H Liu | Proceedings of the AAAI Conference on Artificial Intelligence 34 (07), 11539 …, 2020 | 136 | 2020 |
FastDiff: A fast conditional diffusion model for high-quality speech synthesis | R Huang, MWY Lam, J Wang, D Su, D Yu, Y Ren, Z Zhao | arXiv preprint arXiv:2204.09934, 2022 | 132 | 2022 |
Rethinking diversified and discriminative proposal generation for visual grounding | Z Yu, J Yu, C Xiang, Z Zhao, Q Tian, D Tao | arXiv preprint arXiv:1805.03508, 2018 | 129 | 2018 |
ProDiff: Progressive fast diffusion model for high-quality text-to-speech | R Huang, Z Zhao, H Liu, J Liu, C Cui, Y Ren | Proceedings of the 30th ACM International Conference on Multimedia, 2595-2605, 2022 | 124 | 2022 |
Multitask learning for cross-domain image captioning | M Yang, W Zhao, W Xu, Y Feng, Z Zhao, X Chen, K Lei | IEEE Transactions on Multimedia 21 (4), 1047-1061, 2018 | 124 | 2018 |