XLNet: Generalized autoregressive pretraining for language understanding Z Yang, Z Dai, Y Yang, J Carbonell, RR Salakhutdinov, QV Le Advances in neural information processing systems 32, 2019 | 10642* | 2019 |
Transformer-XL: Attentive language models beyond a fixed-length context Z Dai, Z Yang, Y Yang, J Carbonell, QV Le, R Salakhutdinov arXiv preprint arXiv:1901.02860, 2019 | 4110 | 2019 |
Revisiting semi-supervised learning with graph embeddings Z Yang, W Cohen, R Salakhutdinov International conference on machine learning, 40-48, 2016 | 2161 | 2016 |
HotpotQA: A dataset for diverse, explainable multi-hop question answering Z Yang, P Qi, S Zhang, Y Bengio, WW Cohen, R Salakhutdinov, ... arXiv preprint arXiv:1809.09600, 2018 | 1835 | 2018 |
GPT understands, too X Liu, Y Zheng, Z Du, M Ding, Y Qian, Z Yang, J Tang AI Open, 2023 | 1166* | 2023 |
GLM: General language model pretraining with autoregressive blank infilling Z Du, Y Qian, X Liu, M Ding, J Qiu, Z Yang, J Tang arXiv preprint arXiv:2103.10360, 2021 | 924 | 2021 |
P-tuning v2: Prompt tuning can be comparable to fine-tuning universally across scales and tasks X Liu, K Ji, Y Fu, WL Tam, Z Du, Z Yang, J Tang arXiv preprint arXiv:2110.07602, 2021 | 917 | 2021 |
Differentiable learning of logical rules for knowledge base reasoning F Yang, Z Yang, WW Cohen Advances in neural information processing systems 30, 2017 | 665 | 2017 |
Multi-task cross-lingual sequence tagging from scratch Z Yang, R Salakhutdinov, W Cohen arXiv preprint arXiv:1603.06270, 2016 | 637* | 2016 |
Good semi-supervised learning that requires a bad GAN Z Dai, Z Yang, F Yang, WW Cohen, RR Salakhutdinov Advances in neural information processing systems 30, 2017 | 561 | 2017 |
Gated-Attention Readers for Text Comprehension B Dhingra, H Liu, Z Yang, WW Cohen, R Salakhutdinov arXiv preprint arXiv:1606.01549, 2016 | 456 | 2016 |
Breaking the softmax bottleneck: A high-rank RNN language model Z Yang, Z Dai, R Salakhutdinov, WW Cohen arXiv preprint arXiv:1711.03953, 2017 | 403 | 2017 |
Review networks for caption generation Z Yang, Y Yuan, Y Wu, WW Cohen, RR Salakhutdinov Advances in neural information processing systems 29, 2016 | 396* | 2016 |
COSNET: Connecting heterogeneous social networks with local and global consistency Y Zhang, J Tang, Z Yang, J Pei, PS Yu Proceedings of the 21th ACM SIGKDD international conference on knowledge …, 2015 | 351 | 2015 |
Neural cross-lingual named entity recognition with minimal resources J Xie, Z Yang, G Neubig, NA Smith, J Carbonell arXiv preprint arXiv:1808.09861, 2018 | 204 | 2018 |
Semi-supervised QA with generative domain-adaptive nets Z Yang, J Hu, R Salakhutdinov, WW Cohen arXiv preprint arXiv:1702.02206, 2017 | 185 | 2017 |
CodeGeeX: A pre-trained model for code generation with multilingual evaluations on HumanEval-X Q Zheng, X Xia, X Zou, Y Dong, S Wang, Y Xue, Z Wang, L Shen, A Wang, ... arXiv preprint arXiv:2303.17568, 2023 | 151 | 2023 |
Linguistic knowledge as memory for recurrent neural networks B Dhingra, Z Yang, WW Cohen, R Salakhutdinov arXiv preprint arXiv:1703.02620, 2017 | 150* | 2017 |
Words or characters? fine-grained gating for reading comprehension Z Yang, B Dhingra, Y Yuan, J Hu, WW Cohen, R Salakhutdinov arXiv preprint arXiv:1611.01724, 2016 | 97 | 2016 |
FlipDA: Effective and robust data augmentation for few-shot learning J Zhou, Y Zheng, J Tang, J Li, Z Yang arXiv preprint arXiv:2108.06332, 2021 | 68 | 2021 |