| Title | Authors | Venue | Cited by | Year |
| --- | --- | --- | --- | --- |
| Random feature attention | H Peng, N Pappas, D Yogatama, R Schwartz, NA Smith, L Kong | arXiv preprint arXiv:2103.02143 | 307 | 2021 |
| A Dependency Parser for Tweets | L Kong, N Schneider, S Swayamdipta, A Bhatia, C Dyer, NA Smith | EMNLP 2014 | 295 | 2014 |
| DyNet: The dynamic neural network toolkit | G Neubig, C Dyer, Y Goldberg, A Matthews, W Ammar, A Anastasopoulos, ... | arXiv preprint arXiv:1701.03980 | 271 | 2017 |
| DiffuSeq: Sequence to sequence text generation with diffusion models | S Gong, M Li, J Feng, Z Wu, LP Kong | arXiv preprint arXiv:2210.08933 | 204 | 2022 |
| Episodic memory in lifelong language learning | C de Masson D'Autume, S Ruder, L Kong, D Yogatama | Advances in Neural Information Processing Systems 32 | 202 | 2019 |
| cosFormer: Rethinking softmax in attention | Z Qin, W Sun, H Deng, D Li, Y Wei, B Lv, J Yan, L Kong, Y Zhong | arXiv preprint arXiv:2202.08791 | 179 | 2022 |
| UnifiedSKG: Unifying and multi-tasking structured knowledge grounding with text-to-text language models | T Xie, CH Wu, P Shi, R Zhong, T Scholak, M Yasunaga, CS Wu, M Zhong, ... | arXiv preprint arXiv:2201.05966 | 164 | 2022 |
| What do recurrent neural network grammars learn about syntax? | A Kuncoro, M Ballesteros, L Kong, C Dyer, G Neubig, NA Smith | arXiv preprint arXiv:1611.05774 | 155 | 2016 |
| A contrastive framework for neural text generation | Y Su, T Lan, Y Wang, D Yogatama, L Kong, N Collier | Advances in Neural Information Processing Systems 35, 21548-21561 | 137 | 2022 |
| Segmental recurrent neural networks | L Kong, C Dyer, NA Smith | arXiv preprint arXiv:1511.06018 | 131 | 2015 |
| ZeroGen: Efficient zero-shot learning via dataset generation | J Ye, J Gao, Q Li, H Xu, J Feng, Z Wu, T Yu, L Kong | arXiv preprint arXiv:2202.07922 | 116 | 2022 |
| Adaptive semiparametric language models | D Yogatama, C de Masson d'Autume, L Kong | Transactions of the Association for Computational Linguistics 9, 362-373 | 102 | 2021 |
| Audio-visual segmentation | J Zhou, J Wang, J Zhang, W Sun, J Zhang, S Birchfield, D Guo, L Kong, ... | European Conference on Computer Vision, 386-403 | 93 | 2022 |
| Distilling an ensemble of greedy dependency parsers into one MST parser | A Kuncoro, M Ballesteros, L Kong, C Dyer, NA Smith | arXiv preprint arXiv:1609.07561 | 83 | 2016 |
| Segmental recurrent neural networks for end-to-end speech recognition | L Lu, L Kong, C Dyer, NA Smith, S Renals | arXiv preprint arXiv:1603.00223 | 77 | 2016 |
| Learning and evaluating general linguistic intelligence | D Yogatama, CM d'Autume, J Connor, T Kocisky, M Chrzanowski, L Kong, ... | arXiv preprint arXiv:1901.11373 | 70 | 2019 |
| A mutual information maximization perspective of language representation learning | L Kong, CM d'Autume, W Ling, L Yu, Z Dai, D Yogatama | arXiv preprint arXiv:1910.08350 | 69 | 2019 |
| Self-adaptive in-context learning: An information compression perspective for in-context example selection and ordering | Z Wu, Y Wang, J Ye, L Kong | arXiv preprint arXiv:2212.10375 | 68 | 2022 |
| Language models can see: Plugging visual controls in text generation | Y Su, T Lan, Y Liu, F Liu, D Yogatama, Y Wang, L Kong, N Collier | arXiv preprint arXiv:2205.02655 | 66 | 2022 |
| Compositional exemplars for in-context learning | J Ye, Z Wu, J Feng, T Yu, L Kong | International Conference on Machine Learning, 39818-39833 | 57 | 2023 |