Training language models to follow instructions with human feedback L Ouyang, J Wu, X Jiang, D Almeida, C Wainwright, P Mishkin, C Zhang, ... Advances in neural information processing systems 35, 27730-27744, 2022 | 7891 | 2022 |
Multi-agent actor-critic for mixed cooperative-competitive environments R Lowe, YI Wu, A Tamar, J Harb, P Abbeel, I Mordatch Advances in neural information processing systems 30, 2017 | 4990 | 2017 |
GPT-4 technical report J Achiam, S Adler, S Agarwal, L Ahmad, I Akkaya, FL Aleman, D Almeida, ... arXiv preprint arXiv:2303.08774, 2023 | 2652 | 2023 |
How not to evaluate your dialogue system: An empirical study of unsupervised evaluation metrics for dialogue response generation CW Liu, R Lowe, IV Serban, M Noseworthy, L Charlin, J Pineau arXiv preprint arXiv:1603.08023, 2016 | 1517 | 2016 |
Learning to summarize with human feedback N Stiennon, L Ouyang, J Wu, D Ziegler, R Lowe, C Voss, A Radford, ... Advances in Neural Information Processing Systems 33, 3008-3021, 2020 | 1327 | 2020 |
A hierarchical latent variable encoder-decoder model for generating dialogues I Serban, A Sordoni, R Lowe, L Charlin, J Pineau, A Courville, Y Bengio Proceedings of the AAAI conference on artificial intelligence 31 (1), 2017 | 1289 | 2017 |
The Ubuntu dialogue corpus: A large dataset for research in unstructured multi-turn dialogue systems R Lowe, N Pow, I Serban, J Pineau arXiv preprint arXiv:1506.08909, 2015 | 1126 | 2015 |
An actor-critic algorithm for sequence prediction D Bahdanau, P Brakel, K Xu, A Goyal, R Lowe, J Pineau, A Courville, ... arXiv preprint arXiv:1607.07086, 2016 | 710 | 2016 |
A survey of available corpora for building data-driven dialogue systems IV Serban, R Lowe, P Henderson, L Charlin, J Pineau arXiv preprint arXiv:1512.05742, 2015 | 435 | 2015 |
Towards an automatic Turing test: Learning to evaluate dialogue responses R Lowe, M Noseworthy, IV Serban, N Angelard-Gontier, Y Bengio, ... arXiv preprint arXiv:1708.07149, 2017 | 418 | 2017 |
The second conversational intelligence challenge (ConvAI2) E Dinan, V Logacheva, V Malykh, A Miller, K Shuster, J Urbanek, D Kiela, ... The NeurIPS'18 Competition: From Machine Learning to Intelligent …, 2020 | 370 | 2020 |
Recursively summarizing books with human feedback J Wu, L Ouyang, DM Ziegler, N Stiennon, R Lowe, J Leike, P Christiano arXiv preprint arXiv:2109.10862, 2021 | 215 | 2021 |
Training language models to follow instructions with human feedback, 2022 L Ouyang, J Wu, X Jiang, D Almeida, CL Wainwright, P Mishkin, C Zhang, ... URL https://arxiv.org/abs/2203.02155 13, 1, 2022 | 190 | 2022 |
Ethical challenges in data-driven dialogue systems P Henderson, K Sinha, N Angelard-Gontier, NR Ke, G Fried, R Lowe, ... Proceedings of the 2018 AAAI/ACM Conference on AI, Ethics, and Society, 123-129, 2018 | 186 | 2018 |
Training end-to-end dialogue systems with the Ubuntu dialogue corpus R Lowe, N Pow, IV Serban, L Charlin, CW Liu, J Pineau Dialogue & Discourse 8 (1), 31-65, 2017 | 184 | 2017 |
On the pitfalls of measuring emergent communication R Lowe, J Foerster, YL Boureau, J Pineau, Y Dauphin arXiv preprint arXiv:1903.05168, 2019 | 140 | 2019 |
Generative deep neural networks for dialogue: A short review IV Serban, R Lowe, L Charlin, J Pineau arXiv preprint arXiv:1611.06216, 2016 | 105 | 2016 |
Learning an unreferenced metric for online dialogue evaluation K Sinha, P Parthasarathi, J Wang, R Lowe, WL Hamilton, J Pineau arXiv preprint arXiv:2005.00583, 2020 | 92 | 2020 |
Training language models to follow instructions with human feedback L Ouyang, J Wu, X Jiang, D Almeida, CL Wainwright, P Mishkin, C Zhang, ... arXiv preprint arXiv:2203.02155, 2022 | 85 | 2022 |
On the evaluation of dialogue systems with next utterance classification R Lowe, IV Serban, M Noseworthy, L Charlin, J Pineau arXiv preprint arXiv:1605.05414, 2016 | 78 | 2016 |