ViLBERT: Pretraining task-agnostic visiolinguistic representations for vision-and-language tasks J Lu, D Batra, D Parikh, S Lee Advances in neural information processing systems 32, 2019 | 3497 | 2019 |
Graph R-CNN for scene graph generation J Yang, J Lu, S Lee, D Batra, D Parikh Proceedings of the European conference on computer vision (ECCV), 670-685, 2018 | 932 | 2018 |
Embodied question answering A Das, S Datta, G Gkioxari, S Lee, D Parikh, D Batra Proceedings of the IEEE conference on computer vision and pattern …, 2018 | 677 | 2018 |
Counterfactual visual explanations Y Goyal, Z Wu, J Ernst, D Batra, D Parikh, S Lee International Conference on Machine Learning, 2376-2384, 2019 | 544 | 2019 |
Diverse beam search: Decoding diverse solutions from neural sequence models AK Vijayakumar, M Cogswell, RR Selvaraju, Q Sun, S Lee, D Crandall, ... arXiv preprint arXiv:1610.02424, 2016 | 517 | 2016 |
12-in-1: Multi-task vision and language representation learning J Lu, V Goswami, M Rohrbach, D Parikh, S Lee Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2020 | 513 | 2020 |
Lending a hand: Detecting hands and recognizing activities in complex egocentric interactions S Bambach, S Lee, DJ Crandall, C Yu Proceedings of the IEEE international conference on computer vision, 1949-1957, 2015 | 496 | 2015 |
Learning cooperative visual dialog agents with deep reinforcement learning A Das, S Kottur, JMF Moura, S Lee, D Batra Proceedings of the IEEE international conference on computer vision, 2951-2960, 2017 | 458 | 2017 |
DD-PPO: Learning near-perfect pointgoal navigators from 2.5 billion frames E Wijmans, A Kadian, A Morcos, S Lee, I Essa, D Parikh, M Savva, ... arXiv preprint arXiv:1911.00357, 2019 | 409 | 2019 |
Why m heads are better than one: Training a diverse ensemble of deep networks S Lee, S Purushwalkam, M Cogswell, D Crandall, D Batra arXiv preprint arXiv:1511.06314, 2015 | 313 | 2015 |
Nocaps: Novel object captioning at scale H Agrawal, K Desai, Y Wang, X Chen, R Jain, M Johnson, D Batra, ... Proceedings of the IEEE/CVF international conference on computer vision …, 2019 | 254 | 2019 |
Taking a hint: Leveraging explanations to make vision and language models more grounded RR Selvaraju, S Lee, Y Shen, H Jin, S Ghosh, L Heck, D Batra, D Parikh Proceedings of the IEEE/CVF international conference on computer vision …, 2019 | 252 | 2019 |
Diverse beam search for improved description of complex scenes A Vijayakumar, M Cogswell, R Selvaraju, Q Sun, S Lee, D Crandall, ... Proceedings of the AAAI Conference on Artificial Intelligence 32 (1), 2018 | 237 | 2018 |
Overcoming language priors in visual question answering with adversarial regularization S Ramakrishnan, A Agrawal, S Lee Advances in Neural Information Processing Systems 31, 2018 | 232 | 2018 |
Natural language does not emerge 'naturally' in multi-agent dialog S Kottur, JMF Moura, S Lee, D Batra arXiv preprint arXiv:1706.08502, 2017 | 229 | 2017 |
Improving vision-and-language navigation with image-text pairs from the web A Majumdar, A Shrivastava, S Lee, P Anderson, D Parikh, D Batra Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23 …, 2020 | 214 | 2020 |
Beyond the nav-graph: Vision-and-language navigation in continuous environments J Krantz, E Wijmans, A Majumdar, D Batra, S Lee Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23 …, 2020 | 208 | 2020 |
Sim2real predictivity: Does evaluation in simulation predict real-world performance? A Kadian, J Truong, A Gokaslan, A Clegg, E Wijmans, S Lee, M Savva, ... IEEE Robotics and Automation Letters 5 (4), 6670-6677, 2020 | 202 | 2020 |
Stochastic multiple choice learning for training diverse deep ensembles S Lee, S Purushwalkam Shiva Prakash, M Cogswell, V Ranjan, ... Advances in Neural Information Processing Systems 29, 2016 | 201 | 2016 |
Audio visual scene-aware dialog H Alamri, V Cartillier, A Das, J Wang, A Cherian, I Essa, D Batra, TK Marks, ... Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2019 | 180 | 2019 |