12-in-1: Multi-task vision and language representation learning

J Lu, V Goswami, M Rohrbach… - Proceedings of the …, 2020 - openaccess.thecvf.com
Much of vision-and-language research focuses on a small but diverse set of independent
tasks and supporting datasets often studied in isolation; however, the visually-grounded …

[PDF] 12-in-1: Multi-Task Vision and Language Representation Learning

J Lu, V Goswami, M Rohrbach, D Parikh, S Lee - openaccess.thecvf.com
Much of vision-and-language research focuses on a small but diverse set of independent
tasks and supporting datasets often studied in isolation; however, the visually-grounded …

12-in-1: Multi-Task Vision and Language Representation Learning

J Lu, V Goswami, M Rohrbach, D Parikh… - 2020 IEEE/CVF …, 2020 - computer.org
Much of vision-and-language research focuses on a small but diverse set of independent
tasks and supporting datasets often studied in isolation; however, the visually-grounded …

12-in-1: Multi-Task Vision and Language Representation Learning

J Lu, V Goswami, M Rohrbach, D Parikh… - arXiv e …, 2019 - ui.adsabs.harvard.edu
Much of vision-and-language research focuses on a small but diverse set of independent
tasks and supporting datasets often studied in isolation; however, the visually-grounded …

12-in-1: Multi-Task Vision and Language Representation Learning

J Lu, V Goswami, M Rohrbach… - 2020 IEEE/CVF …, 2020 - ieeexplore.ieee.org
Much of vision-and-language research focuses on a small but diverse set of independent
tasks and supporting datasets often studied in isolation; however, the visually-grounded …

[PDF] 12-in-1: Multi-Task Vision and Language Representation Learning

J Lu, V Goswami, M Rohrbach, D Parikh, S Lee - scholar.archive.org
Much of vision-and-language research focuses on a small but diverse set of independent
tasks and supporting datasets often studied in isolation; however, the visually-grounded …

12-in-1: Multi-Task Vision and Language Representation Learning

J Lu, V Goswami, M Rohrbach, D Parikh… - arXiv preprint arXiv …, 2019 - arxiv.org
Much of vision-and-language research focuses on a small but diverse set of independent
tasks and supporting datasets often studied in isolation; however, the visually-grounded …