OpenFlamingo: An open-source framework for training large autoregressive vision-language models A Awadalla, I Gao, J Gardner, J Hessel, Y Hanafy, W Zhu, K Marathe, ... arXiv preprint arXiv:2308.01390, 2023 | 276 | 2023 |
DataComp: In search of the next generation of multimodal datasets SY Gadre, G Ilharco, A Fang, J Hayase, G Smyrnis, T Nguyen, R Marten, ... Advances in Neural Information Processing Systems 36, 2024 | 196 | 2024 |
OpenFlamingo A Awadalla, I Gao, J Gardner, J Hessel, Y Hanafy, W Zhu, K Marathe, ... Zenodo, March, 2023 | 42* | 2023 |
Breaking common sense: WHOOPS! A vision-and-language benchmark of synthetic and compositional images N Bitton-Guetta, Y Bitton, J Hessel, L Schmidt, Y Elovici, G Stanovsky, ... Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2023 | 35 | 2023 |
What you see is what you read? Improving text-image alignment evaluation M Yarom, Y Bitton, S Changpinyo, R Aharoni, J Herzig, O Lang, E Ofek, ... Advances in Neural Information Processing Systems 36, 2024 | 34 | 2024 |
VisIT-Bench: A benchmark for vision-language instruction following inspired by real-world use Y Bitton, H Bansal, J Hessel, R Shao, W Zhu, A Awadalla, J Gardner, ... arXiv preprint arXiv:2308.06595, 2023 | 33 | 2023 |
Automatic generation of contrast sets from scene graphs: Probing the compositional consistency of GQA Y Bitton, G Stanovsky, R Schwartz, M Elhadad NAACL 2021, 2021 | 30 | 2021 |
Data efficient masked language modeling for vision and language Y Bitton, G Stanovsky, M Elhadad, R Schwartz EMNLP 2021, Findings, 2021 | 22 | 2021 |
WinoGAViL: Gamified association benchmark to challenge vision-and-language models Y Bitton, NB Guetta, R Yosef, Y Elovici, M Bansal, G Stanovsky, ... NeurIPS 2022, Oral, Datasets and Benchmarks, 2022 | 18 | 2022 |
IRFL: Image recognition of figurative language R Yosef, Y Bitton, D Shahaf arXiv preprint arXiv:2303.15445, 2023 | 13 | 2023 |
VASR: Visual Analogies of Situation Recognition Y Bitton, R Yosef, E Strugo, D Shahaf, R Schwartz, G Stanovsky AAAI 2023 (Oral), 2022 | 11 | 2022 |
Cross-lingual Unified Medical Language System entity linking in online health communities Y Bitton, R Cohen, T Schifter, E Bachmat, M Elhadad, N Elhadad Journal of the American Medical Informatics Association 27 (10), 1585-1592, 2020 | 9 | 2020 |
DOCCI: Descriptions of Connected and Contrasting Images Y Onoe, S Rane, Z Berger, Y Bitton, J Cho, R Garg, A Ku, Z Parekh, ... arXiv preprint arXiv:2404.19753, 2024 | 6 | 2024 |
VideoCon: Robust video-language alignment via contrast captions H Bansal, Y Bitton, I Szpektor, KW Chang, A Grover Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2024 | 5 | 2024 |
Mismatch Quest: Visual and textual feedback for image-text misalignment B Gordon, Y Bitton, Y Shafir, R Garg, X Chen, D Lischinski, D Cohen-Or, ... arXiv preprint arXiv:2312.03766, 2023 | 4 | 2023 |
q2d: Turning questions into dialogs to teach models how to search Y Bitton, S Cohen-Ganor, I Hakimi, Y Lewenberg, R Aharoni, E Weinreb arXiv preprint arXiv:2304.14318, 2023 | 3 | 2023 |
DataComp-LM: In search of the next generation of training sets for language models J Li, A Fang, G Smyrnis, M Ivgi, M Jordan, S Gadre, H Bansal, E Guha, ... arXiv preprint arXiv:2406.11794, 2024 | 2 | 2024 |
Generating Coherent Sequences of Visual Illustrations for Real-World Manual Tasks J Bordalo, V Ramos, R Valério, D Glória-Silva, Y Bitton, M Yarom, ... arXiv preprint arXiv:2405.10122, 2024 | 2 | 2024 |
TALC: Time-Aligned Captions for Multi-Scene Text-to-Video Generation H Bansal, Y Bitton, M Yarom, I Szpektor, A Grover, KW Chang arXiv preprint arXiv:2405.04682, 2024 | 2 | 2024 |
ImageInWords: Unlocking Hyper-Detailed Image Descriptions R Garg, A Burns, BK Ayan, Y Bitton, C Montgomery, Y Onoe, A Bunner, ... arXiv preprint arXiv:2405.02793, 2024 | 2 | 2024 |