| Title | Authors | Venue | Cited by | Year |
| --- | --- | --- | --- | --- |
| An image is worth 16x16 words: Transformers for image recognition at scale | A Dosovitskiy, L Beyer, A Kolesnikov, D Weissenborn, X Zhai, ... | arXiv preprint arXiv:2010.11929, 2020 | 38739 | 2020 |
| Parameter-efficient transfer learning for NLP | N Houlsby, A Giurgiu, S Jastrzebski, B Morrone, Q De Laroussilhe, ... | International Conference on Machine Learning, 2790-2799, 2019 | 3123 | 2019 |
| MLP-Mixer: An all-MLP architecture for vision | IO Tolstikhin, N Houlsby, A Kolesnikov, L Beyer, X Zhai, T Unterthiner, ... | Advances in Neural Information Processing Systems 34, 24261-24272, 2021 | 2356 | 2021 |
| Scaling vision transformers | X Zhai, A Kolesnikov, N Houlsby, L Beyer | Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2022 | 1356 | 2022 |
| Big Transfer (BiT): General visual representation learning | A Kolesnikov, L Beyer, X Zhai, J Puigcerver, J Yung, S Gelly, N Houlsby | Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23 …, 2020 | 1215 | 2020 |
| Gemini: A family of highly capable multimodal models | Gemini Team, R Anil, S Borgeaud, Y Wu, JB Alayrac, J Yu, R Soricut, ... | arXiv preprint arXiv:2312.11805, 2023 | 919 | 2023 |
| Bayesian active learning for classification and preference learning | N Houlsby, F Huszár, Z Ghahramani, M Lengyel | arXiv preprint arXiv:1112.5745, 2011 | 824 | 2011 |
| Underspecification presents challenges for credibility in modern machine learning | A D'Amour, K Heller, D Moldovan, B Adlam, B Alipanahi, A Beutel, ... | Journal of Machine Learning Research 23 (226), 1-61, 2022 | 713 | 2022 |
| PaLI: A jointly-scaled multilingual language-image model | X Chen, X Wang, S Changpinyo, AJ Piergiovanni, P Padlewski, D Salz, ... | arXiv preprint arXiv:2209.06794, 2022 | 469 | 2022 |
| Scaling vision with sparse mixture of experts | C Riquelme, J Puigcerver, B Mustafa, M Neumann, R Jenatton, ... | Advances in Neural Information Processing Systems 34, 8583-8595, 2021 | 402 | 2021 |
| Self-supervised GANs via auxiliary rotation loss | T Chen, X Zhai, M Ritter, M Lucic, N Houlsby | Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2019 | 354 | 2019 |
| Scaling vision transformers to 22 billion parameters | M Dehghani, J Djolonga, B Mustafa, P Padlewski, J Heek, J Gilmer, ... | International Conference on Machine Learning, 7480-7512, 2023 | 329 | 2023 |
| Simple open-vocabulary object detection | M Minderer, A Gritsenko, A Stone, M Neumann, D Weissenborn, ... | European Conference on Computer Vision, 728-755, 2022 | 318 | 2022 |
| A large-scale study of representation learning with the visual task adaptation benchmark | X Zhai, J Puigcerver, A Kolesnikov, P Ruyssen, C Riquelme, M Lucic, ... | arXiv preprint arXiv:1910.04867, 2019 | 290 | 2019 |
| Revisiting the calibration of modern neural networks | M Minderer, J Djolonga, R Romijnders, F Hubis, X Zhai, N Houlsby, ... | Advances in Neural Information Processing Systems 34, 15682-15694, 2021 | 280 | 2021 |
| UL2: Unifying language learning paradigms | Y Tay, M Dehghani, VQ Tran, X Garcia, J Wei, X Wang, HW Chung, ... | arXiv preprint arXiv:2205.05131, 2022 | 199 | 2022 |
| Adaptive Bayesian quantum tomography | F Huszár, NMT Houlsby | Physical Review A—Atomic, Molecular, and Optical Physics 85 (5), 052120, 2012 | 194 | 2012 |
| Probabilistic matrix factorization with non-random missing data | JM Hernández-Lobato, N Houlsby, Z Ghahramani | International Conference on Machine Learning, 1512-1520, 2014 | 193 | 2014 |
| Ask the right questions: Active question reformulation with reinforcement learning | C Buck, J Bulian, M Ciaramita, W Gajewski, A Gesmundo, N Houlsby, ... | arXiv preprint arXiv:1705.07830, 2017 | 183 | 2017 |
| Collaborative Gaussian processes for preference learning | N Houlsby, F Huszár, Z Ghahramani, J Hernández-Lobato | Advances in Neural Information Processing Systems 25, 2012 | 154 | 2012 |