Transformers in medical imaging: A survey

F Shamshad, S Khan, SW Zamir, MH Khan… - Medical Image …, 2023 - Elsevier
Following unprecedented success on the natural language tasks, Transformers have been
successfully applied to several computer vision problems, achieving state-of-the-art results …

A survey on deep learning-based monocular spacecraft pose estimation: Current state, limitations and prospects

L Pauly, W Rharbaoui, C Shneider, A Rathinam… - Acta Astronautica, 2023 - Elsevier
Estimating the pose of an uncooperative spacecraft is an important computer vision problem
for enabling the deployment of automatic vision-based systems in orbit, with applications …

ClimaX: A foundation model for weather and climate

T Nguyen, J Brandstetter, A Kapoor, JK Gupta… - arXiv preprint arXiv …, 2023 - arxiv.org
Most state-of-the-art approaches for weather and climate modeling are based on physics-
informed numerical models of the atmosphere. These approaches aim to model the non …

Are transformers more robust than CNNs?

Y Bai, J Mei, AL Yuille, C Xie - Advances in neural …, 2021 - proceedings.neurips.cc
Transformer emerges as a powerful tool for visual recognition. In addition to demonstrating
competitive performance on a broad range of visual benchmarks, recent works also argue …

Video transformers: A survey

J Selva, AS Johansen, S Escalera… - … on Pattern Analysis …, 2023 - ieeexplore.ieee.org
Transformer models have shown great success handling long-range interactions, making
them a promising tool for modeling video. However, they lack inductive biases and scale …

Part-aware transformer for generalizable person re-identification

H Ni, Y Li, L Gao, HT Shen… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Domain generalization person re-identification (DG ReID) aims to train a model on
source domains and generalize well on unseen domains. Vision Transformer usually yields …

A survey of the vision transformers and their CNN-transformer based variants

A Khan, Z Rauf, A Sohail, AR Khan, H Asif… - Artificial Intelligence …, 2023 - Springer
Vision transformers have become popular as a possible substitute to convolutional neural
networks (CNNs) for a variety of computer vision applications. These transformers, with their …

Delving into masked autoencoders for multi-label thorax disease classification

J Xiao, Y Bai, A Yuille, Z Zhou - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Vision Transformer (ViT) has become one of the most popular neural architectures
due to its simplicity, scalability, and compelling performance in multiple vision tasks …

An impartial take to the CNN vs transformer robustness contest

F Pinto, PHS Torr, PK Dokania - European Conference on Computer …, 2022 - Springer
Following the surge of popularity of Transformers in Computer Vision, several studies have
attempted to determine whether they could be more robust to distribution shifts and provide …

A closer look at the robustness of contrastive language-image pre-training (CLIP)

W Tu, W Deng, T Gedeon - Advances in Neural Information …, 2024 - proceedings.neurips.cc
Contrastive Language-Image Pre-training (CLIP) models have demonstrated
remarkable generalization capabilities across multiple challenging distribution shifts …