The complex nature of combining localization and classification in object detection has led to a flourishing of methods. Previous works tried to improve the …
This paper presents a new vision Transformer, called Swin Transformer, that capably serves as a general-purpose backbone for computer vision. Challenges in adapting Transformer …
Although convolutional neural networks (CNNs) have achieved great success in computer vision, this work investigates a simpler, convolution-free backbone network useful for many …
Recently, Vision Transformer and its variants have shown great promise on various computer vision tasks. The ability to capture short- and long-range visual dependencies …
In this paper, we present a novel Dynamic DETR (Detection with Transformers) approach by introducing dynamic attentions into both the encoder and decoder stages of DETR to break …
We present DetCo, a simple yet effective self-supervised approach for object detection. Unsupervised pre-training methods have been recently designed for object detection, but …
Y Fang, S Yang, X Wang, Y Li, C Fang… - Proceedings of the …, 2021 - openaccess.thecvf.com
We present QueryInst, a new perspective for instance segmentation. QueryInst is a multi-stage end-to-end system that treats instances of interest as learnable queries, enabling …
Multi-object tracking (MOT) is an important problem in computer vision which has a wide range of applications. Formulating MOT as multi-task learning of object detection and re-ID …
This work presents a simple vision transformer design as a strong baseline for object localization and instance segmentation tasks. Transformers recently demonstrate …