A review on 2D instance segmentation based on deep neural networks

W Gu, S Bai, L Kong - Image and Vision Computing, 2022 - Elsevier

Image instance segmentation involves labeling pixels of images with classes and instances,
which is one of the pivotal technologies in many domains, such as natural scenes …

Cited by 122 · Related articles · All 2 versions

Making images real again: A comprehensive survey on deep image composition

L Niu, W Cong, L Liu, Y Hong, B Zhang, J Liang… - arXiv preprint arXiv …, 2021 - arxiv.org

As a common image editing operation, image composition aims to combine the foreground
from one image and another background image, resulting in a composite image. However …

Cited by 74 · Related articles · All 2 versions

Vision transformer adapter for dense predictions

Z Chen, Y Duan, W Wang, J He, T Lu, J Dai… - arXiv preprint arXiv …, 2022 - arxiv.org

This work investigates a simple yet powerful adapter for Vision Transformer (ViT). Unlike
recent visual transformers that introduce vision-specific inductive biases into their …

Cited by 427 · Related articles · All 3 versions

Maxvit: Multi-axis vision transformer

Z Tu, H Talebi, H Zhang, F Yang, P Milanfar… - European conference on …, 2022 - Springer

Transformers have recently gained significant attention in the computer vision community.
However, the lack of scalability of self-attention mechanisms with respect to image size has …

Cited by 498 · Related articles · All 8 versions

Bidirectional copy-paste for semi-supervised medical image segmentation

Y Bai, D Chen, Q Li, W Shen… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com

In semi-supervised medical image segmentation, there exist empirical mismatch problems
between labeled and unlabeled data distribution. The knowledge learned from the labeled …

Cited by 84 · Related articles · All 5 versions

Swin transformer: Hierarchical vision transformer using shifted windows

Z Liu, Y Lin, Y Cao, H Hu, Y Wei… - Proceedings of the …, 2021 - openaccess.thecvf.com

This paper presents a new vision Transformer, called Swin Transformer, that capably serves
as a general-purpose backbone for computer vision. Challenges in adapting Transformer …

Cited by 19981 · Related articles · All 9 versions

Focal self-attention for local-global interactions in vision transformers

J Yang, C Li, P Zhang, X Dai, B Xiao, L Yuan… - arXiv preprint arXiv …, 2021 - arxiv.org

Recently, Vision Transformer and its variants have shown great promise on various
computer vision tasks. The ability of capturing short-and long-range visual dependencies …

Cited by 423 · Related articles · All 2 versions

Simple copy-paste is a strong data augmentation method for instance segmentation

G Ghiasi, Y Cui, A Srinivas, R Qian… - Proceedings of the …, 2021 - openaccess.thecvf.com

Building instance segmentation models that are data-efficient and can handle rare object
categories is an important challenge in computer vision. Leveraging data augmentations is a …

Cited by 1025 · Related articles · All 7 versions

Improved YOLOv5 network for real-time multi-scale traffic sign detection

J Wang, Y Chen, Z Dong, M Gao - Neural Computing and Applications, 2023 - Springer

Traffic sign detection is a challenging task for the unmanned driving system, especially for
the detection of multi-scale targets and the real-time problem of detection. In the traffic sign …

Cited by 186 · Related articles · All 8 versions

Focal attention for long-range interactions in vision transformers

J Yang, C Li, P Zhang, X Dai, B Xiao… - Advances in Neural …, 2021 - proceedings.neurips.cc

Recently, Vision Transformer and its variants have shown great promise on various
computer vision tasks. The ability to capture local and global visual dependencies through …

Cited by 129 · Related articles · All 7 versions
