Recent advances in vision transformer: A survey and outlook of recent work

K Islam - arXiv preprint arXiv:2203.01536, 2022 - arxiv.org
Vision Transformers (ViTs) are becoming an increasingly popular and dominant technique for
various vision tasks, compared to Convolutional Neural Networks (CNNs). As a demanding …

TransU-Net++: Rethinking attention gated TransU-Net for deforestation mapping

A Jamali, SK Roy, J Li, P Ghamisi - International Journal of Applied Earth …, 2023 - Elsevier
Deforestation has become a major cause of climate change, and as a result, both
characterizing the drivers and estimating segmentation maps of deforestation have piqued …

Dual cross-attention learning for fine-grained visual categorization and object re-identification

H Zhu, W Ke, D Li, J Liu, L Tian… - Proceedings of the …, 2022 - openaccess.thecvf.com
Recently, self-attention mechanisms have shown impressive performance in various NLP
and CV tasks, which can help capture sequential characteristics and derive global …

Generative prompt model for weakly supervised object localization

Y Zhao, Q Ye, W Wu, C Shen… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Weakly supervised object localization (WSOL) remains challenging when learning object
localization models from image category labels. Conventional methods that discriminatively …

TransIFC: Invariant cues-aware feature concentration learning for efficient fine-grained bird image classification

H Liu, C Zhang, Y Deng, B Xie, T Liu… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Fine-grained bird image classification (FBIC) is not only meaningful for endangered bird
observation and protection but also a prevalent task for image classification in multimedia …

Feature fusion vision transformer for fine-grained visual categorization

J Wang, X Yu, Y Gao - arXiv preprint arXiv:2107.02341, 2021 - arxiv.org
The core of tackling fine-grained visual categorization (FGVC) is to learn subtle yet
discriminative features. Most previous works achieve this by explicitly selecting the …

TransMix: Attend to mix for vision transformers

JN Chen, S Sun, J He, PHS Torr… - Proceedings of the …, 2022 - openaccess.thecvf.com
Mixup-based augmentation has been found to be effective for generalizing models during
training, especially for Vision Transformers (ViTs) since they can easily overfit. However …

Learning bottleneck concepts in image classification

B Wang, L Li, Y Nakashima… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Interpreting and explaining the behavior of deep neural networks is critical for many tasks.
Explainable AI provides a way to address this challenge, mostly by providing per-pixel …

Which tokens to use? Investigating token reduction in vision transformers

JB Haurum, S Escalera, GW Taylor… - Proceedings of the …, 2023 - openaccess.thecvf.com
Since the introduction of the Vision Transformer (ViT), researchers have sought to make ViTs
more efficient by removing redundant information in the processed tokens. While different …

ViT-NeT: Interpretable vision transformers with neural tree decoder

S Kim, J Nam, BC Ko - International conference on machine …, 2022 - proceedings.mlr.press
Vision transformers (ViTs), which have demonstrated a state-of-the-art performance in image
classification, can also visualize global interpretations through attention-based contributions …