FLatten Transformer: Vision transformer using focused linear attention

D Han, X Pan, Y Han, S Song… - Proceedings of the …, 2023 - openaccess.thecvf.com

The quadratic computation complexity of self-attention has been a persistent challenge
when applying Transformer models to vision tasks. Linear attention, on the other hand, offers …

Cited by 81 Related articles All 5 versions

Adaptive rotated convolution for rotated object detection

Y Pu, Y Wang, Z Xia, Y Han, Y Wang… - Proceedings of the …, 2023 - openaccess.thecvf.com

Rotated object detection aims to identify and locate objects in images with arbitrary
orientation. In this scenario, the oriented directions of objects vary considerably across …

Cited by 51 Related articles All 6 versions

Rank-DETR for high quality object detection

Y Pu, W Liang, Y Hao, Y Yuan… - Advances in …, 2024 - proceedings.neurips.cc

Modern detection transformers (DETRs) use a set of object queries to predict a list of
bounding boxes, sort them by their classification confidence scores, and select the top …

Cited by 22 Related articles All 5 versions

Degradation-resistant unfolding network for heterogeneous image fusion

C He, K Li, G Xu, Y Zhang, R Hu… - Proceedings of the …, 2023 - openaccess.thecvf.com

Heterogeneous image fusion (HIF) techniques aim to enhance image quality by merging
complementary information from images captured by different sensors. Among these …

Cited by 21 Related articles All 4 versions

GSVA: Generalized segmentation via multimodal large language models

Z Xia, D Han, Y Han, X Pan, S Song… - Proceedings of the …, 2024 - openaccess.thecvf.com

Generalized Referring Expression Segmentation (GRES) extends the scope of
classic RES to refer to multiple objects in one expression or identify the empty targets absent …

Cited by 8 Related articles All 3 versions

Mask grounding for referring image segmentation

YX Chng, H Zheng, Y Han, X Qiu… - Proceedings of the …, 2024 - openaccess.thecvf.com

Referring Image Segmentation (RIS) is a challenging task that requires an
algorithm to segment objects referred by free-form language expressions. Despite significant …

Cited by 4 Related articles All 2 versions

Fine-grained recognition with learnable semantic data augmentation

Y Pu, Y Han, Y Wang, J Feng, C Deng… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org

Fine-grained image recognition is a longstanding computer vision challenge that focuses on
differentiating objects belonging to multiple subordinate categories within the same meta …

Cited by 11 Related articles All 6 versions

EfficientTrain: Exploring generalized curriculum learning for training visual backbones

Y Wang, Y Yue, R Lu, T Liu, Z Zhong… - Proceedings of the …, 2023 - openaccess.thecvf.com

The superior performance of modern deep networks usually comes with a costly training
procedure. This paper presents a new curriculum learning approach for the efficient training …

Cited by 20 Related articles All 6 versions

Deep incubation: Training large models by divide-and-conquering

Z Ni, Y Wang, J Yu, H Jiang, Y Cao… - Proceedings of the …, 2023 - openaccess.thecvf.com

Recent years have witnessed a remarkable success of large deep learning models.
However, training these models is challenging due to high computational costs, painfully …

Cited by 7 Related articles All 6 versions

Agent attention: On the integration of softmax and linear attention

D Han, T Ye, Y Han, Z Xia, S Song, G Huang - arXiv preprint arXiv …, 2023 - arxiv.org

The attention module is the key component in Transformers. While the global attention
mechanism offers high expressiveness, its excessive computational cost restricts its …

Cited by 13 Related articles All 2 versions
