Representation and recognition in vision - Academic Resource Search

Deep high-resolution representation learning for visual recognition

J Wang, K Sun, T Cheng, B Jiang… - IEEE transactions on …, 2020 - ieeexplore.ieee.org

… representations are essential for position-sensitive vision … representation through a
subnetwork that is formed by … high-resolution representation from the encoded low-resolution …

Cited by 3518 · Related articles · All 10 versions

[PDF] thecvf.com

Towards universal representation learning for deep face recognition

Y Shi, X Yu, K Sohn, M Chandraker… - … pattern recognition, 2020 - openaccess.thecvf.com

… Instead, we propose a universal representation learning face recognition framework, URFace,
that can deal with larger variations unseen in the given training data, without leveraging …

Cited by 170 · Related articles · All 8 versions

[PDF] arxiv.org

Visual transformers: Token-based image representation and processing for computer vision

B Wu, C Xu, X Dai, A Wan, P Zhang, Z Yan… - arXiv preprint arXiv …, 2020 - arxiv.org

Computer vision has achieved remarkable success by (a) representing images as uniformly-arranged
pixel arrays and (b) convolving highly-localized features. However, convolutions …

Cited by 505 · Related articles · All 3 versions

[PDF] arxiv.org

Volo: Vision outlooker for visual recognition

L Yuan, Q Hou, Z Jiang, J Feng… - IEEE transactions on …, 2022 - ieeexplore.ieee.org

… -to-token representation learning first proposed in our conference version with outlook
attention and presented a new model, Vision Outlooker (VOLO), for solving computer vision tasks. …

Cited by 281 · Related articles · All 7 versions

[PDF] thecvf.com

Seeing out of the box: End-to-end pre-training for vision-language representation learning

Z Huang, Z Zeng, Y Huang, B Liu… - … pattern recognition, 2021 - openaccess.thecvf.com

… visual representations … vision-language tasks [17] or vision recognition tasks [9, 32]. Our work
shares a similar format of visual representation with [17] while we focus on the area of vision-…

Cited by 268 · Related articles · All 6 versions

[PDF] arxiv.org

Vision mamba: Efficient visual representation learning with bidirectional state space model

L Zhu, B Liao, Q Zhang, X Wang, W Liu… - arXiv preprint arXiv …, 2024 - arxiv.org

… the success of Mamba to vision, i.e., building a generic vision backbone purely upon … Inspired
by ViT [14] and BERT [31], we also use class token to represent the whole patch sequence, …

Cited by 250 · Related articles · All 5 versions

[PDF] thecvf.com

Learning semantic-specific graph representation for multi-label image recognition

T Chen, M Xu, X Hui, H Wu… - … on computer vision, 2019 - openaccess.thecvf.com

… To address these issues, we propose a Semantic-Specific Graph Representation Learning
(… representations and 2) a semantic interaction module that correlates these representations …

Cited by 306 · Related articles · All 8 versions

[HTML] mdpi.com

[HTML] A comprehensive survey of vision-based human action recognition methods

HB Zhang, YX Zhang, B Zhong, Q Lei, L Yang, JX Du… - Sensors, 2019 - mdpi.com

… Feature representation and selection is a classic problem in computer vision and machine
learning [8]. Unlike feature representation in an image space, the feature representation of …

Cited by 503 · Related articles · All 9 versions

[PDF] thecvf.com

Gloria: A multimodal global-local representation learning framework for label-efficient medical image recognition

SC Huang, L Shen, MP Lungren… - … on Computer Vision, 2021 - openaccess.thecvf.com

… label-efficient multimodal medical imaging representations by leveraging radiology reports.
… the learned representations for various downstream medical image recognition tasks with …

Cited by 208 · Related articles · All 6 versions

[PDF] thecvf.com

12-in-1: Multi-task vision and language representation learning

J Lu, V Goswami, M Rohrbach… - … pattern recognition, 2020 - openaccess.thecvf.com

Much of vision-and-language research focuses on a small but diverse set of independent
tasks and supporting datasets often studied in isolation; however, the visually-grounded …

Cited by 515 · Related articles · All 7 versions
