Deep high-resolution representation learning for visual recognition

Z Xing, Q Feng, H Chen, Q Dai, H Hu, H Xu… - ACM Computing …, 2024 - dl.acm.org

The recent wave of AI-generated content (AIGC) has witnessed substantial success in
computer vision, with the diffusion model playing a crucial role in this achievement. Due to …

被引用次数：76 相关文章所有 3 个版本

[HTML] mdpi.com

[HTML][HTML] Review of image classification algorithms based on convolutional neural networks

L Chen, S Li, Q Bai, J Yang, S Jiang, Y Miao - Remote Sensing, 2021 - mdpi.com

Image classification has always been a hot research direction in the world, and the
emergence of deep learning has promoted the development of this field. Convolutional …

被引用次数：546 相关文章所有 5 个版本

[PDF] arxiv.org

Vision mamba: Efficient visual representation learning with bidirectional state space model

L Zhu, B Liao, Q Zhang, X Wang, W Liu… - arXiv preprint arXiv …, 2024 - arxiv.org

Recently the state space models (SSMs) with efficient hardware-aware designs, ie, the
Mamba deep learning model, have shown great potential for long sequence modeling …

被引用次数：796 相关文章所有 5 个版本

[PDF] sciencedirect.com

Segment anything model for medical image analysis: an experimental study

MA Mazurowski, H Dong, H Gu, J Yang, N Konz… - Medical Image …, 2023 - Elsevier

Training segmentation models for medical images continues to be challenging due to the
limited availability of data annotations. Segment Anything Model (SAM) is a foundation …

被引用次数：421 相关文章所有 5 个版本

[PDF] neurips.cc

Segnext: Rethinking convolutional attention design for semantic segmentation

MH Guo, CZ Lu, Q Hou, Z Liu… - Advances in Neural …, 2022 - proceedings.neurips.cc

We present SegNeXt, a simple convolutional network architecture for semantic
segmentation. Recent transformer-based models have dominated the field of se-mantic …

被引用次数：634 相关文章所有 6 个版本

[PDF] arxiv.org

Bevfusion: Multi-task multi-sensor fusion with unified bird's-eye view representation

Z Liu, H Tang, A Amini, X Yang, H Mao… - … on robotics and …, 2023 - ieeexplore.ieee.org

Multi-sensor fusion is essential for an accurate and reliable autonomous driving system.
Recent approaches are based on point-level fusion: augmenting the LiDAR point cloud with …

被引用次数：865 相关文章所有 4 个版本

[PDF] thecvf.com

Scaling up your kernels to 31x31: Revisiting large kernel design in cnns

X Ding, X Zhang, J Han, G Ding - Proceedings of the IEEE …, 2022 - openaccess.thecvf.com

We revisit large kernel design in modern convolutional neural networks (CNNs). Inspired by
recent advances in vision transformers (ViTs), in this paper, we demonstrate that using a few …

被引用次数：1040 相关文章所有 10 个版本

[PDF] springer.com

Visual attention network

MH Guo, CZ Lu, ZN Liu, MM Cheng, SM Hu - Computational Visual Media, 2023 - Springer

While originally designed for natural language processing tasks, the self-attention
mechanism has recently taken various computer vision areas by storm. However, the 2D …

被引用次数：739 相关文章所有 8 个版本

An effective CNN and Transformer complementary network for medical image segmentation

F Yuan, Z Zhang, Z Fang - Pattern Recognition, 2023 - Elsevier

The Transformer network was originally proposed for natural language processing. Due to
its powerful representation ability for long-range dependency, it has been extended for …

被引用次数：275 相关文章所有 3 个版本

[PDF] thecvf.com

PIDNet: A real-time semantic segmentation network inspired by PID controllers

J Xu, Z Xiong, SP Bhattacharyya - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com

Two-branch network architecture has shown its efficiency and effectiveness in real-time
semantic segmentation tasks. However, direct fusion of high-resolution details and low …

被引用次数：328 相关文章所有 8 个版本

高级搜索

QQ 群