- Academic resource search

A comprehensive review of YOLO: From YOLOv1 to YOLOv8 and beyond

J Terven, D Cordova-Esparza - arXiv preprint arXiv:2304.00501, 2023 - arxiv.org

YOLO has become a central real-time object detection system for robotics, driverless cars,
and video monitoring applications. We present a comprehensive analysis of YOLO's …

Cited by 595 - Related articles - All 3 versions

[PDF] arxiv.org

A comprehensive survey on pretrained foundation models: A history from BERT to ChatGPT

C Zhou, Q Li, C Li, J Yu, Y Liu, G Wang… - arXiv preprint arXiv …, 2023 - arxiv.org

Pretrained Foundation Models (PFMs) are regarded as the foundation for various
downstream tasks with different data modalities. A PFM (e.g., BERT, ChatGPT, and GPT-4) is …

Cited by 369 - Related articles - All 2 versions

[PDF] thecvf.com

InternImage: Exploring large-scale vision foundation models with deformable convolutions

W Wang, J Dai, Z Chen, Z Huang, Z Li… - Proceedings of the …, 2023 - openaccess.thecvf.com

Compared to the great progress of large-scale vision transformers (ViTs) in recent years,
large-scale models based on convolutional neural networks (CNNs) are still in an early …

Cited by 426 - Related articles - All 8 versions

[PDF] neurips.cc

Segment everything everywhere all at once

X Zou, J Yang, H Zhang, F Li, L Li… - Advances in …, 2024 - proceedings.neurips.cc

In this work, we present SEEM, a promptable and interactive model for segmenting
everything everywhere all at once in an image. In SEEM, we propose a novel and versatile …

Cited by 283 - Related articles - All 4 versions

[PDF] neurips.cc

SegNeXt: Rethinking convolutional attention design for semantic segmentation

MH Guo, CZ Lu, Q Hou, Z Liu… - Advances in Neural …, 2022 - proceedings.neurips.cc

We present SegNeXt, a simple convolutional network architecture for semantic
segmentation. Recent transformer-based models have dominated the field of semantic …

Cited by 396 - Related articles - All 5 versions

[PDF] thecvf.com

LISA: Reasoning segmentation via large language model

X Lai, Z Tian, Y Chen, Y Li, Y Yuan… - Proceedings of the …, 2024 - openaccess.thecvf.com

Although perception systems have made remarkable advancements in recent years, they still
rely on explicit human instruction or pre-defined categories to identify the target objects …

Cited by 151 - Related articles - All 2 versions

[PDF] thecvf.com

Open-vocabulary semantic segmentation with mask-adapted CLIP

F Liang, B Wu, X Dai, K Li, Y Zhao… - Proceedings of the …, 2023 - openaccess.thecvf.com

Open-vocabulary semantic segmentation aims to segment an image into semantic regions
according to text descriptions, which may not have been seen during training. Recent two …

Cited by 262 - Related articles - All 7 versions

[HTML] sciencedirect.com

Artificial intelligence and smart vision for building and construction 4.0: Machine and deep learning methods and applications

SK Baduge, S Thilakarathna, JS Perera… - Automation in …, 2022 - Elsevier

This article presents a state-of-the-art review of the applications of Artificial Intelligence (AI),
Machine Learning (ML), and Deep Learning (DL) in building and construction industry 4.0 in …

Cited by 366 - Related articles - All 6 versions

[PDF] thecvf.com

Scaling up your kernels to 31x31: Revisiting large kernel design in CNNs

X Ding, X Zhang, J Han, G Ding - Proceedings of the IEEE …, 2022 - openaccess.thecvf.com

We revisit large kernel design in modern convolutional neural networks (CNNs). Inspired by
recent advances in vision transformers (ViTs), in this paper, we demonstrate that using a few …

Cited by 712 - Related articles - All 10 versions

[PDF] springer.com

Visual attention network

MH Guo, CZ Lu, ZN Liu, MM Cheng, SM Hu - Computational Visual Media, 2023 - Springer

While originally designed for natural language processing tasks, the self-attention
mechanism has recently taken various computer vision areas by storm. However, the 2D …

Cited by 490 - Related articles - All 8 versions
