- Academic resource search

[HTML] RS-CLIP: Zero shot remote sensing scene classification via contrastive vision-language supervision

X Li, C Wen, Y Hu, N Zhou - … Journal of Applied Earth Observation and …, 2023 - Elsevier

Zero-shot remote sensing scene classification aims to solve the scene classification problem
on unseen categories and has attracted numerous research attention in the remote sensing …

Cited by 27 Related articles All 3 versions

[PDF] arxiv.org

Vision-language models in remote sensing: Current progress and future trends

X Li, C Wen, Y Hu, Z Yuan… - IEEE Geoscience and …, 2024 - ieeexplore.ieee.org

The remarkable achievements of ChatGPT and Generative Pre-trained Transformer 4 (GPT-4)
have sparked a wave of interest and research in the field of large language models …

Cited by 36 Related articles All 5 versions

[PDF] thecvf.com

Side adapter network for open-vocabulary semantic segmentation

M Xu, Z Zhang, F Wei, H Hu… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com

This paper presents a new framework for open-vocabulary semantic segmentation with the
pre-trained vision-language model, named SAN. Our approach models the semantic …

Cited by 190 Related articles All 6 versions

[PDF] arxiv.org

Vision-language models for vision tasks: A survey

J Zhang, J Huang, S Jin, S Lu - IEEE Transactions on Pattern …, 2024 - ieeexplore.ieee.org

Most visual recognition studies rely heavily on crowd-labelled data in deep neural networks
(DNNs) training, and they usually train a DNN for each single visual recognition task …

Cited by 228 Related articles All 9 versions

[PDF] thecvf.com

InstructDiffusion: A generalist modeling interface for vision tasks

Z Geng, B Yang, T Hang, C Li, S Gu… - Proceedings of the …, 2024 - openaccess.thecvf.com

We present InstructDiffusion, a unified and generic framework for aligning computer vision
tasks with human instructions. Unlike existing approaches that integrate prior knowledge …

Cited by 52 Related articles All 3 versions

[PDF] mlr.press

SegCLIP: Patch aggregation with learnable centers for open-vocabulary semantic segmentation

H Luo, J Bao, Y Wu, X He, T Li - International Conference on …, 2023 - proceedings.mlr.press

Recently, the contrastive language-image pre-training, e.g., CLIP, has demonstrated
promising results on various downstream tasks. The pre-trained model can capture enriched …

Cited by 109 Related articles All 6 versions

[PDF] thecvf.com

CLIP2Point: Transfer CLIP to point cloud classification with image-depth pre-training

T Huang, B Dong, Y Yang, X Huang… - Proceedings of the …, 2023 - openaccess.thecvf.com

Pre-training across 3D vision and language remains under development because of limited
training data. Recent works attempt to transfer vision-language (VL) pre-training methods to …

Cited by 103 Related articles All 7 versions

[PDF] thecvf.com

WinCLIP: Zero-/few-shot anomaly classification and segmentation

J Jeong, Y Zou, T Kim, D Zhang… - Proceedings of the …, 2023 - openaccess.thecvf.com

Visual anomaly classification and segmentation are vital for automating industrial quality
inspection. The focus of prior research in the field has been on training custom models for …

Cited by 127 Related articles All 10 versions

[PDF] ieee.org

Transformer-based visual segmentation: A survey

X Li, H Ding, H Yuan, W Zhang, J Pang… - … on Pattern Analysis …, 2024 - ieeexplore.ieee.org

Visual segmentation seeks to partition images, video frames, or point clouds into multiple
segments or groups. This technique has numerous real-world applications, such as …

Cited by 69 Related articles All 3 versions

[PDF] ieee.org

Towards open vocabulary learning: A survey

J Wu, X Li, S Xu, H Yuan, H Ding… - … on Pattern Analysis …, 2024 - ieeexplore.ieee.org

In the field of visual scene understanding, deep neural networks have made impressive
advancements in various core tasks like segmentation, tracking, and detection. However …

Cited by 79 Related articles All 10 versions
