RS-CLIP: Zero shot remote sensing scene classification via contrastive vision-language supervision

X Li, C Wen, Y Hu, N Zhou - … Journal of Applied Earth Observation and …, 2023 - Elsevier
Zero-shot remote sensing scene classification aims to solve the scene classification problem
on unseen categories and has attracted considerable research attention in the remote sensing …

Vision-language models in remote sensing: Current progress and future trends

X Li, C Wen, Y Hu, Z Yuan… - IEEE Geoscience and …, 2024 - ieeexplore.ieee.org
The remarkable achievements of ChatGPT and Generative Pre-trained Transformer 4 (GPT-4)
have sparked a wave of interest and research in the field of large language models …

Side adapter network for open-vocabulary semantic segmentation

M Xu, Z Zhang, F Wei, H Hu… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
This paper presents a new framework for open-vocabulary semantic segmentation with the
pre-trained vision-language model, named SAN. Our approach models the semantic …

Vision-language models for vision tasks: A survey

J Zhang, J Huang, S Jin, S Lu - IEEE Transactions on Pattern …, 2024 - ieeexplore.ieee.org
Most visual recognition studies rely heavily on crowd-labelled data in deep neural networks
(DNNs) training, and they usually train a DNN for each single visual recognition task …

InstructDiffusion: A generalist modeling interface for vision tasks

Z Geng, B Yang, T Hang, C Li, S Gu… - Proceedings of the …, 2024 - openaccess.thecvf.com
We present InstructDiffusion, a unified and generic framework for aligning computer vision
tasks with human instructions. Unlike existing approaches that integrate prior knowledge …

SegCLIP: Patch aggregation with learnable centers for open-vocabulary semantic segmentation

H Luo, J Bao, Y Wu, X He, T Li - International Conference on …, 2023 - proceedings.mlr.press
Recently, the contrastive language-image pre-training, e.g., CLIP, has demonstrated
promising results on various downstream tasks. The pre-trained model can capture enriched …

CLIP2Point: Transfer CLIP to point cloud classification with image-depth pre-training

T Huang, B Dong, Y Yang, X Huang… - Proceedings of the …, 2023 - openaccess.thecvf.com
Pre-training across 3D vision and language remains under development because of limited
training data. Recent works attempt to transfer vision-language (VL) pre-training methods to …

WinCLIP: Zero-/few-shot anomaly classification and segmentation

J Jeong, Y Zou, T Kim, D Zhang… - Proceedings of the …, 2023 - openaccess.thecvf.com
Visual anomaly classification and segmentation are vital for automating industrial quality
inspection. The focus of prior research in the field has been on training custom models for …

Transformer-based visual segmentation: A survey

X Li, H Ding, H Yuan, W Zhang, J Pang… - … on Pattern Analysis …, 2024 - ieeexplore.ieee.org
Visual segmentation seeks to partition images, video frames, or point clouds into multiple
segments or groups. This technique has numerous real-world applications, such as …

Towards open vocabulary learning: A survey

J Wu, X Li, S Xu, H Yuan, H Ding… - … on Pattern Analysis …, 2024 - ieeexplore.ieee.org
In the field of visual scene understanding, deep neural networks have made impressive
advancements in various core tasks like segmentation, tracking, and detection. However …