Towards open vocabulary learning: A survey

J Wu, X Li, S Xu, H Yuan, H Ding… - … on Pattern Analysis …, 2024 - ieeexplore.ieee.org
In the field of visual scene understanding, deep neural networks have made impressive
advancements in various core tasks like segmentation, tracking, and detection. However …

A survey on open-vocabulary detection and segmentation: Past, present, and future

C Zhu, L Chen - IEEE Transactions on Pattern Analysis and …, 2024 - ieeexplore.ieee.org
As the most fundamental scene understanding tasks, object detection and segmentation
have made tremendous progress in deep learning era. Due to the expensive manual …

DST-Det: Simple dynamic self-training for open-vocabulary object detection

S Xu, X Li, S Wu, W Zhang, Y Li, G Cheng… - arXiv preprint arXiv …, 2023 - arxiv.org
Open-vocabulary object detection (OVOD) aims to detect the objects beyond the set of
categories observed during training. This work presents a simple yet effective strategy that …

Hyperbolic learning with synthetic captions for open-world detection

F Kong, Y Chen, J Cai… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
Open-world detection poses significant challenges as it requires the detection of any object
using either object class labels or free-form texts. Existing related works often use large …

Approaching outside: scaling unsupervised 3D object detection from 2D scene

R Zhang, H Zhang, H Yu, Z Zheng - European Conference on Computer …, 2024 - Springer
The unsupervised 3D object detection is to accurately detect objects in unstructured
environments with no explicit supervisory signals. This task, given sparse LiDAR point …

Mixed-Query Transformer: A Unified Image Segmentation Architecture

P Wang, Z Cai, H Yang, A Swaminathan… - arXiv preprint arXiv …, 2024 - arxiv.org
Existing unified image segmentation models either employ a unified architecture across
multiple tasks but use separate weights tailored to each dataset, or apply a single set of …

Asymmetric Aggregation Network for Accurate Ship Detection in Optical Imagery

Y Zhang, MJ Er - IEEE Transactions on Geoscience and …, 2024 - ieeexplore.ieee.org
Optical imagery ship detection has achieved significant developments recently. However,
accurate detection in complex scenes and for different-scale ships remains a vital challenge …

VisualSiteDiary: A detector-free Vision-Language Transformer model for captioning photologs for daily construction reporting and image retrievals

Y Jung, I Cho, SH Hsu, M Golparvar-Fard - Automation in Construction, 2024 - Elsevier
This paper presents VisualSiteDiary, a Vision Transformer-based image captioning model
which creates human-readable captions for daily progress and work activity log, and …

Retrieval-Augmented Open-Vocabulary Object Detection

J Kim, E Cho, S Kim, HJ Kim - Proceedings of the IEEE/CVF …, 2024 - openaccess.thecvf.com
Open-vocabulary object detection (OVD) has been studied with Vision-Language Models
(VLMs) to detect novel objects beyond the pre-trained categories. Previous approaches …

DST-Det: Open-Vocabulary Object Detection via Dynamic Self-Training

S Xu, X Li, S Wu, W Zhang, Y Tong… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Open-vocabulary object detection (OVOD) aims to detect the objects beyond the set of
classes observed during training. This work introduces a straightforward and efficient …