Recently, zero-shot and few-shot learning via Contrastive Vision-Language Pre-training (CLIP) has shown inspiring performance on 2D visual recognition; the model learns to …
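Since this snippet truncates where CLIP's matching mechanism is introduced, a minimal sketch of CLIP-style zero-shot classification may help. It assumes the open-source `clip` package from OpenAI (`clip.load`, `clip.tokenize`); the image path and class names are illustrative placeholders, not from any of the papers listed here.

```python
import clip
import torch
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

# Hypothetical label set; each class name becomes a text prompt.
class_names = ["airplane", "chair", "guitar"]
prompts = clip.tokenize([f"a photo of a {c}" for c in class_names]).to(device)
image = preprocess(Image.open("query.jpg")).unsqueeze(0).to(device)  # placeholder path

with torch.no_grad():
    # Embed both modalities into the shared space learned by contrastive pre-training.
    image_feat = model.encode_image(image)
    text_feat = model.encode_text(prompts)

# Cosine similarity between the image and every class prompt acts as the classifier,
# so unseen classes need only a text description, not labeled training data.
image_feat = image_feat / image_feat.norm(dim=-1, keepdim=True)
text_feat = text_feat / text_feat.norm(dim=-1, keepdim=True)
probs = (100.0 * image_feat @ text_feat.T).softmax(dim=-1)
print(dict(zip(class_names, probs[0].tolist())))
```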
Pre-training across 3D vision and language remains underdeveloped due to the scarcity of training data. Recent works attempt to transfer vision-language (VL) pre-training methods to …
Y. Zeng, C. Jiang, J. Mao, J. Han, C. Ye, et al. - Proceedings of the …, 2023 - openaccess.thecvf.com
Contrastive Language-Image Pre-training, benefiting from large-scale unlabeled text-image pairs, has demonstrated strong performance in open-world vision understanding …
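The contrastive objective this snippet refers to is CLIP's symmetric InfoNCE loss over a batch of paired image/text embeddings. Below is a minimal sketch; the function name and temperature value are illustrative, but the structure (matched pairs on the diagonal, cross-entropy in both directions) is the standard CLIP formulation.

```python
import torch
import torch.nn.functional as F

def clip_contrastive_loss(image_feats: torch.Tensor,
                          text_feats: torch.Tensor,
                          temperature: float = 0.07) -> torch.Tensor:
    """Symmetric InfoNCE loss over (batch, dim) image and text embeddings.

    Each image is pulled toward its own caption and pushed away from the
    other captions in the batch, and vice versa."""
    image_feats = F.normalize(image_feats, dim=-1)
    text_feats = F.normalize(text_feats, dim=-1)
    logits = image_feats @ text_feats.T / temperature  # pairwise similarities
    targets = torch.arange(len(logits), device=logits.device)  # matched pairs sit on the diagonal
    return (F.cross_entropy(logits, targets) +
            F.cross_entropy(logits.T, targets)) / 2
```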
Recent advances in 3D semantic segmentation with deep neural networks have shown remarkable success, with rapid performance increases on available datasets. However …
Large-scale pre-trained models have shown promising open-world performance for both vision and language tasks. However, their transferability to 3D point clouds is still …
J. Zhang, R. Dong, K. Ma - Proceedings of the IEEE/CVF …, 2023 - openaccess.thecvf.com
Training a 3D scene understanding model requires complicated human annotations, which are laborious to collect and yield a model that encodes only closed-set object semantics. In …
Y. Lu, Q. Jiang, R. Chen, Y. Hou, et al. - Proceedings of the …, 2023 - openaccess.thecvf.com
Zero-shot point cloud segmentation aims to make deep models capable of recognizing novel objects in point clouds that are unseen during training. Recent trends favor the …
Few-shot class incremental learning (FSCIL) aims to incrementally add sets of novel classes to a well-trained base model across multiple training sessions, with the restriction that only a few …
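The FSCIL protocol described in this snippet can be made concrete with a schematic session loop. This is only a sketch of the evaluation setting, not any paper's method; the `fit`, `update_few_shot`, and `evaluate` hooks on `model` are hypothetical names standing in for whatever learner is being tested.

```python
from typing import Dict, List

def fscil_protocol(model,
                   base_session: Dict[str, List],
                   incremental_sessions: List[Dict[str, List]]) -> None:
    """Schematic FSCIL loop: one fully supervised base session, then several
    few-shot sessions that each add novel classes (sessions map class name
    to its training samples)."""
    seen_classes = list(base_session)
    model.fit(base_session)  # base classes come with abundant labeled data

    for session in incremental_sessions:
        # Each session contributes only a few labeled samples per novel class
        # (e.g. a 5-way 5-shot session), and data from earlier sessions is
        # no longer available.
        model.update_few_shot(session)
        seen_classes += list(session)
        # Evaluation always covers every class seen so far, so the model must
        # absorb novel classes without forgetting the old ones.
        print(f"accuracy over {len(seen_classes)} classes:",
              model.evaluate(seen_classes))
```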