LISA: Reasoning segmentation via large language model

X Lai, Z Tian, Y Chen, Y Li, Y Yuan… - Proceedings of the …, 2024 - openaccess.thecvf.com
Although perception systems have made remarkable advancements in recent years, they still
rely on explicit human instruction or pre-defined categories to identify the target objects …

Spherical transformer for LiDAR-based 3D recognition

X Lai, Y Chen, F Lu, J Liu, J Jia - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
LiDAR-based 3D point cloud recognition has benefited various applications. Without
specially considering the LiDAR point distribution, most current methods suffer from …

Hierarchical dense correlation distillation for few-shot segmentation

B Peng, Z Tian, X Wu, C Wang… - Proceedings of the …, 2023 - openaccess.thecvf.com
Few-shot semantic segmentation (FSS) aims to form class-agnostic models segmenting
unseen classes with only a handful of annotations. Previous methods limited to the semantic …

See, Say, and Segment: Teaching LMMs to Overcome False Premises

TH Wu, G Biamby, D Chan, L Dunlap… - Proceedings of the …, 2024 - openaccess.thecvf.com
Current open-source Large Multimodal Models (LMMs) excel at tasks such as open-vocabulary
language grounding and segmentation but can suffer under false premises …

OA-CNNs: Omni-Adaptive Sparse CNNs for 3D Semantic Segmentation

B Peng, X Wu, L Jiang, Y Chen… - Proceedings of the …, 2024 - openaccess.thecvf.com
The booming of 3D recognition in the 2020s began with the introduction of point cloud
transformers. They quickly overwhelmed sparse CNNs and became state-of-the-art models …

Boosting few-shot 3D point cloud segmentation via query-guided enhancement

Z Ning, Z Tian, G Lu, W Pei - Proceedings of the 31st ACM International …, 2023 - dl.acm.org
Although extensive research has been conducted on 3D point cloud segmentation,
effectively adapting generic models to novel categories remains a formidable challenge …

SaCo Loss: Sample-wise Affinity Consistency for Vision-Language Pre-training

S Wu, H Tan, Z Tian, Y Chen… - Proceedings of the …, 2024 - openaccess.thecvf.com
Vision-language pre-training (VLP) aims to learn joint representations of vision and
language modalities. The contrastive paradigm is currently dominant in this field. However …

Unified Language-driven Zero-shot Domain Adaptation

S Yang, Z Tian, L Jiang, J Jia - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
This paper introduces Unified Language-driven Zero-shot Domain Adaptation
(ULDA), a novel task setting that enables a single model to adapt to diverse target domains …

Rethinking Prior Information Generation with CLIP for Few-Shot Segmentation

J Wang, B Zhang, J Pang, H Chen… - Proceedings of the …, 2024 - openaccess.thecvf.com
Few-shot segmentation remains challenging due to the limitations of its labeling information
for unseen classes. Most previous approaches rely on extracting high-level feature maps …

Semantic and Spatial Adaptive Pixel-level Classifier for Semantic Segmentation

X Ma, Z Ni, X Chen - arXiv preprint arXiv:2405.06525, 2024 - arxiv.org
Vanilla pixel-level classifiers for semantic segmentation are based on a certain paradigm,
involving the inner product of fixed prototypes obtained from the training set and pixel …