Vision systems to see and reason about the compositional nature of visual scenes are fundamental to understanding our world. The complex relations between objects and their …
In this work, we present SEEM, a promotable and interactive model for segmenting everything everywhere all at once in an image. In SEEM, we propose a novel and versatile …
We present X-Decoder, a generalized decoding model that can predict pixel-level segmentation and language tokens seamlessly. X-Decoder takes as input two types of …
Recent progress has shown that large-scale pre-training using contrastive image-text pairs can be a promising alternative for high-quality visual representation learning from natural …
Neural compression is the application of neural networks and other machine learning methods to data compression. Recent advances in statistical machine learning have opened …
C Liu, H Ding, X Jiang - … of the IEEE/CVF conference on …, 2023 - openaccess.thecvf.com
Abstract Referring Expression Segmentation (RES) aims to generate a segmentation mask for the object described by a given language expression. Existing classic RES datasets and …
T Lüddecke, A Ecker - … of the IEEE/CVF conference on …, 2022 - openaccess.thecvf.com
Image segmentation is usually addressed by training a model for a fixed set of object classes. Incorporating additional classes or more complex queries later is expensive as it …
We design an open-vocabulary image segmentation model to organize an image into meaningful regions indicated by arbitrary texts. Recent works (CLIP and ALIGN), despite …
Z Wang, Y Lu, Q Li, X Tao, Y Guo… - Proceedings of the …, 2022 - openaccess.thecvf.com
Referring image segmentation aims to segment a referent via a natural linguistic expression. Due to the distinct data properties between text and image, it is challenging for a network to …