OvarNet: Towards open-vocabulary object attribute recognition

K Chen, X Jiang, Y Hu, X Tang… - Proceedings of the …, 2023 - openaccess.thecvf.com
In this paper, we consider the problem of simultaneously detecting objects and inferring their
visual attributes in an image, even for those with no manual annotations provided at the …

Chop & learn: Recognizing and generating object-state compositions

N Saini, H Wang, A Swaminathan… - Proceedings of the …, 2023 - openaccess.thecvf.com
Recognizing and generating object-state compositions has been a challenging task,
especially when generalizing to unseen compositions. In this paper, we study the task of …

Learning conditional attributes for compositional zero-shot learning

Q Wang, L Liu, C Jing, H Chen… - Proceedings of the …, 2023 - openaccess.thecvf.com
Compositional Zero-Shot Learning (CZSL) aims to train models to recognize novel
compositional concepts based on learned concepts such as attribute-object combinations …

Composing object relations and attributes for image-text matching

K Pham, C Huynh, SN Lim… - Proceedings of the …, 2024 - openaccess.thecvf.com
We study the visual semantic embedding problem for image-text matching. Most existing
work utilizes a tailored cross-attention mechanism to perform local alignment across the two …

Improving closed and open-vocabulary attribute prediction using transformers

K Pham, K Kafle, Z Lin, Z Ding, S Cohen, Q Tran… - European conference on …, 2022 - Springer
We study recognizing attributes for objects in visual scenes. We consider attributes to be any
phrases that describe an object's physical and semantic properties, and its relationships with …

Learning attention as disentangler for compositional zero-shot learning

S Hao, K Han, KYK Wong - … of the IEEE/CVF Conference on …, 2023 - openaccess.thecvf.com
Compositional zero-shot learning (CZSL) aims at learning visual concepts (i.e., attributes and
objects) from seen compositions and combining concept knowledge into unseen …

Multi-task learning of object states and state-modifying actions from web videos

T Souček, JB Alayrac, A Miech, I Laptev… - IEEE Transactions on …, 2024 - computer.org
We aim to learn to temporally localize object state changes and the corresponding state-
modifying actions by observing people interacting with objects in long uncurated web …

Hierarchical visual primitive experts for compositional zero-shot learning

H Kim, J Lee, S Park, K Sohn - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Compositional zero-shot learning (CZSL) aims to recognize unseen compositions with prior
knowledge of known primitives (attribute and object). Previous works for CZSL often suffer …

SimpSON: Simplifying photo cleanup with single-click distracting object segmentation network

C Huynh, Y Zhou, Z Lin, C Barnes… - Proceedings of the …, 2023 - openaccess.thecvf.com
In photo editing, it is common practice to remove visual distractions to improve the overall
image quality and highlight the primary subject. However, manually selecting and removing …

Multi-task learning of object state changes from uncurated videos

T Souček, JB Alayrac, A Miech, I Laptev… - arXiv preprint arXiv …, 2022 - arxiv.org
We aim to learn to temporally localize object state changes and the corresponding state-
modifying actions by observing people interacting with objects in long uncurated web …