M Li, H Wang, W Zhang, J Miao… - Proceedings of the …, 2023 - openaccess.thecvf.com
Spatio-temporal video grounding aims to localize the aligned visual tube corresponding to a language query. Existing techniques achieve such alignment by exploiting dense boundary …
G Li, V Jampani, D Sun… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Humans excel at acquiring knowledge through observation. For example, we can learn to use new tools by watching demonstrations. This skill is fundamental for intelligent systems to …
Weakly Supervised Object Localization (WSOL), which aims to localize objects by only using image-level labels, has attracted much attention because of its low annotation …
W Zhai, P Wu, K Zhu, Y Cao, F Wu, ZJ Zha - International Journal of …, 2024 - Springer
Weakly supervised object localization and semantic segmentation aim to localize objects using only image-level labels. Recently, a new paradigm has emerged by generating a …
J Lee, S Lee, J Nam, S Yu, J Do… - Proceedings of the …, 2023 - openaccess.thecvf.com
Referring image segmentation (RIS) aims to localize the object in an image referred by a natural language expression. Most previous studies learn RIS with a large-scale dataset …
Z Chen, J Ding, L Cao, Y Shen… - Proceedings of the …, 2023 - openaccess.thecvf.com
Weakly supervised object localization (WSOL) aims to localize objects based on only image-level labels as supervision. Recently, transformers have been introduced into WSOL …
Self-supervised vision transformers (SSTs) have shown great potential to yield rich localization maps that highlight different objects in an image. However, these maps remain …
J Lee, E Kim, J Mok, S Yoon - IEEE transactions on pattern …, 2022 - ieeexplore.ieee.org
Obtaining accurate pixel-level localization from class labels is a crucial process in weakly supervised semantic segmentation and object localization. Attribution maps from a trained …