Animating images to transfer clip for video-text retrieval

P Wu, X Zhou, G Pang, Y Sun, J Liu… - Proceedings of the …, 2024 - openaccess.thecvf.com

Current video anomaly detection (VAD) approaches with weak supervisions are inherently
limited to a closed-set setting and may struggle in open-world applications where there can …

Cited by 20 · Related articles · All 4 versions

Dual learning with dynamic knowledge distillation for partially relevant video retrieval

J Dong, M Zhang, Z Zhang, X Chen… - Proceedings of the …, 2023 - openaccess.thecvf.com

Almost all previous text-to-video retrieval works assume that videos are pre-trimmed with
short durations. However, in practice, videos are generally untrimmed containing much …

Cited by 13 · Related articles · All 4 versions

Dual alignment unsupervised domain adaptation for video-text retrieval

X Hao, W Zhang, D Wu, F Zhu… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com

Video-text retrieval is an emerging stream in both computer vision and natural language
processing communities, which aims to find relevant videos given text queries. In this paper …

Cited by 20 · Related articles · All 3 versions

Fine-grained textual inversion network for zero-shot composed image retrieval

H Lin, H Wen, X Song, M Liu, Y Hu, L Nie - Proceedings of the 47th …, 2024 - dl.acm.org

Composed Image Retrieval (CIR) allows users to search target images with a multimodal
query, comprising a reference image and a modification text that describes the user's …

Cited by 5 · Related articles · All 2 versions

Adapting generative pretrained language model for open-domain multimodal sentence summarization

D Lin, L Jing, X Song, M Liu, T Sun, L Nie - Proceedings of the 46th …, 2023 - dl.acm.org

Multimodal sentence summarization, aiming to generate a brief summary of the source
sentence and image, is a new yet challenging task. Although existing methods have …

Cited by 12 · Related articles · All 2 versions

Uncertainty-aware alignment network for cross-domain video-text retrieval

X Hao, W Zhang - Advances in Neural Information …, 2024 - proceedings.neurips.cc

Video-text retrieval is an important but challenging research task in the multimedia
community. In this paper, we address the challenge task of Unsupervised Domain …

Cited by 8 · Related articles · All 3 versions

Text Is MASS: Modeling as Stochastic Embedding for Text-Video Retrieval

J Wang, G Sun, P Wang, D Liu… - Proceedings of the …, 2024 - openaccess.thecvf.com

The increasing prevalence of video clips has sparked growing interest in text-video retrieval.
Recent advances focus on establishing a joint embedding space for text and video relying …

Cited by 20 · Related articles · All 3 versions

Audio-enhanced text-to-video retrieval using text-conditioned feature alignment

S Ibrahimi, X Sun, P Wang, A Garg… - Proceedings of the …, 2023 - openaccess.thecvf.com

Text-to-video retrieval systems have recently made significant progress by utilizing
pre-trained models trained on large-scale image-text pairs. However, most of the latest methods …

Cited by 14 · Related articles · All 6 versions

Text-to-motion retrieval: Towards joint understanding of human motion data and natural language

N Messina, J Sedmidubsky, F Falchi… - Proceedings of the 46th …, 2023 - dl.acm.org

Due to recent advances in pose-estimation methods, human motion can be extracted from a
common video in the form of 3D skeleton sequences. Despite wonderful application …

Cited by 13 · Related articles · All 5 versions

Text-Video Retrieval via Multi-Modal Hypergraph Networks

Q Li, L Su, J Zhao, L Xia, H Cai, S Cheng… - Proceedings of the 17th …, 2024 - dl.acm.org

Text-video retrieval is a challenging task that aims to identify relevant videos given textual
queries. Compared to conventional textual retrieval, the main obstacle for text-video retrieval …

Cited by 5 · Related articles
