Dual encoding for video retrieval by text

J Dong, X Li, C Xu, X Yang, G Yang… - … on Pattern Analysis …, 2021 - ieeexplore.ieee.org
This paper attacks the challenging problem of video retrieval by text. In such a retrieval
paradigm, an end user searches for unlabeled videos by ad-hoc queries described …

Dual encoding for zero-example video retrieval

J Dong, X Li, C Xu, S Ji, Y He… - Proceedings of the …, 2019 - openaccess.thecvf.com
This paper attacks the challenging problem of zero-example video retrieval. In such a
retrieval paradigm, an end user searches for unlabeled videos by ad-hoc queries described …

Tree-augmented cross-modal encoding for complex-query video retrieval

X Yang, J Dong, Y Cao, X Wang, M Wang… - Proceedings of the 43rd …, 2020 - dl.acm.org
The rapid growth of user-generated videos on the Internet has intensified the need for
text-based video retrieval systems. Traditional methods mainly favor the concept-based …

W2VV++: Fully deep learning for ad-hoc video search

X Li, C Xu, G Yang, Z Chen, J Dong - Proceedings of the 27th ACM …, 2019 - dl.acm.org
Ad-hoc video search (AVS) is an important yet challenging problem in multimedia retrieval.
Different from previous concept-based methods, we propose a fully deep learning method …

HANet: Hierarchical alignment networks for video-text retrieval

P Wu, X He, M Tang, Y Lv, J Liu - Proceedings of the 29th ACM …, 2021 - dl.acm.org
Video-text retrieval is an important yet challenging task in vision-language understanding,
which aims to learn a joint embedding space where related video and text instances are …

A comprehensive review of the video-to-text problem

J Perez-Martin, B Bustos, SJF Guimaraes… - Artificial Intelligence …, 2022 - Springer
Research in the Vision and Language area encompasses challenging topics that seek to
connect visual and textual information. When the visual information is related to videos, this …

SEA: Sentence encoder assembly for video retrieval by textual queries

X Li, F Zhou, C Xu, J Ji, G Yang - IEEE Transactions on …, 2020 - ieeexplore.ieee.org
Retrieving unlabeled videos by textual queries, known as Ad-hoc Video Search (AVS), is a
core theme in multimedia data management and retrieval. The success of AVS counts on …

Lightweight attentional feature fusion: A new baseline for text-to-video retrieval

F Hu, A Chen, Z Wang, F Zhou, J Dong, X Li - European conference on …, 2022 - Springer
In this paper we revisit feature fusion, an old-fashioned topic, in the new context of
text-to-video retrieval. Different from previous research that considers feature fusion only at one …

Multi-task paired masking with alignment modeling for medical vision-language pre-training

K Zhang, Y Yang, J Yu, H Jiang, J Fan… - IEEE Transactions …, 2023 - ieeexplore.ieee.org
In recent years, the growing demand for medical imaging diagnosis has placed a significant
burden on radiologists. As a solution, Medical Vision-Language Pre-training (Med-VLP) …

Neural ranking models for document retrieval

M Trabelsi, Z Chen, BD Davison, J Heflin - Information Retrieval Journal, 2021 - Springer
Ranking models are the main components of information retrieval systems. Several
approaches to ranking are based on traditional machine learning algorithms using a set of …