We introduce Ego4D, a massive-scale egocentric video dataset and benchmark suite. It offers 3,670 hours of daily-life activity video spanning hundreds of scenarios (household …
Video-Language Pretraining (VLP), which aims to learn transferable representations to advance a wide range of video-text downstream tasks, has recently received increasing …
KQ Lin, P Zhang, J Chen… - Proceedings of the …, 2023 - openaccess.thecvf.com
Video Temporal Grounding (VTG), which aims to ground target clips from videos (such as consecutive intervals or disjoint shots) according to custom language queries (e.g. …
P Li, CW Xie, H Xie, L Zhao, L Zhang… - Advances in neural …, 2024 - proceedings.neurips.cc
Video moment retrieval pursues an efficient and generalized solution to identify the specific temporal segments within an untrimmed video that correspond to a given language …
Temporal sentence grounding in videos (TSGV), a.k.a. natural language video localization (NLVL) or video moment retrieval (VMR), aims to retrieve a temporal moment that …
J Lei, TL Berg, M Bansal - Advances in Neural Information …, 2021 - proceedings.neurips.cc
Detecting customized moments and highlights from videos given natural language (NL) user queries is an important but under-studied topic. One of the challenges in pursuing this …
WJ Moon, S Hyun, SU Park, D Park… - Proceedings of the …, 2023 - openaccess.thecvf.com
Recently, video moment retrieval and highlight detection (MR/HD) have been spotlighted as the demand for video understanding has drastically increased. The key objective of MR/HD is …
J Tan, J Tang, L Wang, G Wu - Proceedings of the IEEE …, 2021 - openaccess.thecvf.com
Temporal action proposal generation is an important and challenging task in video understanding, which aims at detecting all temporal segments containing action instances of …
We consider the problem of localizing a spatio-temporal tube in a video corresponding to a given text query. This is a challenging task that requires the joint and efficient modeling of …