G Wu, T Xu, J Zhang - Computer Vision and Image Understanding, 2024 - Elsevier
Temporal language localization in videos aims to retrieve, from an untrimmed video, the
moment that best matches a text query. Existing methods using graph …