Temporal sentence grounding in videos: A survey and future directions

H Zhang, A Sun, W Jing, JT Zhou - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Temporal sentence grounding in videos (TSGV), also known as natural language video localization
(NLVL) or video moment retrieval (VMR), aims to retrieve a temporal moment that …
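
Note: the task defined above (retrieving a temporal moment for a natural-language query) is commonly evaluated with temporal Intersection over Union (IoU) between the predicted moment and the ground-truth span, reported as recall at IoU thresholds (e.g., "R@n, IoU=m"). The short Python sketch below is purely illustrative; the function name and the example timestamps are assumptions and are not taken from any paper listed here.

    def temporal_iou(pred, gt):
        """Temporal IoU of two (start, end) spans given in seconds.

        Illustrative sketch only; not code from any paper above.
        """
        # Overlap between the two spans, clamped at zero when they are disjoint.
        inter = max(0.0, min(pred[1], gt[1]) - max(pred[0], gt[0]))
        # Union = sum of lengths minus the overlap.
        union = (pred[1] - pred[0]) + (gt[1] - gt[0]) - inter
        return inter / union if union > 0 else 0.0

    # Example: predicted moment (12.0 s, 25.0 s) vs. ground truth (10.0 s, 22.0 s)
    print(temporal_iou((12.0, 25.0), (10.0, 22.0)))  # -> 0.666..., i.e., 10 s overlap / 15 s union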

You can ground earlier than see: An effective and efficient pipeline for temporal sentence grounding in compressed videos

X Fang, D Liu, P Zhou, G Nan - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Given an untrimmed video, temporal sentence grounding (TSG) aims to locate a target
moment semantically according to a sentence query. Although previous respectable works …

Memory-guided semantic learning network for temporal sentence grounding

D Liu, X Qu, X Di, Y Cheng, Z Xu, P Zhou - Proceedings of the AAAI …, 2022 - ojs.aaai.org
Temporal sentence grounding (TSG) is crucial and fundamental for video understanding.
Although existing methods train well-designed deep networks with large amounts of data, we …

Reducing the vision and language bias for temporal sentence grounding

D Liu, X Qu, W Hu - Proceedings of the 30th ACM International …, 2022 - dl.acm.org
Temporal sentence grounding (TSG) is an important yet challenging task in multimedia
information retrieval. Although previous TSG methods have achieved decent performance …

Skimming, locating, then perusing: A human-like framework for natural language video localization

D Liu, W Hu - Proceedings of the 30th ACM International Conference …, 2022 - dl.acm.org
This paper addresses the problem of natural language video localization (NLVL). Almost all
existing works follow the" only look once" framework that exploits a single model to directly …

Multi-modal cross-domain alignment network for video moment retrieval

X Fang, D Liu, P Zhou, Y Hu - IEEE Transactions on Multimedia, 2022 - ieeexplore.ieee.org
As an increasingly popular task in multimedia information retrieval, video moment retrieval
(VMR) aims to localize the target moment from an untrimmed video according to a given …

Zero-shot video grounding with pseudo query lookup and verification

Y Lu, R Quan, L Zhu, Y Yang - IEEE Transactions on Image …, 2024 - ieeexplore.ieee.org
Video grounding, the process of identifying a specific moment in an untrimmed video based
on a natural language query, has become a popular topic in video understanding. However …

Hierarchical contrast for unsupervised skeleton-based action representation learning

J Dong, S Sun, Z Liu, S Chen, B Liu… - Proceedings of the AAAI …, 2023 - ojs.aaai.org
This paper targets unsupervised skeleton-based action representation learning and
proposes a new Hierarchical Contrast (HiCo) framework. Different from the existing …

Exploring motion and appearance information for temporal sentence grounding

D Liu, X Qu, P Zhou, Y Liu - Proceedings of the AAAI Conference on …, 2022 - ojs.aaai.org
This paper addresses temporal sentence grounding. Previous works typically solve this task
by learning frame-level video features and aligning them with the textual information. A major …

Hierarchical local-global transformer for temporal sentence grounding

X Fang, D Liu, P Zhou, Z Xu, R Li - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
This article studies the multimedia problem of temporal sentence grounding (TSG), which
aims to accurately determine the specific video segment in an untrimmed video according to …