Rethinking the video sampling and reasoning strategies for temporal sentence grounding

J Jang, J Park, J Kim, H Kwon… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com

Recent DETR-based video grounding models have made the model directly predict moment
timestamps without any hand-crafted components, such as a pre-defined proposal or non …

Cited by 51 · Related articles · All 8 versions

Rethinking weakly-supervised video temporal grounding from a game perspective

X Fang, Z Xiong, W Fang, X Qu, C Chen, J Dong… - … on Computer Vision, 2025 - Springer

This paper addresses the challenging task of weakly-supervised video temporal grounding.
Existing approaches are generally based on the moment proposal selection framework that …

Cited by 10 · Related articles · All 4 versions

Correlation-guided query-dependency calibration in video representation learning for temporal grounding

WJ Moon, S Hyun, SB Lee, JP Heo - CoRR, 2023 - openreview.net

Temporal Grounding is to identify specific moments or highlights from a video corresponding
to textual descriptions. Typical approaches in temporal grounding treat all video clips …

Cited by 33 · Related articles · All 2 versions

Collaborative debias strategy for temporal sentence grounding in video

Z Qi, Y Yuan, X Ruan, S Wang… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org

Temporal sentence grounding in video has witnessed significant advancements, but suffers
from substantial dataset bias, which undermines its generalization ability. Existing debias …

Cited by 4 · Related articles

Filling the Information Gap between Video and Query for Language-Driven Moment Retrieval

D Liu, X Qu, J Dong, G Nan, P Zhou, Z Xu… - Proceedings of the 31st …, 2023 - dl.acm.org

This paper addresses the challenging task of language-driven moment retrieval. Previous
methods are typically trained to localize the target moment corresponding to a single …

Cited by 8 · Related articles · All 2 versions

SnAG: Scalable and Accurate Video Grounding

F Mu, S Mo, Y Li - … of the IEEE/CVF Conference on …, 2024 - openaccess.thecvf.com

Temporal grounding of text descriptions in videos is a central problem in vision-language
learning and video understanding. Existing methods often prioritize accuracy over scalability …

Cited by 6 · Related articles · All 3 versions

Conditional Video Diffusion Network for Fine-grained Temporal Sentence Grounding

D Liu, J Zhu, X Fang, Z Xiong, H Wang… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org

Temporal sentence grounding (TSG) aims to locate a semantically related segment of an
untrimmed video guided by a sentence query. Since the untrimmed videos are too long …

Cited by 11 · Related articles

Transform-Equivariant Consistency Learning for Temporal Sentence Grounding

D Liu, X Qu, J Dong, P Zhou, Z Xu, H Wang… - ACM Transactions on …, 2024 - dl.acm.org

This paper addresses the temporal sentence grounding (TSG). Although existing methods
have made decent achievements in this task, they not only severely rely on abundant video …

Cited by 9 · Related articles · All 3 versions

Unsupervised Domain Adaptative Temporal Sentence Localization with Mutual Information Maximization

D Liu, X Fang, X Qu, J Dong, H Yan, Y Yang… - Proceedings of the …, 2024 - ojs.aaai.org

Temporal sentence localization (TSL) aims to localize a target segment in a video according
to a given sentence query. Though respectable works have made decent achievements in …

Cited by 6 · Related articles

Tracking Objects and Activities with Attention for Temporal Sentence Grounding

Z Xiong, D Liu, P Zhou, J Zhu - ICASSP 2023-2023 IEEE …, 2023 - ieeexplore.ieee.org

Temporal sentence grounding (TSG) aims to localize the temporal segment which is
semantically aligned with a natural language query in an untrimmed video. Most existing …

Cited by 6 · Related articles · All 3 versions
