Towards bridging event captioner and sentence localizer for weakly supervised dense event captioning

S Chen, YG Jiang - … of the IEEE/CVF Conference on …, 2021 - openaccess.thecvf.com
Dense Event Captioning (DEC) aims to jointly localize and describe multiple events
of interest in untrimmed videos, which is an advancement of the conventional video …

Weakly supervised dense event captioning in videos

X Duan, W Huang, C Gan, J Wang… - Advances in Neural …, 2018 - proceedings.neurips.cc
Dense event captioning aims to detect and describe all events of interest contained in a
video. Despite the advanced development in this area, existing methods tackle this task by …

Event-centric hierarchical representation for dense video captioning

T Wang, H Zheng, M Yu, Q Tian… - IEEE Transactions on …, 2020 - ieeexplore.ieee.org
Dense video captioning aims to localize and describe multiple events in untrimmed videos,
which is a challenging task that draws attention recently in computer vision. Although …

Sketch, ground, and refine: Top-down dense video captioning

C Deng, S Chen, D Chen, Y He… - Proceedings of the IEEE …, 2021 - openaccess.thecvf.com
The dense video captioning task aims to detect and describe a sequence of events in a
video for detailed and coherent storytelling. Previous works mainly adopt a "detect-then …

Unifying event detection and captioning as sequence generation via pre-training

Q Zhang, Y Song, Q Jin - European Conference on Computer Vision, 2022 - Springer
Dense video captioning aims to generate corresponding text descriptions for a series of
events in the untrimmed video, which can be divided into two sub-tasks, event detection and …

Bidirectional attentive fusion with context gating for dense video captioning

J Wang, W Jiang, L Ma, W Liu… - Proceedings of the IEEE …, 2018 - openaccess.thecvf.com
Dense video captioning is a newly emerging task that aims at both localizing and describing
all events in a video. We identify and tackle two challenges on this task, namely, (1) how to …

Multi-modal dense video captioning

V Iashin, E Rahtu - … of the IEEE/CVF conference on …, 2020 - openaccess.thecvf.com
Dense video captioning is a task of localizing interesting events from an untrimmed video
and producing textual description (captions) for each localized event. Most of the previous …

Streamlined dense video captioning

J Mun, L Yang, Z Ren, N Xu… - Proceedings of the IEEE …, 2019 - openaccess.thecvf.com
Dense video captioning is an extremely challenging task since accurate and coherent
description of events in a video requires holistic understanding of video contents as well as …

End-to-end dense video captioning with parallel decoding

T Wang, R Zhang, Z Lu, F Zheng… - Proceedings of the …, 2021 - openaccess.thecvf.com
Dense video captioning aims to generate multiple associated captions with their temporal
locations from the video. Previous methods follow a sophisticated "localize-then-describe" …

Jointly localizing and describing events for dense video captioning

Y Li, T Yao, Y Pan, H Chao… - Proceedings of the IEEE …, 2018 - openaccess.thecvf.com
Automatically describing a video with natural language is regarded as a fundamental
challenge in computer vision. The problem nevertheless is not trivial especially when a …