Long-form video-language pre-training with multimodal temporal contrastive learning

Y Sun, H Xue, R Song, B Liu… - Advances in neural …, 2022 - proceedings.neurips.cc
Large-scale video-language pre-training has shown significant improvement in video-
language understanding tasks. Previous studies of video-language pretraining mainly focus …

Efficient movie scene detection using state-space transformers

MM Islam, M Hasan, KS Athrey… - Proceedings of the …, 2023 - openaccess.thecvf.com
The ability to distinguish between different movie scenes is critical for understanding the
storyline of a movie. However, accurately detecting movie scenes is often challenging as it …

HSA-RNN: Hierarchical structure-adaptive RNN for video summarization

B Zhao, X Li, X Lu - Proceedings of the IEEE conference on …, 2018 - openaccess.thecvf.com
Although video summarization has achieved great success in recent years, few approaches
have realized the influence of video structure on the summarization results. As we know, the …

Hierarchical boundary-aware neural encoder for video captioning

L Baraldi, C Grana, R Cucchiara - Proceedings of the IEEE …, 2017 - openaccess.thecvf.com
The use of Recurrent Neural Networks for video captioning has recently gained a
lot of attention, since they can be used both to encode the input video and to generate the …

TransNet V2: An effective deep network architecture for fast shot transition detection

T Souček, J Lokoč - arXiv preprint arXiv:2008.04838, 2020 - arxiv.org
Although automatic shot transition detection approaches are already investigated for more
than two decades, an effective universal human-level model was not proposed yet. Even for …

A novel key-frames selection framework for comprehensive video summarization

C Huang, H Wang - IEEE Transactions on Circuits and Systems …, 2019 - ieeexplore.ieee.org
Video summarization (VSUMM) has become a popular method in processing massive video
data. The key point of VSUMM is to select the key frames to represent the effective contents …

Generic event boundary detection: A benchmark for event segmentation

MZ Shou, SW Lei, W Wang… - Proceedings of the …, 2021 - openaccess.thecvf.com
This paper presents a novel task together with a new benchmark for detecting generic,
taxonomy-free event boundaries that segment a whole video into chunks. Conventional …

Automated Visual Content Analysis for Film Studies: Current Status and Challenges.

K Pustu-Iren, J Sittel, R Mauer… - DHQ: Digital …, 2020 - search.ebscohost.com
Lots of approaches for automated video analysis have been suggested since the 1990s,
which have the potential to support quantitative and qualitative analysis in film studies …

Ridiculously fast shot boundary detection with fully convolutional neural networks

M Gygli - 2018 International conference on content-based …, 2018 - ieeexplore.ieee.org
Shot boundary detection (SBD) is an important component of many video analysis tasks,
such as action recognition, video indexing, summarization and editing. Previous work …

NewsNet: A novel dataset for hierarchical temporal segmentation

H Wu, K Chen, H Liu, M Zhuge, B Li… - Proceedings of the …, 2023 - openaccess.thecvf.com
Temporal video segmentation is the get-to-go automatic video analysis, which decomposes
a long-form video into smaller components for the following-up understanding tasks. Recent …