Unsupervised video summarization via relation-aware assignment learning

J Dong, S Sun, Z Liu, S Chen, B Liu… - Proceedings of the AAAI …, 2023 - ojs.aaai.org

This paper targets unsupervised skeleton-based action representation learning and
proposes a new Hierarchical Contrast (HiCo) framework. Different from the existing …

Cited by 41 Related articles All 4 versions

[PDF] arxiv.org

Audiovisual video summarization

B Zhao, M Gong, X Li - IEEE Transactions on Neural Networks …, 2021 - ieeexplore.ieee.org

Audio and vision are two main modalities in video data. Multimodal learning, especially for
audiovisual learning, has drawn considerable attention recently, which can boost the …

Cited by 44 Related articles All 5 versions

Learning dual-routing capsule graph neural network for few-shot video classification

Y Feng, J Gao, C Xu - IEEE Transactions on Multimedia, 2022 - ieeexplore.ieee.org

Few-shot video classification (video FSL), which learns classifiers for novel concepts from
only a few samples, has gained increasing attention in the last few years. The existing …

Cited by 19 Related articles All 2 versions

[PDF] thecvf.com

Steps: Self-supervised key step extraction and localization from unlabeled procedural videos

A Shah, B Lundell, H Sawhney… - Proceedings of the …, 2023 - openaccess.thecvf.com

We address the problem of extracting key steps from unlabeled procedural videos,
motivated by the potential of Augmented Reality (AR) headsets to revolutionize job training …

Cited by 8 Related articles All 7 versions

[PDF] ieee.org

Condensing Video Content: Deep Learning Advancements and Challenges in Video Summarization Innovations

F Shamsi, I Sindhu - IEEE Access, 2025 - ieeexplore.ieee.org

With the rapid growth of social media platforms, the volume of video content on the internet
has increased exponentially. YouTube, the most popular social networking platform …

Spatiotemporal Orthogonal Projection Capsule Network for Incremental Few-Shot Action Recognition

Y Feng, J Gao, C Xu - IEEE Transactions on Multimedia, 2024 - ieeexplore.ieee.org

In this paper, we propose a new task named incremental few-shot action recognition
(IFSAR), which aims to learn new action classes incrementally with limited samples. Existing …

Cited by 4 Related articles

Spatial-temporal exclusive capsule network for open set action recognition

Y Feng, J Gao, S Yang, C Xu - IEEE Transactions on Multimedia, 2023 - ieeexplore.ieee.org

Open set action recognition (OSAR) is a rising research domain that simultaneously
identifies all videos from known classes and rejects videos from unknown classes. Existing …

Cited by 7 Related articles All 2 versions

[PDF] arxiv.org

Vinci: A Real-time Embodied Smart Assistant based on Egocentric Vision-Language Model

Y Huang, J Xu, B Pei, Y He, G Chen, L Yang… - arXiv preprint arXiv …, 2024 - arxiv.org

We introduce Vinci, a real-time embodied smart assistant built upon an egocentric vision-
language model. Designed for deployment on portable devices such as smartphones and …

Cited by 1 Related articles

Emotion knowledge driven video highlight detection

F Qi, X Yang, C Xu - IEEE Transactions on Multimedia, 2020 - ieeexplore.ieee.org

This paper addresses video highlight detection, which aims to select a small subset of frames
according to the user's major or special interest. The performances of conventional methods …

Cited by 21 Related articles

Learning scene-aware spatio-temporal GNNs for few-shot early action prediction

Y Hu, J Gao, C Xu - IEEE Transactions on Multimedia, 2022 - ieeexplore.ieee.org

We aim to address a new task named few-shot early action prediction (FS-EAP) that learns
classifiers for novel actions from only a few partially observed videos. We argue that the task …

Cited by 12 Related articles All 2 versions
