Related articles - Academic Resource Search

Video transformer network

D Neimark, O Bar, M Zohar… - Proceedings of the …, 2021 - openaccess.thecvf.com

This paper presents VTN, a transformer-based framework for video recognition. Inspired by
recent developments in vision transformers, we ditch the standard approach in video action …

Cited by 506 - Related articles - All 9 versions

[PDF] arxiv.org

Temporal segment networks for action recognition in videos

L Wang, Y Xiong, Z Wang, Y Qiao, D Lin… - IEEE transactions on …, 2018 - ieeexplore.ieee.org

We present a general and flexible video-level framework for learning action models in
videos. This method, called temporal segment network (TSN), aims to model long-range …

Cited by 877 - Related articles - All 13 versions

[PDF] thecvf.com

TDN: Temporal difference networks for efficient action recognition

L Wang, Z Tong, B Ji, G Wu - Proceedings of the IEEE/CVF …, 2021 - openaccess.thecvf.com

Temporal modeling still remains challenging for action recognition in videos. To mitigate this
issue, this paper presents a new video architecture, termed as Temporal Difference Network …

Cited by 429 - Related articles - All 8 versions

Going deeper with two-stream ConvNets for action recognition in video surveillance

Y Han, P Zhang, T Zhuo, W Huang, Y Zhang - Pattern Recognition Letters, 2018 - Elsevier

Learning by deep convolutional networks have shown an outstanding effectiveness in a
variety of vision based classification tasks, and for which, large datasets are the …

Cited by 88 - Related articles - All 3 versions

[PDF] thecvf.com

MM-ViT: Multi-modal video transformer for compressed video action recognition

J Chen, CM Ho - Proceedings of the IEEE/CVF winter …, 2022 - openaccess.thecvf.com

This paper presents a pure transformer-based approach, dubbed the Multi-Modal Video
Transformer (MM-ViT), for video action recognition. Different from other schemes which …

Cited by 102 - Related articles - All 6 versions

TEN: temporal excitation network for video action recognition

D Sun, Z He, B Luo, Z Ding - International Conference on …, 2023 - spiedigitallibrary.org

Temporal modeling has attracted the attention of a large number of researchers in the past
few years. In this work, we propose a new video architecture, termed as Temporal Excitation …

Diverse features fusion network for video-based action recognition

H Deng, J Kong, M Jiang, T Liu - Journal of Visual Communication and …, 2021 - Elsevier

The two-stream convolutional network has been proved to be one milestone in the study of
video-based action recognition. Lots of recent works modify internal structure of two-stream …

Cited by 7 - Related articles

MV2Flow: Learning motion representation for fast compressed video action recognition

H Hu, W Zhou, X Li, N Yan, H Li - ACM Transactions on Multimedia …, 2020 - dl.acm.org

In video action recognition, motion is a very crucial clue, which is usually represented by
optical flow. However, optical flow is computationally expensive to obtain, which becomes …

Cited by 20 - Related articles

[PDF] mdpi.com

Optimal Topology of Vision Transformer for Real-Time Video Action Recognition in an End-To-End Cloud Solution

S Sarraf, M Kabia - Machine Learning and Knowledge Extraction, 2023 - mdpi.com

This study introduces an optimal topology of vision transformers for real-time video action
recognition in a cloud-based solution. Although model performance is a key criterion for real …

Cited by 2 - Related articles - All 3 versions

[PDF] unimore.it

Towards practical compressed video action recognition: A temporal enhanced multi-stream network

B Li, L Kong, D Zhang, X Bao… - … conference on pattern …, 2021 - ieeexplore.ieee.org

Current compressed video action recognition methods are mainly based on complete data.
However, in a real transmission scenario, the compressed video packets are usually …

Cited by 12 - Related articles - All 4 versions
