Attention-based temporal weighted convolutional neural network for action recognition

J Zang, L Wang, Z Liu, Q Zhang, G Hua… - … and Innovations: 14th IFIP …, 2018 - Springer
Research in human action recognition has accelerated significantly since the introduction of
powerful machine learning tools such as Convolutional Neural Networks (CNNs). However …

More is less: Learning efficient video representations by big-little network and depthwise temporal aggregation

Q Fan, CFR Chen, H Kuehne… - Advances in Neural …, 2019 - proceedings.neurips.cc
Current state-of-the-art models for video action recognition are mostly based on expensive
3D ConvNets. This results in a need for large GPU clusters to train and evaluate such …

Embedding sequential information into spatiotemporal features for action recognition

Y Ye, Y Tian - Proceedings of the IEEE conference on computer …, 2016 - cv-foundation.org
In this paper, we introduce a novel framework for video-based action recognition,
which …

Two-stream convolutional networks for action recognition in videos

K Simonyan, A Zisserman - Advances in neural information …, 2014 - proceedings.neurips.cc
We investigate architectures of discriminatively trained deep Convolutional Networks
(ConvNets) for action recognition in video. The challenge is to capture the complementary …

Real-time action recognition with deeply transferred motion vector CNNs

B Zhang, L Wang, Z Wang, Y Qiao… - IEEE Transactions on …, 2018 - ieeexplore.ieee.org
The two-stream CNNs prove very successful for video-based action recognition. However,
the classical two-stream CNNs are time costly, mainly due to the bottleneck of calculating …

Second-order temporal pooling for action recognition

A Cherian, S Gould - International Journal of Computer Vision, 2019 - Springer
Deep learning models for video-based action recognition usually generate features for short
clips (consisting of a few frames); such clip-level features are aggregated to video-level …

Improving action recognition via temporal and complementary learning

NE Elmadany, Y He, L Guan - ACM Transactions on Intelligent Systems …, 2021 - dl.acm.org
In this article, we study the problem of video-based action recognition. We improve the
action recognition performance by finding an effective temporal and appearance …

Unified spatio-temporal attention networks for action recognition in videos

D Li, T Yao, LY Duan, T Mei… - IEEE Transactions on …, 2018 - ieeexplore.ieee.org
Recognizing actions in videos is not a trivial task because video is an information-intensive
media and includes multiple modalities. Moreover, on each modality, an action may only …

Two-stream convolution neural network with video-stream for action recognition

W Dai, Y Chen, C Huang, M Gao… - 2019 International Joint …, 2019 - ieeexplore.ieee.org
Recently, as the application of the convolutional neural network in artificial intelligence is
becoming increasingly diversified, a growing number of neural network methods are put …

Distinct two-stream convolutional networks for human action recognition in videos using segment-based temporal modeling

A Sarabu, AK Santra - Data, 2020 - mdpi.com
The Two-stream convolution neural network (CNN) has proven a great success in action
recognition in videos. The main idea is to train the two CNNs in order to learn spatial and …