Z Zheng, G An, D Wu, Q Ruan - Neurocomputing, 2019 - Elsevier
Convolutional Neural Networks (CNNs) usually use top-level appearance features of video frames for action recognition. However, these methods discard the implicit …
We present a general and flexible video-level framework for learning action models in videos. This method, called temporal segment network (TSN), aims to model long-range …
M Zhang, Y Yang, Y Ji, N Xie, F Shen - Signal Processing, 2018 - Elsevier
Action recognition in videos, which contains many complex and semantic contents, is still a challenging task in computer vision research. In this paper, we propose a novel attention …
Y Li, B Ji, X Shi, J Zhang, B Kang… - Proceedings of the …, 2020 - openaccess.thecvf.com
Temporal modeling is key for action recognition in videos. It normally considers both short-range motions and long-range aggregations. In this paper, we propose a Temporal …
Spatiotemporal and motion features are two complementary and crucial information for video action recognition. Recent state-of-the-art methods adopt a 3D CNN stream to learn …
Action recognition in videos is a challenging task due to the complexity of the spatio-temporal patterns to model and the difficulty to acquire and learn on large quantities of video …
W Du, Y Wang, Y Qiao - IEEE Transactions on Image …, 2017 - ieeexplore.ieee.org
Recent years have witnessed the popularity of using recurrent neural network (RNN) for action recognition in videos. However, videos are of high dimensionality and contain rich …
W Dong, Z Zhang, T Tan - Proceedings of the AAAI Conference on Artificial …, 2019 - aaai.org
Deep learning based methods have achieved remarkable progress in action recognition. Existing works mainly focus on designing novel deep architectures to achieve video …
YD Zheng, Z Liu, T Lu, L Wang - IEEE Transactions on Image …, 2020 - ieeexplore.ieee.org
The existing action recognition methods are mainly based on clip-level classifiers such as two-stream CNNs or 3D CNNs, which are trained from the randomly selected clips and …