Spatiotemporal and motion features are two complementary and crucial information for video action recognition. Recent state-of-the-art methods adopt a 3D CNN stream to learn …
A Mihanpour, MJ Rashti, SE Alavi - 2020 6th International …, 2020 - ieeexplore.ieee.org
Human action recognition in video is one of the most widely applied topics in the field of image and video processing, with many applications in surveillance (security, sports, etc.) …
A Zhou, Y Ma, W Ji, M Zong, P Yang, M Wu, M Liu - Multimedia Systems, 2023 - Springer
Recent years have witnessed the popularity of using two-stream convolutional neural networks for action recognition. However, existing two-stream convolutional neural network …
T Yu, C Guo, L Wang, H Gu, S Xiang, C Pan - Pattern Recognition Letters, 2018 - Elsevier
In this paper, we propose a novel high-level action representation using joint spatial- temporal attention model, with application to video-based human action recognition …
J Yu, H Gao, W Yang, Y Jiang, W Chin, N Kubota… - IEEE …, 2020 - ieeexplore.ieee.org
Activity recognition which aims to accurately distinguish human actions in complex environments plays a key role in human-robot/computer interaction. However, long-lasting …
J Chen, Y Xu, C Zhang, Z Xu, X Meng… - … on automation and …, 2019 - ieeexplore.ieee.org
In order to obtain global contextual information precisely from videos with heavy camera motions and scene changes, this study proposes an improved spatiotemporal two-stream …
This paper addresses the issue of video-based action recognition by exploiting an advanced multistream convolutional neural network (CNN) to fully use semantics-derived multiple …
A Sarabu, AK Santra - Emerging Science Journal, 2021 - research.vit.ac.in
Two-stream convolutional networks plays an essential role as a powerful feature extractor in human action recognition in videos. Recent studies have shown the importance of two …
Recently, deep convolutional networks (ConvNets) have achieved remarkable progress for action recognition in videos. Most existing deep frameworks treat a video as an unordered …