MAR: Masked autoencoders for efficient action recognition

Z Qing, S Zhang, Z Huang, X Wang… - IEEE Transactions …, 2023 - ieeexplore.ieee.org
Standard approaches for video action recognition usually operate on full input videos, which
is inefficient due to the widespread spatio-temporal redundancy in videos. The recent …

Twinformer: Fine-to-coarse temporal modeling for long-term action recognition

J Zhou, KY Lin, YK Qiu… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
The long-term action in untrimmed video generally contains multiple sub-actions, among
which various semantic patterns exist (e.g., the co-occurrence or sequentiality between sub …

ViGAT: Bottom-up event recognition and explanation in video using factorized graph attention network

N Gkalelis, D Daskalakis, V Mezaris - IEEE Access, 2022 - ieeexplore.ieee.org
In this paper a pure-attention bottom-up approach, called ViGAT, that utilizes an object
detector together with a Vision Transformer (ViT) backbone network to derive object and …

Gated-ViGAT: Efficient bottom-up event recognition and explanation using a new frame selection policy and gating mechanism

N Gkalelis, D Daskalakis… - 2022 IEEE International …, 2022 - ieeexplore.ieee.org
In this paper, Gated-ViGAT, an efficient approach for video event recognition, utilizing
bottom-up (object) information, a new frame sampling policy and a gating mechanism is proposed …

A BERT-Based Joint Channel-Temporal Modeling for Action Recognition

M Yang, L Gan, R Cao, X Li - IEEE Sensors Journal, 2023 - ieeexplore.ieee.org
Action recognition provides an application for human action classification utilizing datasets
captured by various sensor cameras. However, how to capture the key semantic features …

Recognizing Video Activities in the Wild via View-to-Scene Joint Learning

J Yu, Y Chen, X Wang, X Cheng… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Recognizing video actions in the wild is challenging for visual control systems. In-the-wild
videos show actions not seen in training data, recorded from various angles and scenes with …

Differential motion attention network for efficient action recognition

C Liu, F Gu - The Visual Computer, 2024 - Springer
Despite the great progress achieved by commonly used 3D CNNs and two-stream
methods in action recognition, they cause a heavy computational burden, which is inefficient …