Authors
Shunli Wang, Dingkang Yang, Peng Zhai, Chixiao Chen, Lihua Zhang
Publication date
2021/10/17
Conference
ACM International Conference on Multimedia (ACM MM)
Description
In recent years, assessing action quality from videos has attracted growing attention in the computer vision and human-computer interaction communities. Most existing approaches tackle this problem by directly migrating models from action recognition tasks, which ignores intrinsic differences within the feature map, such as foreground and background information. To address this issue, we propose a Tube Self-Attention Network (TSA-Net) for action quality assessment (AQA). Specifically, we introduce a single object tracker into AQA and propose the Tube Self-Attention Module (TSA), which can efficiently generate rich spatio-temporal contextual information by adopting sparse feature interactions. The TSA module is embedded in existing video networks to form TSA-Net. Overall, TSA-Net has the following merits: 1) high computational efficiency, 2) high flexibility, and 3) the state-of-the-art …
Scholar articles
S Wang, D Yang, P Zhai, C Chen, L Zhang - Proceedings of the 29th ACM international conference …, 2021