Deep CNN object features for improved action recognition in low quality videos

S Rahman, J See, CC Ho - Advanced Science Letters, 2017 - ingentaconnect.com
Human action recognition from low quality video remains a challenging task for the action recognition community. Recent state-of-the-art methods such as space-time interest points (STIP) use shape and motion features to characterize actions. However, STIP features are over-reliant on video quality and lack robust object semantics. This paper harnesses the robustness of deeply learned object features from off-the-shelf convolutional neural network (CNN) models to improve action recognition under low quality conditions. A two-channel framework is proposed that aggregates shape and motion features extracted using the STIP detector with frame-level object features obtained from the final few layers (i.e., FC6, FC7, and the softmax layer) of a state-of-the-art image-trained CNN model. Experimental results on low quality versions of two publicly available datasets, UCF-11 and HMDB51, show that using CNN object features together with conventional shape and motion features greatly improves action recognition performance in low quality videos.
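To make the two-channel idea concrete, the sketch below extracts frame-level object features from a pretrained image CNN and concatenates them with a precomputed STIP bag-of-words histogram. This is a minimal illustration under stated assumptions: VGG-16 stands in for the unnamed off-the-shelf CNN, only the FC7 layer is tapped (the paper also uses FC6 and the softmax layer), and the mean-pooling over frames plus the helpers `fc7_features` and `two_channel_descriptor` are hypothetical, not the paper's exact pipeline.

```python
# Sketch of the two-channel descriptor: CNN object features + STIP features.
# Assumes PyTorch/torchvision with a pretrained VGG-16 (illustrative choice).
import torch
import torchvision.models as models
import torchvision.transforms as T

# Off-the-shelf image-trained CNN; we tap its fully connected layers.
vgg = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1).eval()

preprocess = T.Compose([
    T.Resize((224, 224)),
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

@torch.no_grad()
def fc7_features(frames):
    """frames: list of PIL images sampled from one video -> (4096,) tensor."""
    batch = torch.stack([preprocess(f) for f in frames])
    x = vgg.features(batch)
    x = vgg.avgpool(x).flatten(1)
    # vgg.classifier = [FC6, ReLU, Dropout, FC7, ReLU, Dropout, FC8];
    # stop after FC7 + ReLU to get the penultimate object representation.
    for layer in vgg.classifier[:5]:
        x = layer(x)
    return x.mean(dim=0)  # mean-pool frame-level features over the video

def two_channel_descriptor(frames, stip_bow):
    """Concatenate CNN object features with a STIP bag-of-words histogram.
    `stip_bow` is assumed to be a precomputed 1-D tensor; STIP detection
    itself is outside the scope of this sketch."""
    return torch.cat([fc7_features(frames), stip_bow])
```

In such a setup, a standard classifier (e.g., a linear SVM) would then be trained on the per-video concatenated descriptors; the fusion strategy shown here is one plausible reading of the aggregation described in the abstract.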