End-to-end learning of motion representation for video understanding

L Fan, W Huang, C Gan, S Ermon… - Proceedings of the …, 2018 - openaccess.thecvf.com
Despite the recent success of end-to-end learned representations, hand-crafted optical flow
features are still widely used in video analysis tasks. To fill this gap, we propose TVNet, a …

Enhancing Human Activity Recognition Using Neural Networks in Video Classification

AJ SA, K Dey - Available at SSRN 4588199 - papers.ssrn.com
This study explores two advanced deep learning
methodologies, specifically, Convolutional Long Short-Term Memory (ConvLSTM) networks …

A Novel Video Understanding Network Based on Poolformer and Transformer

S Shu, H Yu, J Yu - Proceedings of the 7th International Conference on …, 2023 - dl.acm.org
This paper introduces a new video understanding network referred to as 'ViViP' (Video Vision
Poolformer), which leverages poolformer and transformer techniques. It begins by encoding …

Slowfast networks for video recognition

C Feichtenhofer, H Fan, J Malik… - Proceedings of the IEEE …, 2019 - openaccess.thecvf.com
We present SlowFast networks for video recognition. Our model involves (i) a Slow pathway,
operating at low frame rate, to capture spatial semantics, and (ii) a Fast pathway, operating …

Video jigsaw: Unsupervised learning of spatiotemporal context for video action recognition

U Ahsan, R Madhok, I Essa - 2019 IEEE Winter Conference on …, 2019 - ieeexplore.ieee.org
We propose a self-supervised learning method to jointly reason about spatial and temporal
context for video recognition. Recent self-supervised approaches have used spatial context …

Look more but care less in video recognition

Y Zhang, Y Bai, H Wang, Y Xu… - Advances in Neural …, 2022 - proceedings.neurips.cc
Existing action recognition methods typically sample a few frames to represent each video to
avoid the enormous computation, which often limits the recognition performance. To tackle …

Towards Efficient and Effective Representation Learning for Image and Video Understanding

T Yang - 2023 - stars.library.ucf.edu
Deep learning has achieved tremendous success on various computer vision tasks.
However, deep learning methods and models are usually computationally expensive …

A multi-resolution fusion approach for human activity recognition from video data in tiny edge devices

S Nooruddin, MM Islam, F Karray, G Muhammad - Information Fusion, 2023 - Elsevier
Human Activity Recognition (HAR) is the process of automatic recognition of
Activities of Daily Living (ADL) from human motion data captured in various data modalities …

[CITATION][C] Corrections to “three-stream network with bidirectional self-attention for action recognition in extreme low resolution videos”

D Purwanto, RRA Pramono, YT Chen… - IEEE Signal …, 2020 - ieeexplore.ieee.org
Corrections to “Three-Stream Network With Bidirectional Self-Attention for Action
Recognition in Extreme Low Resolution Videos.” IEEE Signal Processing Letters …

Shuffle-invariant network for action recognition in videos

Q Shi, HB Zhang, Z Li, JX Du, Q Lei, JH Liu - ACM Transactions on …, 2022 - dl.acm.org
The local key features in video are important for improving the accuracy of human action
recognition. However, most end-to-end methods focus on global feature learning from …