Humans view the world through many sensory channels, e.g., the long-wavelength light channel, viewed by the left eye, or the high-frequency vibrations channel, heard by the right …
Research on depth-based human activity analysis achieved outstanding performance and demonstrated the effectiveness of 3D representation for action recognition. The existing …
L Jing, Y Tian - IEEE transactions on pattern analysis and …, 2020 - ieeexplore.ieee.org
Large-scale labeled data are generally required to train deep neural networks in order to obtain better performance in visual feature learning from images or videos for computer …
In this work, we propose a Cross-view Contrastive Learning framework for unsupervised 3D skeleton-based action representation (CrosSCLR), by leveraging multi-view complementary …
D Xu, J Xiao, Z Zhao, J Shao, D Xie… - Proceedings of the …, 2019 - openaccess.thecvf.com
We propose a self-supervised spatiotemporal learning technique which leverages the chronological order of videos. Our method can learn the spatiotemporal representation of …
Human activity recognition has been actively studied in the last three decades. Compared to human action performed by a single person, human interaction is more complex due to the …
Video action recognition is one of the representative tasks for video understanding. Over the last decade, we have witnessed great advancements in video action recognition thanks to …
VL Guen, N Thome - … of the IEEE/CVF conference on …, 2020 - openaccess.thecvf.com
Leveraging physical knowledge described by partial differential equations (PDEs) is an appealing way to improve unsupervised video forecasting models. Since physics is too …
S Yang, J Liu, S Lu, MH Er… - Proceedings of the IEEE …, 2021 - openaccess.thecvf.com
Skeleton-based human action recognition has attracted increasing attention in recent years. However, most of the existing works focus on supervised learning, which requires a large …