What can a cook in Italy teach a mechanic in India? Action Recognition Generalisation Over Scenarios and Locations

C Plizzari, T Perrett, B Caputo… - Proceedings of the …, 2023 - openaccess.thecvf.com
We propose and address a new generalisation problem: can a model trained for action
recognition successfully classify actions when they are performed within a previously …

Mitigating representation bias in action recognition: Algorithms and benchmarks

H Duan, Y Zhao, K Chen, Y Xiong, D Lin - European Conference on …, 2022 - Springer
Deep learning models have achieved excellent recognition results on large-scale video
benchmarks. However, they perform poorly when applied to videos with rare scenes or …

NUTA: Non-uniform temporal aggregation for action recognition

X Li, C Liu, B Shuai, Y Zhu, H Chen… - Proceedings of the …, 2022 - openaccess.thecvf.com
In the world of action recognition research, one primary focus has been on how to construct
and train networks to model the spatial-temporal volume of an input video. These methods …

How can objects help action recognition?

X Zhou, A Arnab, C Sun… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Current state-of-the-art video models process a video clip as a long sequence of spatio-temporal
tokens. However, they do not explicitly model objects, their interactions across the …

Interact before align: Leveraging cross-modal knowledge for domain adaptive action recognition

L Yang, Y Huang, Y Sugano… - Proceedings of the IEEE …, 2022 - openaccess.thecvf.com
Unsupervised domain adaptive video action recognition aims to recognize actions of a
target domain using a model trained with only out-of-domain (source) annotations. The …

LgNet: A local-global network for action recognition and beyond

J Zhou, Z Fu, Q Huang, Q Liu… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
This work addresses the task of action recognition in video sequences. In real world
applications, this task is quite challenging due to the complex background of video content …

Multitask learning to improve egocentric action recognition

G Kapidis, R Poppe, E Van Dam… - Proceedings of the …, 2019 - openaccess.thecvf.com
In this work we employ multitask learning to capitalize on the structure that exists in related
supervised tasks to train complex neural networks. It allows training a network for multiple …

ActionCLIP: Adapting language-image pretrained models for video action recognition

M Wang, J Xing, J Mei, Y Liu… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
The canonical approach to video action recognition dictates a neural network model to do a
classic and standard 1-of-N majority vote task. They are trained to predict a fixed set of …

A large-scale study of spatiotemporal representation learning with a new benchmark on action recognition

A Deng, T Yang, C Chen - Proceedings of the IEEE/CVF …, 2023 - openaccess.thecvf.com
The goal of building a benchmark (suite of datasets) is to provide a unified protocol for fair
evaluation and thus facilitate the evolution of a specific area. Nonetheless, we point out that …

Action2Vec: A crossmodal embedding approach to action learning

M Hahn, A Silva, JM Rehg - arXiv preprint arXiv:1901.00484, 2019 - arxiv.org
We describe a novel cross-modal embedding space for actions, named Action2Vec, which
combines linguistic cues from class labels with spatio-temporal features derived from video …