ActionCLIP: Adapting Language-Image Pretrained Models for Video Action Recognition

M Wang, J Xing, J Mei, Y Liu… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
The canonical approach to video action recognition dictates a neural network model to do a
classic and standard 1-of-N majority vote task. They are trained to predict a fixed set of …

ActionCLIP: Adapting Language-Image Pretrained Models for Video Action Recognition

M Wang, J Xing, J Mei, Y Liu… - IEEE Transactions on … - pubmed.ncbi.nlm.nih.gov