Related articles - Academic resource search

Dynamic normalization and relay for video action recognition

D Cai, A Yao, Y Chen - Advances in neural information …, 2021 - proceedings.neurips.cc

Convolutional Neural Networks (CNNs) have been the dominant model for video
action recognition. Due to the huge memory and compute demand, popular action …

Cited by 5 · Related articles · All 6 versions

[PDF] neurips.cc

More is less: Learning efficient video representations by big-little network and depthwise temporal aggregation

Q Fan, CFR Chen, H Kuehne… - Advances in Neural …, 2019 - proceedings.neurips.cc

Current state-of-the-art models for video action recognition are mostly based on expensive
3D ConvNets. This results in a need for large GPU clusters to train and evaluate such …

Cited by 143 · Related articles · All 9 versions

[PDF] thecvf.com

Deep analysis of cnn-based spatio-temporal representations for action recognition

CFR Chen, R Panda… - Proceedings of the …, 2021 - openaccess.thecvf.com

In recent years, a number of approaches based on 2D or 3D convolutional neural networks
(CNN) have emerged for video action recognition, achieving state-of-the-art results on …

Cited by 117 · Related articles · All 8 versions

[PDF] thecvf.com

A large-scale robustness analysis of video action recognition models

MC Schiappa, N Biyani, P Kamtam… - Proceedings of the …, 2023 - openaccess.thecvf.com

We have seen great progress in video action recognition in recent years. There are several
models based on convolutional neural network (CNN) and some recent transformer based …

Cited by 17 · Related articles · All 6 versions

[PDF] arxiv.org

Mitigating representation bias in action recognition: Algorithms and benchmarks

H Duan, Y Zhao, K Chen, Y Xiong, D Lin - European Conference on …, 2022 - Springer

Deep learning models have achieved excellent recognition results on large-scale video
benchmarks. However, they perform poorly when applied to videos with rare scenes or …

Cited by 4 · Related articles · All 6 versions

[PDF] github.io

Mfi: Multi-range feature interchange for video action recognition

S Bai, Q Wang, X Li - 2020 25th International Conference on …, 2021 - ieeexplore.ieee.org

Short-range motion features and long-range dependencies are two complementary and vital
cues for action recognition in videos, but it remains unclear how to efficiently and effectively …

Cited by 7 · Related articles · All 6 versions

[PDF] thecvf.com

Gate-shift networks for video action recognition

S Sudhakaran, S Escalera… - Proceedings of the IEEE …, 2020 - openaccess.thecvf.com

Deep 3D CNNs for video action recognition are designed to learn powerful representations
in the joint spatio-temporal feature space. In practice however, because of the large number …

Cited by 185 · Related articles · All 14 versions

[PDF] arxiv.org

Video jigsaw: Unsupervised learning of spatiotemporal context for video action recognition

U Ahsan, R Madhok, I Essa - 2019 IEEE Winter Conference on …, 2019 - ieeexplore.ieee.org

We propose a self-supervised learning method to jointly reason about spatial and temporal
context for video recognition. Recent self-supervised approaches have used spatial context …

Cited by 131 · Related articles · All 4 versions

[PDF] arxiv.org

Multi-task learning of generalizable representations for video action recognition

Z Yao, Y Wang, M Long, J Wang… - … on Multimedia and …, 2020 - ieeexplore.ieee.org

In classic video action recognition, labels may not contain enough information about the
diverse video appearance and dynamics, thus, existing models that are trained under the …

Cited by 5 · Related articles · All 6 versions

[PDF] thecvf.com

Stm: Spatiotemporal and motion encoding for action recognition

B Jiang, MM Wang, W Gan, W Wu… - Proceedings of the …, 2019 - openaccess.thecvf.com

Spatiotemporal and motion features are two complementary and crucial information for
video action recognition. Recent state-of-the-art methods adopt a 3D CNN stream to learn …

Cited by 499 · Related articles · All 6 versions
