A temporal order modeling approach to human action recognition from multimodal sensor data

J Liu, S Song, C Liu, Y Li, Y Hu - ACM Transactions on Multimedia …, 2020 - dl.acm.org

Large-scale benchmarks provide a solid foundation for the development of action analytics.
Most of the previous activity benchmarks focus on analyzing actions in RGB videos. There is …

Cited by 74 · Related articles · All 2 versions

[PDF] github.io

Online early-late fusion based on adaptive hmm for sign language recognition

D Guo, W Zhou, H Li, M Wang - ACM Transactions on Multimedia …, 2017 - dl.acm.org

In sign language recognition (SLR) with multimodal data, a sign word can be represented by
multiply features, for which there exist an intrinsic property and a mutually complementary …

Cited by 77 · Related articles · All 3 versions

[PDF] ieee.org

Multimodal fusion based on LSTM and a couple conditional hidden Markov model for Chinese sign language recognition

Q Xiao, M Qin, P Guo, Y Zhao - IEEE Access, 2019 - ieeexplore.ieee.org

A novel multimodal fusion approach is proposed for Chinese sign language (CSL)
recognition. This framework, the LSTM2+ CHMM model, uses dual long short-term memory …

Cited by 44 · Related articles · All 4 versions

Egocentric early action prediction via adversarial knowledge distillation

N Zheng, X Song, T Su, W Liu, Y Yan… - ACM Transactions on …, 2023 - dl.acm.org

Egocentric early action prediction aims to recognize actions from the first-person view by
only observing a partial video segment, which is challenging due to the limited context …

Cited by 15 · Related articles

Knowledge-driven egocentric multimodal activity recognition

Y Huang, X Yang, J Gao, J Sang, C Xu - ACM Transactions on …, 2020 - dl.acm.org

Recognizing activities from egocentric multimodal data collected by wearable cameras and
sensors, is gaining interest, as multimodal methods always benefit from the complementarity …

Cited by 27 · Related articles

A differentiable parallel sampler for efficient video classification

X Wang, L Zhu, F Wu, Y Yang - ACM Transactions on Multimedia …, 2023 - dl.acm.org

It is crucial to sample a small portion of relevant frames for efficient video classification. The
existing methods mainly develop hand-designed sampling strategies or learn sequential …

Cited by 8 · Related articles · All 2 versions

Joint transferable dictionary learning and view adaptation for multi-view human action recognition

B Sun, D Kong, S Wang, L Wang, B Yin - ACM Transactions on …, 2021 - dl.acm.org

Multi-view human action recognition remains a challenging problem due to large view
changes. In this article, we propose a transfer learning-based framework called transferable …

Cited by 11 · Related articles

Multi-View Representation Learning via View-Aware Modulation

R Wang, H Sun, X Nie, Y Lin, X Xi, Y Yin - Proceedings of the 31st ACM …, 2023 - dl.acm.org

Multi-view (representation) learning derives an entity's representation from its multiple
observable views to facilitate various downstream tasks. The most challenging topic is how …

Cited by 1 · Related articles

Multimodal Score Fusion with Sparse Low-rank Bilinear Pooling for Egocentric Hand Action Recognition

K Roy - ACM Transactions on Multimedia Computing …, 2024 - dl.acm.org

With the advent of egocentric cameras, there are new challenges where traditional computer
vision is not sufficient to handle this kind of video. Moreover, egocentric cameras often offer …

Cited by 2 · Related articles

[PDF] aaai.org

Learning to adaptively scale recurrent neural networks

H Hu, L Wang, GJ Qi - Proceedings of the AAAI Conference on Artificial …, 2019 - aaai.org

Recent advancements in recurrent neural network (RNN) research have demonstrated the
superiority of utilizing multiscale structures in learning temporal representations of time …

Cited by 11 · Related articles · All 11 versions
