An outlook into the future of egocentric vision

C Plizzari, G Goletto, A Furnari, S Bansal… - International Journal of …, 2024 - Springer
What will the future be? We wonder! In this survey, we explore the gap between current
research in egocentric vision and the ever-anticipated future, where wearable computing …

Rethinking CLIP-based Video Learners in Cross-Domain Open-Vocabulary Action Recognition

KY Lin, H Ding, J Zhou, YX Peng, Z Zhao… - arXiv preprint arXiv …, 2024 - arxiv.org
Contrastive Language-Image Pretraining (CLIP) has shown remarkable open-vocabulary
abilities across various image understanding tasks. Building upon this impressive success …
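For context on the mechanism this entry builds on: CLIP performs open-vocabulary recognition by embedding an image and a set of free-form label prompts into a shared space and ranking labels by cosine similarity. A minimal zero-shot sketch using the Hugging Face transformers API (the checkpoint name, frame path, and label set are illustrative, not from the paper):

    from PIL import Image
    import torch
    from transformers import CLIPModel, CLIPProcessor

    model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
    processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

    image = Image.open("frame.jpg")  # e.g. a single video frame
    labels = ["cutting a vegetable", "washing hands", "pouring water"]
    inputs = processor(text=[f"a photo of {l}" for l in labels],
                       images=image, return_tensors="pt", padding=True)
    with torch.no_grad():
        out = model(**inputs)
    # logits_per_image holds image-text similarity scores; softmax ranks the labels
    probs = out.logits_per_image.softmax(dim=-1)
    print(dict(zip(labels, probs[0].tolist())))

Because the label set is plain text, new action categories can be added at inference time, which is what makes cross-domain open-vocabulary action recognition feasible in the first place.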

A Backpack Full of Skills: Egocentric Video Understanding with Diverse Task Perspectives

SA Peirone, F Pistilli, A Alliegro… - Proceedings of the …, 2024 - openaccess.thecvf.com
Human comprehension of a video stream is naturally broad: in a few instants we are able to
understand what is happening, the relevance and relationship of objects, and forecast what …

A survey on deep learning techniques for action anticipation

Z Zhong, M Martin, M Voit, J Gall, J Beyerer - arXiv preprint arXiv …, 2023 - arxiv.org
The ability to anticipate possible future human actions is essential for a wide range of
applications, including autonomous driving and human-robot interaction. Consequently …

What does CLIP know about peeling a banana?

C Cuttano, G Rosi, G Trivigno… - Proceedings of the …, 2024 - openaccess.thecvf.com
Humans show an innate capability to identify tools to support specific actions. The
association between object parts and the actions they facilitate is usually named …

Human-Centric Transformer for Domain Adaptive Action Recognition

KY Lin, J Zhou, WS Zheng - IEEE Transactions on Pattern …, 2024 - ieeexplore.ieee.org
We study the domain adaptation task for action recognition, namely domain adaptive action
recognition, which aims to effectively transfer action recognition power from a label-sufficient …

Are Synthetic Data Useful for Egocentric Hand-Object Interaction Detection? An Investigation and the HOI-Synth Domain Adaptation Benchmark

R Leonardi, A Furnari, F Ragusa… - arXiv preprint arXiv …, 2023 - arxiv.org
In this study, we investigate the effectiveness of synthetic data in enhancing hand-object
interaction detection within the egocentric vision domain. We introduce a simulator able to …

AFF-ttention! Affordances and Attention models for Short-Term Object Interaction Anticipation

L Mur-Labadia, R Martinez-Cantin, J Guerrero… - arXiv preprint arXiv …, 2024 - arxiv.org
Short-Term object-interaction Anticipation consists of detecting the location of the next-active
objects, the noun and verb categories of the interaction, and the time to contact from the …

EgoNCE++: Do Egocentric Video-Language Models Really Understand Hand-Object Interactions?

B Xu, Z Wang, Y Du, S Zheng, Z Song, Q Jin - arXiv preprint arXiv …, 2024 - arxiv.org
Egocentric video-language pretraining is a crucial paradigm to advance the learning of
egocentric hand-object interactions (EgoHOI). Despite the great success on existing …
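For context, pretraining objectives in this line of work (EgoNCE and its variants) build on the symmetric InfoNCE contrastive loss over paired video and text embeddings. A minimal PyTorch sketch of the generic base loss, not the paper's EgoNCE++ objective:

    import torch
    import torch.nn.functional as F

    def symmetric_info_nce(video_emb, text_emb, temperature=0.07):
        """Generic symmetric InfoNCE over N matched video/text pairs."""
        v = F.normalize(video_emb, dim=-1)       # (N, D) unit-norm video features
        t = F.normalize(text_emb, dim=-1)        # (N, D) unit-norm text features
        logits = v @ t.T / temperature           # (N, N) similarity matrix
        targets = torch.arange(len(v), device=v.device)  # diagonal entries are positives
        return (F.cross_entropy(logits, targets)
                + F.cross_entropy(logits.T, targets)) / 2

EgoNCE-style variants chiefly change how positives and negatives are sampled (e.g. grouping clips that share nouns or verbs), which is where hand-object interaction semantics enter the objective.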

Multimodal Cross-Domain Few-Shot Learning for Egocentric Action Recognition

M Hatano, R Hachiuma, R Fuji, H Saito - arXiv preprint arXiv:2405.19917, 2024 - arxiv.org
We address a novel cross-domain few-shot learning task (CD-FSL) with multimodal input
and unlabeled target data for egocentric action recognition. This paper simultaneously …