Combining embedded accelerometers with computer vision for recognizing food preparation activities

Z Xing, Q Feng, H Chen, Q Dai, H Hu, H Xu… - ACM Computing …, 2024 - dl.acm.org

The recent wave of AI-generated content (AIGC) has witnessed substantial success in
computer vision, with the diffusion model playing a crucial role in this achievement. Due to …

Cited by 90 · Related articles · All 3 versions

A survey on food computing

W Min, S Jiang, L Liu, Y Rui, R Jain - ACM Computing Surveys (CSUR), 2019 - dl.acm.org

Food is essential for human life and it is fundamental to the human experience. Food-related
study may support multifarious applications and services, such as guiding human behavior …

Cited by 389 · Related articles · All 13 versions

Vision-based human activity recognition: a survey

DR Beddiar, B Nini, M Sabokrou, A Hadid - Multimedia Tools and …, 2020 - Springer

Human activity recognition (HAR) systems attempt to automatically identify and analyze
human activities using acquired information from various types of sensors. Although several …

Cited by 459 · Related articles · All 8 versions

Anticipative video transformer

R Girdhar, K Grauman - Proceedings of the IEEE/CVF …, 2021 - openaccess.thecvf.com

We propose Anticipative Video Transformer (AVT), an end-to-end attention-based
video modeling architecture that attends to the previously observed video in order to …

Cited by 247 · Related articles · All 6 versions

Rescaling egocentric vision: Collection, pipeline and challenges for EPIC-KITCHENS-100

D Damen, H Doughty, GM Farinella, A Furnari… - International Journal of …, 2022 - Springer

This paper introduces the pipeline to extend the largest dataset in egocentric vision, EPIC-
KITCHENS. The effort culminates in EPIC-KITCHENS-100, a collection of 100 hours, 20M …

Cited by 542 · Related articles · All 13 versions

Diffusion action segmentation

D Liu, Q Li, AD Dinh, T Jiang… - Proceedings of the …, 2023 - openaccess.thecvf.com

Temporal action segmentation is crucial for understanding long-form videos. Previous works
on this task commonly adopt an iterative refinement paradigm by using multi-stage models …

Cited by 77 · Related articles · All 5 versions

ASFormer: Transformer for action segmentation

F Yi, H Wen, T Jiang - arXiv preprint arXiv:2110.08568, 2021 - arxiv.org

Algorithms for the action segmentation task typically use temporal models to predict what
action is occurring at each frame for a minute-long daily activity. Recent studies have shown …

Cited by 205 · Related articles · All 5 versions

MS-TCN: Multi-stage temporal convolutional network for action segmentation

YA Farha, J Gall - Proceedings of the IEEE/CVF conference …, 2019 - openaccess.thecvf.com

Temporally locating and classifying action segments in long untrimmed videos is of
particular interest to many applications like surveillance and robotics. While traditional …

Cited by 745 · Related articles · All 10 versions

Unified fully and timestamp supervised temporal action segmentation via sequence to sequence translation

N Behrmann, SA Golestaneh, Z Kolter, J Gall… - European conference on …, 2022 - Springer

This paper introduces a unified framework for video action segmentation via sequence to
sequence (seq2seq) translation in a fully and timestamp supervised setup. In contrast to …

Cited by 91 · Related articles · All 4 versions

ScanNet: Richly-annotated 3D reconstructions of indoor scenes

A Dai, AX Chang, M Savva, M Halber… - Proceedings of the …, 2017 - openaccess.thecvf.com

A key requirement for leveraging supervised deep learning methods is the availability of
large, labeled datasets. Unfortunately, in the context of RGB-D scene understanding, very …

Cited by 4584 · Related articles · All 10 versions
