Ego4D: Around the world in 3,000 hours of egocentric video

K Grauman, A Westbury, E Byrne… - Proceedings of the …, 2022 - openaccess.thecvf.com
We introduce Ego4D, a massive-scale egocentric video dataset and benchmark suite. It
offers 3,670 hours of daily-life activity video spanning hundreds of scenarios (household …

Decoupling human and camera motion from videos in the wild

V Ye, G Pavlakos, J Malik… - Proceedings of the …, 2023 - openaccess.thecvf.com
We propose a method to reconstruct global human trajectories from videos in the wild. Our
optimization method decouples the camera and human motion, which allows us to place …

InterGen: Diffusion-based multi-human motion generation under complex interactions

H Liang, W Zhang, W Li, J Yu, L Xu - International Journal of Computer …, 2024 - Springer
We have recently seen tremendous progress in diffusion-based generation of realistic
human motions. Yet, it largely disregards multi-human interactions. In this paper, we …

Ego-body pose estimation via ego-head pose estimation

J Li, K Liu, J Wu - Proceedings of the IEEE/CVF Conference …, 2023 - openaccess.thecvf.com
Estimating 3D human motion from an egocentric video sequence plays a critical role in
human behavior understanding and has various applications in VR/AR. However, naively …

IMUPoser: Full-body pose estimation using IMUs in phones, watches, and earbuds

V Mollyn, R Arakawa, M Goel, C Harrison… - Proceedings of the 2023 …, 2023 - dl.acm.org
Tracking body pose on-the-go could have powerful uses in fitness, mobile gaming, context-
aware virtual assistants, and rehabilitation. However, users are unlikely to buy and wear …

GIMO: Gaze-informed human motion prediction in context

Y Zheng, Y Yang, K Mo, J Li, T Yu, Y Liu, CK Liu… - … on Computer Vision, 2022 - Springer
Predicting human motion is critical for assistive robots and AR/VR applications, where the
interaction with humans needs to be safe and comfortable. Meanwhile, an accurate …

EgoBody: Human body shape and motion of interacting people from head-mounted devices

S Zhang, Q Ma, Y Zhang, Z Qian, T Kwon… - European conference on …, 2022 - Springer
Understanding social interactions from egocentric views is crucial for many applications,
ranging from assistive robotics to AR/VR. Key to reasoning about interactions is to …

Learning state-aware visual representations from audible interactions

H Mittal, P Morgado, U Jain… - Advances in Neural …, 2022 - proceedings.neurips.cc
We propose a self-supervised algorithm to learn representations from egocentric video data.
Recently, significant efforts have been made to capture humans interacting with their own …

My view is the best view: Procedure learning from egocentric videos

S Bansal, C Arora, CV Jawahar - European Conference on Computer …, 2022 - Springer
Procedure learning involves identifying the key-steps and determining their logical order to
perform a task. Existing approaches commonly use third-person videos for learning the …

EgoTaskQA: Understanding human tasks in egocentric videos

B Jia, T Lei, SC Zhu, S Huang - Advances in Neural …, 2022 - proceedings.neurips.cc
Understanding human tasks through video observations is an essential capability of
intelligent agents. The challenges of such capability lie in the difficulty of generating a …