Increased global competition has placed a premium on customer satisfaction, and there is a greater demand for manufacturers to be flexible with their products and services. This …
We introduce Ego4D, a massive-scale egocentric video dataset and benchmark suite. It offers 3,670 hours of daily-life activity video spanning hundreds of scenarios (household …
Building a robot that can understand and learn to interact by watching humans has inspired several vision problems. However, despite some successful results on static datasets, it …
R Girdhar, K Grauman - Proceedings of the IEEE/CVF …, 2021 - openaccess.thecvf.com
Abstract We propose Anticipative Video Transformer (AVT), an end-to-end attention-based video modeling architecture that attends to the previously observed video in order to …
J Fan, P Zheng, S Li - Robotics and Computer-Integrated Manufacturing, 2022 - Elsevier
Recently human–robot collaboration (HRC) has emerged as a promising paradigm for mass personalization in manufacturing owing to the potential to fully exploit the strength of human …
This paper introduces the pipeline to extend the largest dataset in egocentric vision, EPIC- KITCHENS. The effort culminates in EPIC-KITCHENS-100, a collection of 100 hours, 20M …
Y Hong, Q Wu, Y Qi… - Proceedings of the …, 2021 - openaccess.thecvf.com
Accuracy of many visiolinguistic tasks has benefited significantly from the application of vision-and-language (V&L) BERT. However, its application for the task of vision-and …
We propose novel dynamic multiscale graph neural networks (DMGNN) to predict 3D skeleton-based human motions. The core idea of DMGNN is to use a multiscale graph to …
Y Kong, Y Fu - International Journal of Computer Vision, 2022 - Springer
Derived from rapid advances in computer vision and machine learning, video analysis tasks have been moving from inferring the present state to predicting the future state. Vision-based …