E Vahdani, Y Tian - IEEE Transactions on Pattern Analysis and …, 2022 - ieeexplore.ieee.org
Understanding human behavior and activity facilitates advancement of numerous real-world applications, and is critical for video analysis. Despite the progress of action recognition …
Learning text-video embeddings usually requires a dataset of video clips with manually provided captions. However, such datasets are expensive and time consuming to create and …
This paper introduces the pipeline to extend the largest dataset in egocentric vision, EPIC- KITCHENS. The effort culminates in EPIC-KITCHENS-100, a collection of 100 hours, 20M …
YA Farha, J Gall - Proceedings of the IEEE/CVF conference …, 2019 - openaccess.thecvf.com
Temporally locating and classifying action segments in long untrimmed videos is of particular interest to many applications like surveillance and robotics. While traditional …
This paper introduces a unified framework for video action segmentation via sequence to sequence (seq2seq) translation in a fully and timestamp supervised setup. In contrast to …
Y Kong, Y Fu - International Journal of Computer Vision, 2022 - Springer
Derived from rapid advances in computer vision and machine learning, video analysis tasks have been moving from inferring the present state to predicting the future state. Vision-based …
Abstract Sign Language Recognition (SLR) has been an active research field for the last two decades. However, most research to date has considered SLR as a naive gesture …
The ability to identify and temporally segment fine-grained human actions throughout a video is crucial for robotics, surveillance, education, and beyond. Typical approaches …
Abstract We propose'Hide-and-Seek', a weakly-supervised framework that aims to improve object localization in images and action localization in videos. Most existing weakly …