A review of video object detection: Datasets, metrics and methods

H Zhu, H Wei, B Li, X Yuan, N Kehtarnavaz - Applied Sciences, 2020 - mdpi.com
Although there are well established object detection methods based on static images, their
application to video data on a frame by frame basis faces two shortcomings: (i) lack of …

Beyond supervised learning for pervasive healthcare

X Gu, F Deligianni, J Han, X Liu, W Chen… - IEEE Reviews in …, 2023 - ieeexplore.ieee.org
The integration of machine/deep learning and sensing technologies is transforming
healthcare and medical practice. However, inherent limitations in healthcare data, namely …

Ego4D: Around the world in 3,000 hours of egocentric video

K Grauman, A Westbury, E Byrne… - Proceedings of the …, 2022 - openaccess.thecvf.com
We introduce Ego4D, a massive-scale egocentric video dataset and benchmark suite. It
offers 3,670 hours of daily-life activity video spanning hundreds of scenarios (household …

PointOdyssey: A large-scale synthetic dataset for long-term point tracking

Y Zheng, AW Harley, B Shen… - Proceedings of the …, 2023 - openaccess.thecvf.com
We introduce PointOdyssey, a large-scale synthetic dataset, and data generation framework,
for the training and evaluation of long-term fine-grained tracking algorithms. Our goal is to …

CAFE: Learning to condense dataset by aligning features

K Wang, B Zhao, X Peng, Z Zhu… - Proceedings of the …, 2022 - openaccess.thecvf.com
Dataset condensation aims at reducing the network training effort through condensing a
cumbersome training set into a compact synthetic one. State-of-the-art approaches largely …

MeMViT: Memory-augmented multiscale vision transformer for efficient long-term video recognition

CY Wu, Y Li, K Mangalam, H Fan… - Proceedings of the …, 2022 - openaccess.thecvf.com
While today's video recognition systems parse snapshots or short clips accurately, they
cannot connect the dots and reason across a longer range of time yet. Most existing video …

A comprehensive study of deep video action recognition

Y Zhu, X Li, C Liu, M Zolfaghari, Y Xiong, C Wu… - arXiv preprint arXiv …, 2020 - arxiv.org
Video action recognition is one of the representative tasks for video understanding. Over the
last decade, we have witnessed great advancements in video action recognition thanks to …

Hybrid relation guided set matching for few-shot action recognition

X Wang, S Zhang, Z Qing, M Tang… - Proceedings of the …, 2022 - openaccess.thecvf.com
Current few-shot action recognition methods reach impressive performance by learning
discriminative features for each video via episodic training and designing various temporal …

MoLo: Motion-augmented long-short contrastive learning for few-shot action recognition

X Wang, S Zhang, Z Qing, C Gao… - Proceedings of the …, 2023 - openaccess.thecvf.com
Current state-of-the-art approaches for few-shot action recognition achieve promising
performance by conducting frame-level matching on learned visual features. However, they …

H2O: Two hands manipulating objects for first person interaction recognition

T Kwon, B Tekin, J Stühmer, F Bogo… - Proceedings of the …, 2021 - openaccess.thecvf.com
We present a comprehensive framework for egocentric interaction recognition using
markerless 3D annotations of two hands manipulating objects. To this end, we propose a …