A short note on the Kinetics-700-2020 human action dataset

L Smaira, J Carreira, E Noland, E Clancy, A Wu… - arXiv preprint arXiv …, 2020 - arxiv.org
We describe the 2020 edition of the DeepMind Kinetics human action dataset, which
replenishes and extends the Kinetics-700 dataset. In this new version, there are at least 700 …

Knowing where to focus: Event-aware transformer for video grounding

J Jang, J Park, J Kim, H Kwon… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Recent DETR-based video grounding models have made the model directly predict moment
timestamps without any hand-crafted components, such as a pre-defined proposal or non …

The way to my heart is through contrastive learning: Remote photoplethysmography from unlabelled video

J Gideon, S Stent - Proceedings of the IEEE/CVF …, 2021 - openaccess.thecvf.com
The ability to reliably estimate physiological signals from video is a powerful tool in low-cost,
pre-clinical health monitoring. In this work we propose a new approach to remote …

TransRAC: Encoding multi-scale temporal correlation with transformers for repetitive action counting

H Hu, S Dong, Y Zhao, D Lian, Z Li… - Proceedings of the …, 2022 - openaccess.thecvf.com
Counting repetitive actions are widely seen in human activities such as physical exercise.
Existing methods focus on performing repetitive action counting in short videos, which is …

AIFit: Automatic 3D human-interpretable feedback models for fitness training

M Fieraru, M Zanfir, SC Pirlea, V Olaru… - Proceedings of the …, 2021 - openaccess.thecvf.com
I went to the gym today, but how well did I do? And where should I improve? Ah, my back
hurts slightly... User engagement can be sustained and injuries avoided by being able to …

Zero-shot natural language video localization

J Nam, D Ahn, D Kang, SJ Ha… - Proceedings of the IEEE …, 2021 - openaccess.thecvf.com
Understanding videos to localize moments with natural language often requires large
expensive annotated video regions paired with language queries. To eliminate the …

SimPer: Simple self-supervised learning of periodic targets

Y Yang, X Liu, J Wu, S Borac, D Katabi, MZ Poh… - arXiv preprint arXiv …, 2022 - arxiv.org
From human physiology to environmental evolution, important processes in nature often
exhibit meaningful and strong periodic or quasi-periodic changes. Due to their inherent label …

Real-time monitoring for manual operations with machine vision in smart manufacturing

P Lou, J Li, YH Zeng, B Chen, X Zhang - Journal of Manufacturing Systems, 2022 - Elsevier
Online real-time production process monitoring is the basis for intelligent manufacturing
refinement management. This paper proposes a contactless monitoring framework with …

Weakly supervised video representation learning with unaligned text for sequential videos

S Dong, H Hu, D Lian, W Luo… - Proceedings of the …, 2023 - openaccess.thecvf.com
Sequential video understanding, as an emerging video understanding task, has driven lots
of researchers' attention because of its goal-oriented nature. This paper studies weakly …

Repetitive activity counting by sight and sound

Y Zhang, L Shao, CGM Snoek - Proceedings of the IEEE …, 2021 - openaccess.thecvf.com
This paper strives for repetitive activity counting in videos. Different from existing works,
which all analyze the visual video content only, we incorporate for the first time the …