In this work, we address unsupervised temporal action segmentation, which segments a set of long, untrimmed videos into semantically meaningful segments that are consistent across …
R Mounir, S Sarkar - arXiv preprint arXiv:2410.02430, 2024 - arxiv.org
Sequential memory, the ability to form and accurately recall a sequence of events or stimuli in the correct order, is a fundamental prerequisite for biological and artificial intelligence as it …
T Clark, B Cevoli, E de Jong, T Abramski… - arXiv preprint arXiv …, 2024 - arxiv.org
Self-supervised learning (SSL) models have become crucial in speech processing, with recent advancements concentrating on developing architectures that capture …
S Trehan, SN Aakur - International Conference on Pattern Recognition, 2025 - Springer
This work addresses the problem of Social Activity Recognition (SAR), a critical component in real-world tasks like surveillance and assistive robotics. Unlike traditional event …
A Stergiou, R Poppe - arXiv preprint arXiv:2411.15106, 2024 - arxiv.org
We have witnessed impressive advances in video action understanding. Increased dataset sizes, variability, and computation availability have enabled leaps in performance and task …
Supplementary Materials for Representing Part-Whole Hierarchies in Foundation Models by Learning Localizability, Composability, …