STREAMER: Streaming representation learning and event segmentation in a hierarchical manner

Representing Part-Whole Hierarchies in Foundation Models by Learning Localizability, Composability, and Decomposability from Anatomy via Self-Supervision

MRH Taher, MB Gotway… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com

Humans effortlessly interpret images by parsing them into part-whole hierarchies; deep
learning excels in learning multi-level feature spaces but they often lack explicit coding of …

Cited by 1; related articles; all 2 versions

Hierarchical Vector Quantization for Unsupervised Action Segmentation

F Spurio, E Bahrami, G Francesca, J Gall - arXiv preprint arXiv:2412.17640, 2024 - arxiv.org

In this work, we address unsupervised temporal action segmentation, which segments a set
of long, untrimmed videos into semantically meaningful segments that are consistent across …

Predictive Attractor Models

R Mounir, S Sarkar - arXiv preprint arXiv:2410.02430, 2024 - arxiv.org

Sequential memory, the ability to form and accurately recall a sequence of events or stimuli
in the correct order, is a fundamental prerequisite for biological and artificial intelligence as it …

An Empirical Analysis of Speech Self-Supervised Learning at Multiple Resolutions

T Clark, B Cevoli, E de Jong, T Abramski… - arXiv preprint arXiv …, 2024 - arxiv.org

Self-supervised learning (SSL) models have become crucial in speech processing, with
recent advancements concentrating on developing architectures that capture …

Self-supervised Multi-actor Social Activity Understanding in Streaming Videos

S Trehan, SN Aakur - International Conference on Pattern Recognition, 2025 - Springer

This work addresses the problem of Social Activity Recognition (SAR), a critical component
in real-world tasks like surveillance and assistive robotics. Unlike traditional event …

About Time: Advances, Challenges, and Outlooks of Action Understanding

A Stergiou, R Poppe - arXiv preprint arXiv:2411.15106, 2024 - arxiv.org

We have witnessed impressive advances in video action understanding. Increased dataset
sizes, variability, and computation availability have enabled leaps in performance and task …

[PDF] Supplementary Materials for Representing Part-Whole Hierarchies in Foundation Models by Learning Localizability, Composability, and Decomposability from …

MRH Taher, MB Gotway, J Liang - openaccess.thecvf.com

Supplementary Materials for Representing Part-Whole Hierarchies in Foundation Models by
Learning Localizability, Composability, …
