Temporal action segmentation: An analysis of modern techniques

G Ding, F Sener, A Yao - IEEE Transactions on Pattern Analysis …, 2023 - ieeexplore.ieee.org
Temporal action segmentation (TAS) in videos aims at densely identifying video frames in
minutes-long videos with multiple action classes. As a long-range video understanding task …

Progress-aware online action segmentation for egocentric procedural task videos

Y Shen, E Elhamifar - … of the IEEE/CVF Conference on …, 2024 - openaccess.thecvf.com
We address the problem of online action segmentation for egocentric procedural task
videos. While previous studies have mostly focused on offline action segmentation where …

Joint internal multi-interest exploration and external domain alignment for cross domain sequential recommendation

W Liu, X Zheng, C Chen, J Su, X Liao, M Hu… - Proceedings of the ACM …, 2023 - dl.acm.org
Sequential Cross-Domain Recommendation (CDR) has been popularly studied to utilize
different domain knowledge and users' historical behaviors for the next-item prediction. In …

Weakly supervised video representation learning with unaligned text for sequential videos

S Dong, H Hu, D Lian, W Luo… - Proceedings of the …, 2023 - openaccess.thecvf.com
Sequential video understanding, as an emerging video understanding task, has driven lots
of researchers' attention because of its goal-oriented nature. This paper studies weakly …

STREAMER: Streaming representation learning and event segmentation in a hierarchical manner

R Mounir, S Vijayaraghavan… - Advances in Neural …, 2024 - proceedings.neurips.cc
We present a novel self-supervised approach for hierarchical representation learning and
segmentation of perceptual inputs in a streaming fashion. Our research addresses how to …

Inductive and transductive few-shot video classification via appearance and temporal alignments

KD Nguyen, QH Tran, K Nguyen, BS Hua… - European Conference on …, 2022 - Springer
We present a novel method for few-shot video classification, which performs appearance
and temporal alignments. In particular, given a pair of query and support videos, we conduct …

VLMAH: Visual-linguistic modeling of action history for effective action anticipation

V Manousaki, K Bacharidis… - Proceedings of the …, 2023 - openaccess.thecvf.com
Although existing methods for action anticipation have shown considerably improved
performance on the predictability of future events in videos, the way they exploit information …

Weakly-supervised online action segmentation in multi-view instructional videos

R Ghoddoosian, I Dwivedi, N Agarwal… - Proceedings of the …, 2022 - openaccess.thecvf.com
This paper addresses a new problem of weakly-supervised online action segmentation in
instructional videos. We present a framework to segment streaming videos online at test time …

Learning representations by contrastive spatio-temporal clustering for skeleton-based action recognition

M Wang, X Li, S Chen, X Zhang, L Ma… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Self-supervised representation learning has proven constructive for skeleton-based action
recognition. For better performance, existing methods mainly focus on 1) multi-modal data …

Exploring Temporal Concurrency for Video-Language Representation Learning

H Zhang, D Liu, Z Lv, B Su… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Paired video and language data is naturally temporal concurrency, which requires the
modeling of the temporal dynamics within each modality and the temporal alignment across …