Weakly supervised video emotion detection and prediction via cross-modal temporal erasing network

Z Zhang, P Zhao, E Park… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com

Limited training data is a long-standing problem for video emotion analysis (VEA). Existing
works leverage the power of large-scale image datasets for transferring while failing to …

Cited by 5 · Related articles

[PDF] thecvf.com

Adapt or perish: Adaptive sparse transformer with attentive feature refinement for image restoration

S Zhou, D Chen, J Pan, J Shi… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com

Transformer-based approaches have achieved promising performance in image restoration
tasks given their ability to model long-range dependencies which is crucial for recovering …

Cited by 4 · Related articles

[PDF] thecvf.com

ExtDM: Distribution extrapolation diffusion model for video prediction

Z Zhang, J Hu, W Cheng, D Paudel… - Proceedings of the …, 2024 - openaccess.thecvf.com

Video prediction is a challenging task due to its nature of uncertainty especially for
forecasting a long period. To model the temporal dynamics advanced methods benefit from …

Cited by 6 · Related articles · All 3 versions

[PDF] thecvf.com

LAKE-RED: Camouflaged images generation by latent background knowledge retrieval-augmented diffusion

P Zhao, P Xu, P Qin, DP Fan, Z Zhang… - Proceedings of the …, 2024 - openaccess.thecvf.com

Camouflaged vision perception is an important vision task with numerous practical
applications. Due to the expensive collection and labeling costs this community struggles …

Cited by 4 · Related articles · All 3 versions

[PDF] thecvf.com

Ordinal label distribution learning

C Wen, X Zhang, X Yao, J Yang - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com

Label distribution learning (LDL) is a recent hot topic, in which ambiguity is modeled via
description degrees of the labels. However, in common LDL tasks, e.g., age estimation, labels …

Cited by 7 · Related articles · All 3 versions

Joint learning of video scene detection and annotation via multi-modal adaptive context network

Y Xu, L Pan, W Sang, HL Luo, L Li, P Wei… - Expert Systems with …, 2024 - Elsevier

The tasks of scene detection and annotation have gained impressive attention for
understanding video content. The main challenges lie in mitigating the error propagation of …

[PDF] arxiv.org

Hypergraph Multi-modal Large Language Model: Exploiting EEG and Eye-tracking Modalities to Evaluate Heterogeneous Responses for Video Understanding

M Wu, C Zhao, A Su, D Di, T Fu, D An, M He… - arXiv preprint arXiv …, 2024 - arxiv.org

Understanding of video creativity and content often varies among individuals, with
differences in focal points and cognitive levels across different ages, experiences, and …

Going Beyond Closed Sets: A Multimodal Perspective for Video Emotion Analysis

H Pu, Y Sun, R Song, X Chen, H Jiang, Y Liu… - Chinese Conference on …, 2023 - Springer

Emotion analysis plays a crucial role in understanding video content. Existing studies often
approach it as a closed set classification task, which overlooks the important fact that the …

Cited by 2 · Related articles · All 3 versions

[PDF] arxiv.org

eMotions: A Large-Scale Dataset for Emotion Recognition in Short Videos

X Wu, H Sun, J Xue, R Zhai, X Kong, J Nie… - arXiv preprint arXiv …, 2023 - arxiv.org

Nowadays, short videos (SVs) are essential to information acquisition and sharing in our life.
The prevailing use of SVs to spread emotions leads to the necessity of emotion recognition …

Facial Affective Behavior Analysis with Instruction Tuning

Y Li, A Dao, W Bao, Z Tan, T Chen, H Liu… - arXiv preprint arXiv …, 2024 - arxiv.org

Facial affective behavior analysis (FABA) is crucial for understanding human mental states
from images. However, traditional approaches primarily deploy models to discriminate …

Cited by 6 · Related articles · All 2 versions
