Temporal sentiment localization: Listen and look in untrimmed videos

C Wen, G Jia, J Yang - … of the IEEE/CVF Conference on …, 2023 - openaccess.thecvf.com

Sarcasm indicates the literal meaning is contrary to the real attitude. Considering the
popularity and complementarity of image-text data, we investigate the task of multi-modal …

Cited by: 26

Mart: Masked affective representation learning via masked temporal distribution distillation

Z Zhang, P Zhao, E Park… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com

Limited training data is a long-standing problem for video emotion analysis (VEA). Existing
works leverage the power of large-scale image datasets for transferring while failing to …

Cited by: 5

Adapt or perish: Adaptive sparse transformer with attentive feature refinement for image restoration

S Zhou, D Chen, J Pan, J Shi… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com

Transformer-based approaches have achieved promising performance in image restoration
tasks given their ability to model long-range dependencies which is crucial for recovering …

Cited by: 4

Extdm: Distribution extrapolation diffusion model for video prediction

Z Zhang, J Hu, W Cheng, D Paudel… - Proceedings of the …, 2024 - openaccess.thecvf.com

Video prediction is a challenging task due to its nature of uncertainty especially for
forecasting a long period. To model the temporal dynamics advanced methods benefit from …

Cited by: 6

Weakly supervised video emotion detection and prediction via cross-modal temporal erasing network

Z Zhang, L Wang, J Yang - … of the IEEE/CVF Conference on …, 2023 - openaccess.thecvf.com

Automatically predicting the emotions of user-generated videos (UGVs) receives increasing
interest recently. However, existing methods mainly focus on a few key visual frames, which …

Cited by: 16

Lake-red: Camouflaged images generation by latent background knowledge retrieval-augmented diffusion

P Zhao, P Xu, P Qin, DP Fan, Z Zhang… - Proceedings of the …, 2024 - openaccess.thecvf.com

Camouflaged vision perception is an important vision task with numerous practical
applications. Due to the expensive collection and labeling costs this community struggles …

Cited by: 4

Ordinal label distribution learning

C Wen, X Zhang, X Yao, J Yang - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com

Label distribution learning (LDL) is a recent hot topic, in which ambiguity is modeled via
description degrees of the labels. However, in common LDL tasks, e.g., age estimation, labels …

Cited by: 7

AVES: An Audio-Visual Emotion Stream Dataset for Temporal Emotion Detection

Y Li, W Gan, K Lu, D Jiang… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org

Human emotions vary over time, which can be vividly described as a stream of emotions.
Observing the emotion stream in daily life provides valuable insights into an individual's …

Going Beyond Closed Sets: A Multimodal Perspective for Video Emotion Analysis

H Pu, Y Sun, R Song, X Chen, H Jiang, Y Liu… - Chinese Conference on …, 2023 - Springer

Emotion analysis plays a crucial role in understanding video content. Existing studies often
approach it as a closed set classification task, which overlooks the important fact that the …

Cited by: 2

Hawkeye: Discovering and Grounding Implicit Anomalous Sentiment in Recon-videos via Scene-enhanced Video Large Language Model

J Zhao, J Wang, Y Jin, J Luo, G Zhou - ACM Multimedia 2024, 2024 - openreview.net

In real-world recon-videos such as surveillance and drone reconnaissance videos,
commonly used explicit language, acoustic and facial expressions information is often …
