DIP: Dual incongruity perceiving network for sarcasm detection

C Wen, G Jia, J Yang - … of the IEEE/CVF Conference on …, 2023 - openaccess.thecvf.com
Sarcasm indicates the literal meaning is contrary to the real attitude. Considering the
popularity and complementarity of image-text data, we investigate the task of multi-modal …

MART: Masked affective representation learning via masked temporal distribution distillation

Z Zhang, P Zhao, E Park… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
Limited training data is a long-standing problem for video emotion analysis (VEA). Existing
works leverage the power of large-scale image datasets for transferring while failing to …

Adapt or perish: Adaptive sparse transformer with attentive feature refinement for image restoration

S Zhou, D Chen, J Pan, J Shi… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
Transformer-based approaches have achieved promising performance in image restoration
tasks given their ability to model long-range dependencies which is crucial for recovering …

ExtDM: Distribution extrapolation diffusion model for video prediction

Z Zhang, J Hu, W Cheng, D Paudel… - Proceedings of the …, 2024 - openaccess.thecvf.com
Video prediction is a challenging task due to its nature of uncertainty especially for
forecasting a long period. To model the temporal dynamics advanced methods benefit from …

Weakly supervised video emotion detection and prediction via cross-modal temporal erasing network

Z Zhang, L Wang, J Yang - … of the IEEE/CVF Conference on …, 2023 - openaccess.thecvf.com
Automatically predicting the emotions of user-generated videos (UGVs) has received increasing
interest recently. However, existing methods mainly focus on a few key visual frames, which …

LAKE-RED: Camouflaged images generation by latent background knowledge retrieval-augmented diffusion

P Zhao, P Xu, P Qin, DP Fan, Z Zhang… - Proceedings of the …, 2024 - openaccess.thecvf.com
Camouflaged vision perception is an important vision task with numerous practical
applications. Due to the expensive collection and labeling costs this community struggles …

Ordinal label distribution learning

C Wen, X Zhang, X Yao, J Yang - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Label distribution learning (LDL) is a recent hot topic, in which ambiguity is modeled via
description degrees of the labels. However, in common LDL tasks, e.g., age estimation, labels …

AVES: An Audio-Visual Emotion Stream Dataset for Temporal Emotion Detection

Y Li, W Gan, K Lu, D Jiang… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Human emotions vary over time, which can be vividly described as a stream of emotions.
Observing the emotion stream in daily life provides valuable insights into an individual's …

Going Beyond Closed Sets: A Multimodal Perspective for Video Emotion Analysis

H Pu, Y Sun, R Song, X Chen, H Jiang, Y Liu… - Chinese Conference on …, 2023 - Springer
Emotion analysis plays a crucial role in understanding video content. Existing studies often
approach it as a closed set classification task, which overlooks the important fact that the …

Hawkeye: Discovering and Grounding Implicit Anomalous Sentiment in Recon-videos via Scene-enhanced Video Large Language Model

J Zhao, J Wang, Y Jin, J Luo, G Zhou - ACM Multimedia 2024, 2024 - openreview.net
In real-world recon-videos such as surveillance and drone reconnaissance videos,
commonly used explicit language, acoustic and facial expressions information is often …