Artificial intelligence (AI) has generated a plethora of new opportunities, potential and challenges for understanding and supporting learning. In this paper, we position human and …
As 3D facial avatars become more widely used for communication, it is critical that they faithfully convey emotion. Unfortunately, the best recent methods that regress parametric 3D …
H Fan, X Zhang, Y Xu, J Fang, S Zhang, X Zhao, J Yu - Information Fusion, 2024 - Elsevier
Depression stands as one of the most widespread psychological disorders and has garnered increasing attention. Currently, how to effectively achieve automatic multimodal …
N Le, K Nguyen, Q Tran, E Tjiputra… - Proceedings of the …, 2023 - openaccess.thecvf.com
Despite significant progress over the past few years, ambiguity is still a key challenge in Facial Expression Recognition (FER). It can lead to noisy and inconsistent annotation, which …
L Yang, Y Shen, Y Mao, L Cai - Proceedings of the AAAI Conference on …, 2022 - ojs.aaai.org
Emotion recognition in conversation (ERC) aims to detect the emotion label for each utterance. Motivated by recent studies which have proven that feeding training examples in …
G Li, D Ouyang, Y Yuan, W Li, Z Guo, X Qu… - IEEE Sensors …, 2022 - ieeexplore.ieee.org
As the most direct way to measure the true emotional states of humans, EEG-based emotion recognition has been widely used in affective computing applications. In this paper, we aim …
W Zheng, J Yu, R Xia, S Wang - … of the 61st Annual Meeting of the …, 2023 - aclanthology.org
Multimodal Emotion Recognition in Multiparty Conversations (MERMC) has recently attracted considerable attention. Due to the complexity of visual scenes in multi …
Z Zhang, L Li, G Cong, H Yin, Y Gao, C Yan… - Proceedings of the …, 2024 - dl.acm.org
Movie Dubbing aims to convert scripts into speeches that align with the given movie clip in both temporal and emotional aspects while preserving the vocal timbre of one brief …
Given a piece of text, a video clip and a reference audio, the movie dubbing (also known as visual voice clone, V2C) task aims to generate speeches that match the speaker's emotion …