Speech emotion recognition: A brief review of multi-modal multi-task learning approaches

[HTML] sciencedirect.com

[HTML] Hybrid data augmentation and deep attention-based dilated convolutional-recurrent neural networks for speech emotion recognition

NT Pham, DNM Dang, ND Nguyen, TT Nguyen… - Expert Systems with …, 2023 - Elsevier

Recently, speech emotion recognition (SER) has become an active research area in speech
processing, particularly with the advent of deep learning (DL). Numerous DL-based methods …

Cited by 27 · All 8 versions

[PDF] researchgate.net

SERVER: Multi-modal speech emotion recognition using transformer-based and vision-based embeddings

NT Pham, DNM Dang, BNH Pham… - Proceedings of the 2023 …, 2023 - dl.acm.org

This paper proposes a multi-modal approach for speech emotion recognition (SER) using
both text and audio inputs. The audio embedding is extracted by using a vision-based …

Cited by 7 · All 2 versions

Multi-modal speech emotion recognition: Improving accuracy through fusion of VGGish and BERT features with multi-head attention

PN Tran, TDT Vu, DNM Dang, NT Pham… - … Conference on Industrial …, 2023 - Springer

Recent research has shown that multi-modal learning is a successful method for enhancing
classification performance by mixing several forms of input, notably in speech-emotion …

Cited by 3 · All 2 versions

[PDF] researchgate.net

SER-Fuse: An Emotion Recognition Application Utilizing Multi-Modal, Multi-Lingual, and Multi-Feature Fusion

NT Pham, LT Phan, DNM Dang… - Proceedings of the 12th …, 2023 - dl.acm.org

Speech emotion recognition (SER) is a crucial aspect of affective computing and
human-computer interaction, yet effectively identifying emotions in different speakers and languages …
