Authors
Dingkang Yang, Shuai Huang, Yang Liu, Lihua Zhang
Publication date
2022/9/29
Journal
IEEE Signal Processing Letters
Volume
29
Pages
2093-2097
Publisher
IEEE
Description
Speech emotion recognition combining linguistic content and audio signals in the dialog is a challenging task. Nevertheless, previous approaches have failed to explore emotion cues in contextual interactions and ignored the long-range dependencies between elements from different modalities. To tackle the above issues, this letter proposes a multimodal speech emotion recognition method using audio and text data. We first present a contextual transformer module to introduce contextual information via embedding the previous utterances between interlocutors, which enhances the emotion representation of the current utterance. Then, the proposed cross-modal transformer module focuses on the interactions between text and audio modalities, adaptively promoting the fusion from one modality to another. Furthermore, we construct associative topological relation over mini-batch and learn the association between …
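For orientation, below is a minimal sketch of the cross-modal attention idea the abstract describes: one modality (here, text tokens) forms the queries and attends over the other modality (audio frames) to drive fusion. The module name, feature dimensions, and the residual/LayerNorm combination are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class CrossModalAttention(nn.Module):
    """Illustrative cross-modal block: text queries attend over audio keys/values."""
    def __init__(self, dim=256, num_heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, text_feats, audio_feats):
        # text_feats: (batch, T_text, dim); audio_feats: (batch, T_audio, dim)
        fused, _ = self.attn(query=text_feats, key=audio_feats, value=audio_feats)
        # Residual connection keeps the original text representation,
        # enriched by the audio context it attended to.
        return self.norm(text_feats + fused)

# Usage with random features standing in for utterance embeddings
text = torch.randn(2, 20, 256)    # token-level text embeddings
audio = torch.randn(2, 50, 256)   # frame-level audio embeddings
out = CrossModalAttention()(text, audio)
print(out.shape)  # torch.Size([2, 20, 256])
```

A symmetric block with audio as queries and text as keys/values would give fusion in the other direction, matching the bidirectional "one modality to another" promotion the abstract mentions.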