作者
Soonil Kwon
发表日期
2021/4/1
期刊
Applied Soft Computing
卷号
102
页码范围
107101
出版商
Elsevier
简介
Speech emotion recognition (SER) is an active research field of digital signal processing and plays a crucial role in numerous applications of Human–computer interaction (HCI). Nowadays, the baseline state of the art systems has quite a low accuracy and high computations, which needs upgrading to make it reasonable for real-time industrial uses such as detection of content from speech data. The main intent for low recognition rate and high computational cost is a scarceness of datasets, model configuration, and patterns recognition that is the supreme stimulating work for building a robust SER system. In this study, we address these problems and propose a simple and lightweight deep learning-based self-attention module (SAM) for SER system. The transitional features map is given to SAM, which produces efficiently the channel and spatial axes attention map with insignificant overheads. We use a multi-layer …
引用总数