Authors
Long Fan, Lei Xie, Xinran Lu, Yi Li, Chuyu Wang, Sanglu Lu
Publication date
2023/5/17
Conference
IEEE INFOCOM 2023 - IEEE Conference on Computer Communications
Pages
1-10
Publisher
IEEE
Description
Despite the proliferation of voice assistants, microphone-based speech recognition usually performs poorly in the presence of multiple sound sources and ambient noise. In this paper, we propose a novel mmWave-based speech recognition solution that tackles these issues by precisely extracting multi-modal features of lip motion and vocal-cord vibration from a single mmWave channel. We propose a difference-based method for extracting lip-motion features that suppresses the dynamic interference from body and head motion. We propose a speech detection method based on cross-validation of lip motion and vocal-cord vibration, so as to avoid wasting computing resources on non-speaking activities. We propose a multi-modal fusion framework for speech recognition by fusing the signal features from lip motion and …
Scholar articles
L Fan, L Xie, X Lu, Y Li, C Wang, S Lu - IEEE INFOCOM 2023-IEEE Conference on Computer …, 2023