Authors
Kouhei Sekiguchi, Aditya Arie Nugraha, Yicheng Du, Yoshiaki Bando, Mathieu Fontaine, Kazuyoshi Yoshii
Publication date
2022/10/23
Conference
2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
Pages
9266-9273
Publisher
IEEE
Description
This paper describes the practical response- and performance-aware development of online speech enhancement for an augmented reality (AR) headset that helps a user understand conversations made in real noisy echoic environments (e.g., cocktail party). One may use a state-of-the-art blind source separation method called fast multichannel nonnegative matrix factorization (FastMNMF) that works well in various environments thanks to its unsupervised nature. Its heavy computational cost, however, prevents its application to real-time processing. In contrast, a supervised beamforming method that uses a deep neural network (DNN) for estimating spatial information of speech and noise readily fits real-time processing, but suffers from drastic performance degradation in mismatched conditions. Given such complementary characteristics, we propose a dual-process robust online speech enhancement method …