This paper proposes multistream CNN, a novel neural network architecture for robust acoustic modeling in speech recognition tasks. The proposed architecture processes input …
This paper makes several contributions to automatic lyrics transcription (ALT) research. Our main contribution is a novel variant of the Multistreaming Time-Delay Neural Network …
H Hermansky - Speech Communication, 2019 - Elsevier
This paper postulates that the linguistic message in speech is coded redundantly in both the time and the frequency domains. Such redundant coding of the message in the signal …
We propose an environment adaptation approach that improves deep speech enhancement models via minimizing the Kullback-Leibler divergence between posterior probabilities …
SH Mallidi, H Hermansky - 2016 IEEE International Conference …, 2016 - ieeexplore.ieee.org
Robustness of automatic speech recognition (ASR) to acoustic mismatches can be improved by a multistream framework. A frequently used approach to combine decisions from individual …
SH Mallidi, T Ogawa… - 2015 IEEE Workshop on …, 2015 - ieeexplore.ieee.org
New efficient measures for estimating uncertainty of deep neural network (DNN) classifiers are proposed and successfully applied to multistream-based unsupervised adaptation of …
In this paper, we propose a novel method to capture energy modulations from different frequency bands in speech into frame-level feature vectors, called Modulation-vectors or M …
Exploiting multiple microphones has been a widely-used strategy for robust automatic speech recognition (ASR). Particularly, in a general hands-free scenario, acquisition of …
Quality evaluation of digitally-transmitted speech is an important prerequisite to ensure the required quality of telecommunication service. Although formal subjective listening tests still …