A comprehensive review of polyphonic sound event detection

TK Chan, CS Chin - IEEE Access, 2020 - ieeexplore.ieee.org
One of the most amazing functions of the human auditory system is the ability to detect all
kinds of sound events in the environment. With the technologies and hardware advances …

Processing multi-channel audio waveforms

TN Sainath, RJ Weiss, KW Wilson, AW Senior… - US Patent …, 2017 - Google Patents
Methods, including computer programs encoded on a computer storage medium, for
enhancing the processing of audio waveforms for speech recognition using various neural …

Exploring multi-channel features for denoising-autoencoder-based speech enhancement

S Araki, T Hayashi, M Delcroix… - … , Speech and Signal …, 2015 - ieeexplore.ieee.org
This paper investigates a multi-channel denoising autoencoder (DAE)-based speech
enhancement approach. In recent years, deep neural network (DNN)-based monaural …

Improving speech recognition in reverberation using a room-aware deep neural network and multi-task learning

R Giri, ML Seltzer, J Droppo… - 2015 IEEE International …, 2015 - ieeexplore.ieee.org
In this paper, we propose two approaches to improve deep neural network (DNN) acoustic
models for speech recognition in reverberant environments. Both methods utilize auxiliary …

Non-native children speech recognition through transfer learning

M Matassoni, R Gretter, D Falavigna… - … on Acoustics, Speech …, 2018 - ieeexplore.ieee.org
This work deals with non-native children's speech and investigates both multi-task and
transfer learning approaches to adapt a multi-language Deep Neural Network (DNN) to …

Noise robust speech recognition using recent developments in neural networks for computer vision

T Yoshioka, K Ohnishi, F Fang… - 2016 IEEE International …, 2016 - ieeexplore.ieee.org
Convolutional Neural Networks (CNNs) are superior to fully connected neural networks in
various speech recognition tasks and the advantage is pronounced in noisy environments …

Environmentally robust ASR front-end for deep neural network acoustic models

T Yoshioka, MJF Gales - Computer Speech & Language, 2015 - Elsevier
This paper examines the individual and combined impacts of various front-end approaches
on the performance of deep neural network (DNN) based speech recognition systems in …

Robust speech recognition via anchor word representations

B King, IF Chen, Y Vaizman, Y Liu, R Maas… - 2017 - amazon.science
A challenge for speech recognition for voice-controlled household devices, like the Amazon
Echo or Google Home, is robustness against interfering background speech. Formulated as …

Meeting recognition with asynchronous distributed microphone array using block-wise refinement of mask-based MVDR beamformer

S Araki, N Ono, K Kinoshita… - 2018 IEEE International …, 2018 - ieeexplore.ieee.org
This paper addresses a front-end system for speech recognition of spontaneous
conversational speech signals that are recorded with asynchronous distributed microphones …

Multi-channel attention for end-to-end speech recognition

S Braun, D Neil, J Anumula, E Ceolini, SC Liu - 2018 - zora.uzh.ch
Recent end-to-end models for automatic speech recognition use sensory attention to
integrate multiple input channels within a single neural network. However, these attention …