Data augmentation for deep neural network acoustic modeling

X Cui, V Goel, B Kingsbury - IEEE/ACM Transactions on Audio …, 2015 - ieeexplore.ieee.org
This paper investigates data augmentation for deep neural network acoustic modeling
based on label-preserving transformations to deal with data sparsity. Two data …

Method and system for efficient spoken term detection using confusion networks

BED Kingsbury, HK Kuo, L Mangu, H Soltau - US Patent 9,196,243, 2015 - Google Patents
US9196243B2 - Method and system for efficient spoken term detection using confusion
networks - Google Patents

Spoken content retrieval—beyond cascading speech recognition with text retrieval

L Lee, J Glass, H Lee, C Chan - IEEE/ACM Transactions on …, 2015 - ieeexplore.ieee.org
Spoken content retrieval refers to directly indexing and retrieving spoken content based on
the audio rather than text descriptions. This potentially eliminates the requirement of …

Multilingual representations for low resource speech recognition and keyword search

J Cui, B Kingsbury, B Ramabhadran… - 2015 IEEE workshop …, 2015 - ieeexplore.ieee.org
This paper examines the impact of multilingual (ML) acoustic representations on Automatic
Speech Recognition (ASR) and keyword search (KWS) for low resource languages in the …

End-to-end speech recognition and keyword search on low-resource languages

A Rosenberg, K Audhkhasi, A Sethy… - … on acoustics, speech …, 2017 - ieeexplore.ieee.org
In recent years, so-called “end-to-end” speech recognition systems have emerged as viable
alternatives to traditional ASR frameworks. Keyword search, localizing an orthographic …

Investigation of multilingual deep neural networks for spoken term detection

KM Knill, MJF Gales, SP Rath… - … IEEE Workshop on …, 2013 - ieeexplore.ieee.org
The development of high-performance speech processing systems for low-resource
languages is a challenging area. One approach to address the lack of resources is to make …

System combination and score normalization for spoken term detection

J Mamou, J Cui, X Cui, MJF Gales… - … , Speech and Signal …, 2013 - ieeexplore.ieee.org
Spoken content in languages of emerging importance needs to be searchable to provide
access to the underlying information. In this paper, we investigate the problem of extending …

Structure discovery of deep neural network based on evolutionary algorithms

T Shinozaki, S Watanabe - 2015 IEEE international conference …, 2015 - ieeexplore.ieee.org
Deep neural networks (DNNs) are constructed by considering highly complicated
configurations including network structure and several tuning parameters (number of hidden …

Data augmentation, feature combination, and multilingual neural networks to improve ASR and KWS performance for low-resource languages

Z Tüske, P Golik, D Nolden, R Schlüter, H Ney - Interspeech, 2014 - academia.edu
This paper presents the progress of acoustic models for low-resourced languages
(Assamese, Bengali, Haitian Creole, Lao, Zulu) developed within the second evaluation …

Cumulative adaptation for BLSTM acoustic models

M Kitza, P Golik, R Schlüter, H Ney - arXiv preprint arXiv:1906.06207, 2019 - arxiv.org
This paper addresses the robust speech recognition problem as an adaptation task.
Specifically, we investigate the cumulative application of adaptation methods. A bidirectional …