On speaker adaptation of long short-term memory recurrent neural networks

Y Miao, F Metze - Sixteenth Annual Conference of the International …, 2015 - isca-archive.org
Long Short-Term Memory (LSTM) is a recurrent neural network (RNN) architecture
specializing in modeling long-range temporal dynamics. On acoustic modeling tasks, LSTM …

Spoken term detection ALBAYZIN 2014 evaluation: overview, systems, results, and discussion

J Tejedor, DT Toledano, P Lopez-Otero… - EURASIP Journal on …, 2015 - Springer
Spoken term detection (STD) aims at retrieving data from a speech repository given a textual
representation of the search term. Nowadays, it is receiving much interest due to the large …

Distributed learning of multilingual DNN feature extractors using GPUs

Y Miao, H Zhang, F Metze - Fifteenth Annual Conference of the …, 2014 - isca-archive.org
Multilingual deep neural networks (DNNs) can act as deep feature extractors and have been
applied successfully to cross-language acoustic modeling. Learning these feature extractors …

Improving language-universal feature extraction with deep maxout and convolutional neural networks

Y Miao, F Metze - Fifteenth Annual Conference of the International …, 2014 - isca-archive.org
When deployed in automated speech recognition (ASR), deep neural networks (DNNs) can
be treated as a complex feature extractor plus a simple linear classifier. Previous work has …

Efficient query-by-example spoken document retrieval combining phone multigram representation and dynamic time warping

P Lopez-Otero, J Parapar, A Barreiro - Information Processing & …, 2019 - Elsevier
Query-by-example spoken document retrieval (QbESDR) aims at finding those documents in
a set that include a given spoken query. Current approaches are, in general, not valid for …

Combination of search techniques for improved spotting of OOV keywords

D Karakos, RM Schwartz - 2015 IEEE International Conference …, 2015 - ieeexplore.ieee.org
The most common pipelines in keyword spotting involve some kind of speech recognition,
which leads to the generation of sets of plausible hypotheses (e.g., word lattices), which are …

Statistical language models for query-by-example spoken document retrieval

P Lopez-Otero, J Parapar, A Barreiro - Multimedia Tools and Applications, 2020 - Springer
Query-by-example spoken document retrieval (QbESDR) consists in, given a collection of
documents, computing how likely a spoken query is present in each document. This is …

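A spoken keyword detection method using lattice intersection fusion

P Li, D Qu - Journal of Signal Processing (信号处理), 2015 - signal.ejournal.org.cn
To address the problem that lattices produced by lattice-merging methods carry too much redundant
information and grow large, which slows retrieval, this paper proposes a spoken keyword detection
method based on lattice intersection fusion. First, the word … produced by different speech recognition systems …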

Resource-dependent acoustic and language modeling for spoken keyword search

IF Chen - 2016 - core.ac.uk
Spoken keyword search (KWS) is a task to detect a set of keywords in continuous speech. It
could be considered as an application of automatic speech recognition (ASR) focusing only …

Speech Retrieval Method Based on Index Fusion and Pseudo Correlation Feedback

Y Wang, S Wang - 2024 5th International Conference on …, 2024 - ieeexplore.ieee.org
With the advent of the big data era, the amount of audio data is rapidly increasing. In order to
effectively utilize this information, we urgently need an efficient method for retrieving speech …