Deep spoken keyword spotting: An overview

I López-Espejo, ZH Tan, JHL Hansen, J Jensen - IEEE Access, 2021 - ieeexplore.ieee.org
Spoken keyword spotting (KWS) deals with the identification of keywords in audio streams
and has become a fast-growing technology thanks to the paradigm shift introduced by deep …

SpecAugment: A simple data augmentation method for automatic speech recognition

DS Park, W Chan, Y Zhang, CC Chiu, B Zoph… - arXiv preprint arXiv …, 2019 - arxiv.org
We present SpecAugment, a simple data augmentation method for speech recognition.
SpecAugment is applied directly to the feature inputs of a neural network (i.e., filter bank …

Generalized end-to-end loss for speaker verification

L Wan, Q Wang, A Papir… - 2018 IEEE International …, 2018 - ieeexplore.ieee.org
In this paper, we propose a new loss function called generalized end-to-end (GE2E) loss,
which makes the training of speaker verification models more efficient than our previous …

Speaker diarization with LSTM

Q Wang, C Downey, L Wan… - … on acoustics, speech …, 2018 - ieeexplore.ieee.org
For many years, i-vector based audio embedding techniques were the dominant approach
for speaker verification and speaker diarization applications. However, mirroring the rise of …

End-to-end text-dependent speaker verification

G Heigold, I Moreno, S Bengio… - 2016 IEEE International …, 2016 - ieeexplore.ieee.org
In this paper we present a data-driven, integrated approach to speaker verification, which
maps a test utterance and a few reference utterances directly to a single score for verification …

SpecAugment on large scale datasets

DS Park, Y Zhang, CC Chiu, Y Chen… - ICASSP 2020-2020 …, 2020 - ieeexplore.ieee.org
Recently, SpecAugment, an augmentation scheme for automatic speech recognition that
acts directly on the spectrogram of input utterances, has shown to be highly effective in …

Attention-based models for text-dependent speaker verification

FAR Rahman Chowdhury, Q Wang… - … , Speech and Signal …, 2018 - ieeexplore.ieee.org
Attention-based models have recently shown great performance on a range of tasks, such
as speech recognition, machine translation, and image captioning due to their ability to …

Target electromagnetic detection method in underground environment: A review

X Wang, P Wang, X Zhang, Y Wan, H Shi… - IEEE Sensors …, 2022 - ieeexplore.ieee.org
Underground target detection plays an important role in maintaining underground space
security and relies on efficient real-time detection methods. Electromagnetic detection …

Convolutional recurrent neural networks for small-footprint keyword spotting

SO Arik, M Kliegl, R Child, J Hestness… - arXiv preprint arXiv …, 2017 - arxiv.org
Keyword spotting (KWS) constitutes a major component of human-technology interfaces.
Maximizing the detection accuracy at a low false alarm (FA) rate, while minimizing the …

Trainable frontend for robust and far-field keyword spotting

Y Wang, P Getreuer, T Hughes, RF Lyon… - … , Speech and Signal …, 2017 - ieeexplore.ieee.org
Robust and far-field speech recognition is critical to enable true hands-free communication.
In far-field conditions, signals are attenuated due to distance. To improve robustness to …