Automatic gain control and multi-style training for robust small-footprint keyword spotting...

I López-Espejo, ZH Tan, JHL Hansen, J Jensen - IEEE Access, 2021 - ieeexplore.ieee.org

Spoken keyword spotting (KWS) deals with the identification of keywords in audio streams
and has become a fast-growing technology thanks to the paradigm shift introduced by deep …

Cited by 139 · Related articles · All 7 versions

[PDF] arxiv.org

SpecAugment: A simple data augmentation method for automatic speech recognition

DS Park, W Chan, Y Zhang, CC Chiu, B Zoph… - arXiv preprint arXiv …, 2019 - arxiv.org

We present SpecAugment, a simple data augmentation method for speech recognition.
SpecAugment is applied directly to the feature inputs of a neural network (i.e., filter bank …

Cited by 4398 · Related articles · All 8 versions

[PDF] arxiv.org

Generalized end-to-end loss for speaker verification

L Wan, Q Wang, A Papir… - 2018 IEEE International …, 2018 - ieeexplore.ieee.org

In this paper, we propose a new loss function called generalized end-to-end (GE2E) loss,
which makes the training of speaker verification models more efficient than our previous …

Cited by 1128 · Related articles · All 10 versions

[PDF] arxiv.org

Speaker diarization with LSTM

Q Wang, C Downey, L Wan… - … on acoustics, speech …, 2018 - ieeexplore.ieee.org

For many years, i-vector based audio embedding techniques were the dominant approach
for speaker verification and speaker diarization applications. However, mirroring the rise of …

Cited by 440 · Related articles · All 11 versions

[PDF] arxiv.org

End-to-end text-dependent speaker verification

G Heigold, I Moreno, S Bengio… - 2016 IEEE International …, 2016 - ieeexplore.ieee.org

In this paper we present a data-driven, integrated approach to speaker verification, which
maps a test utterance and a few reference utterances directly to a single score for verification …

Cited by 810 · Related articles · All 14 versions

[PDF] arxiv.org

SpecAugment on large scale datasets

DS Park, Y Zhang, CC Chiu, Y Chen… - ICASSP 2020-2020 …, 2020 - ieeexplore.ieee.org

Recently, SpecAugment, an augmentation scheme for automatic speech recognition that
acts directly on the spectrogram of input utterances, has shown to be highly effective in …

Cited by 173 · Related articles · All 5 versions

[PDF] arxiv.org

Attention-based models for text-dependent speaker verification

FAR Rahman Chowdhury, Q Wang… - … , Speech and Signal …, 2018 - ieeexplore.ieee.org

Attention-based models have recently shown great performance on a range of tasks, such
as speech recognition, machine translation, and image captioning due to their ability to …

Cited by 210 · Related articles · All 10 versions

Target electromagnetic detection method in underground environment: A review

X Wang, P Wang, X Zhang, Y Wan, H Shi… - IEEE Sensors …, 2022 - ieeexplore.ieee.org

Underground target detection plays an important role in maintaining underground space
security and relies on efficient real-time detection methods. Electromagnetic detection …

Cited by 27 · Related articles · All 2 versions

[PDF] arxiv.org

Convolutional recurrent neural networks for small-footprint keyword spotting

SO Arik, M Kliegl, R Child, J Hestness… - arXiv preprint arXiv …, 2017 - arxiv.org

Keyword spotting (KWS) constitutes a major component of human-technology interfaces.
Maximizing the detection accuracy at a low false alarm (FA) rate, while minimizing the …

Cited by 178 · Related articles · All 8 versions

[PDF] arxiv.org

Trainable frontend for robust and far-field keyword spotting

Y Wang, P Getreuer, T Hughes, RF Lyon… - … , Speech and Signal …, 2017 - ieeexplore.ieee.org

Robust and far-field speech recognition is critical to enable true hands-free communication.
In far-field conditions, signals are attenuated due to distance. To improve robustness to …

Cited by 183 · Related articles · All 9 versions
