The MGB challenge: Evaluating multi-genre broadcast media recognition

MM Kabir, MF Mridha, J Shin, I Jahan, AQ Ohi - IEEE Access, 2021 - ieeexplore.ieee.org

Humans can identify a speaker by listening to their voice, over the telephone, or on any
digital devices. Acquiring this congenital human competency, authentication technologies …

Cited by 119 · Related articles · All 4 versions

[PDF] ieee.org

Adaptation algorithms for neural network-based speech recognition: An overview

P Bell, J Fainberg, O Klejch, J Li… - IEEE Open Journal …, 2020 - ieeexplore.ieee.org

We present a structured overview of adaptation algorithms for neural network-based speech
recognition, considering both hybrid hidden Markov model/neural network systems and end …

Cited by 99 · Related articles · All 7 versions

[HTML] sciencedirect.com

[HTML] Voxceleb: Large-scale speaker verification in the wild

A Nagrani, JS Chung, W Xie, A Zisserman - Computer Speech & Language, 2020 - Elsevier

The objective of this work is speaker recognition under noisy and unconstrained conditions.
We make two key contributions. First, we introduce a very large-scale audio-visual dataset …

Cited by 787 · Related articles · All 11 versions

[PDF] arxiv.org

Voxceleb: a large-scale speaker identification dataset

A Nagrani, JS Chung, A Zisserman - arXiv preprint arXiv:1706.08612, 2017 - arxiv.org

Most existing datasets for speaker identification contain samples obtained under quite
constrained conditions, and are usually hand-annotated, hence limited in size. The goal of …

Cited by 2807 · Related articles · All 15 versions

[PDF] arxiv.org

The fifth 'CHiME' speech separation and recognition challenge: dataset, task and baselines

J Barker, S Watanabe, E Vincent, J Trmal - arXiv preprint arXiv …, 2018 - arxiv.org

The CHiME challenge series aims to advance robust automatic speech recognition (ASR)
technology by promoting research at the interface of speech and language processing …

Cited by 433 · Related articles · All 11 versions

[PDF] hal.science

An analysis of environment, microphone and data simulation mismatches in robust speech recognition

E Vincent, S Watanabe, AA Nugraha, J Barker… - Computer Speech & …, 2017 - Elsevier

Speech enhancement and automatic speech recognition (ASR) are most often evaluated in
matched (or multi-condition) settings where the acoustic conditions of the training data …

Cited by 429 · Related articles · All 16 versions

[PDF] github.io

[PDF] The speakers in the wild (SITW) speaker recognition database.

M McLaren, L Ferrer, D Castan, A Lawson - Interspeech, 2016 - maelfabien.github.io

The Speakers in the Wild (SITW) speaker recognition database contains hand-annotated
speech samples from open-source media for the purpose of benchmarking text …

Cited by 325 · Related articles · All 7 versions

[PDF] arxiv.org

Speech recognition challenge in the wild: Arabic MGB-3

A Ali, S Vogel, S Renals - 2017 IEEE Automatic Speech …, 2017 - ieeexplore.ieee.org

This paper describes the Arabic MGB-3 Challenge-Arabic Speech Recognition in the Wild.
Unlike last year's Arabic MGB-2 Challenge, for which the recognition task was based on …

Cited by 125 · Related articles · All 5 versions

[PDF] arxiv.org

Capitalization and punctuation restoration: a survey

V Păiş, D Tufiş - Artificial Intelligence Review, 2022 - Springer

Ensuring proper punctuation and letter casing is a key pre-processing step towards applying
complex natural language processing algorithms. This is especially significant for textual …

Cited by 34 · Related articles · All 8 versions

[PDF] researchgate.net

Self-attention based model for punctuation prediction using word and speech embeddings

J Yi, J Tao - ICASSP 2019-2019 IEEE International Conference …, 2019 - ieeexplore.ieee.org

This paper proposes to use self-attention based model to predict punctuation marks for word
sequences. The model is trained using word and speech embedding features which are …

Cited by 74 · Related articles · All 2 versions
