Related articles - Academic resource search

SpeechBrain: A general-purpose speech toolkit

M Ravanelli, T Parcollet, P Plantinga, A Rouhe… - arXiv preprint arXiv …, 2021 - arxiv.org

SpeechBrain is an open-source and all-in-one speech toolkit. It is designed to facilitate the
research and development of neural speech processing technologies by being simple …

Cited by 573 · Related articles · All 5 versions

[PDF] arxiv.org

Deep Speech: Scaling up end-to-end speech recognition

A Hannun, C Case, J Casper, B Catanzaro… - arXiv preprint arXiv …, 2014 - arxiv.org

We present a state-of-the-art speech recognition system developed using end-to-end deep
learning. Our architecture is significantly simpler than traditional speech systems, which rely …

Cited by 2606 · Related articles · All 13 versions

[PDF] sciencedirect.com

A review of deep learning techniques for speech processing

A Mehrish, N Majumder, R Bharadwaj, R Mihalcea… - Information …, 2023 - Elsevier

The field of speech processing has undergone a transformative shift with the advent of deep
learning. The use of multiple processing layers has enabled the creation of models capable …

Cited by 98 · Related articles · All 6 versions

[PDF] arxiv.org

Wav2letter++: A fast open-source speech recognition system

V Pratap, A Hannun, Q Xu, J Cai, J Kahn… - ICASSP 2019-2019 …, 2019 - ieeexplore.ieee.org

This paper introduces wav2letter++, a fast open-source deep learning speech recognition
framework. wav2letter++ is written entirely in C++, and uses the ArrayFire tensor library for …

Cited by 226 · Related articles · All 8 versions

[PDF] arxiv.org

Fully convolutional speech recognition

N Zeghidour, Q Xu, V Liptchinsky, N Usunier… - arXiv preprint arXiv …, 2018 - arxiv.org

Current state-of-the-art speech recognition systems build on recurrent neural networks for
acoustic and/or language modeling, and rely on feature extraction pipelines to extract mel …

Cited by 112 · Related articles · All 4 versions

[PDF] mlr.press

Deep Voice: Real-time neural text-to-speech

SÖ Arık, M Chrzanowski, A Coates… - International …, 2017 - proceedings.mlr.press

We present Deep Voice, a production-quality text-to-speech system constructed
entirely from deep neural networks. Deep Voice lays the groundwork for truly end-to-end …

Cited by 797 · Related articles · All 5 versions

[PDF] arxiv.org

ESPnet: End-to-end speech processing toolkit

S Watanabe, T Hori, S Karita, T Hayashi… - arXiv preprint arXiv …, 2018 - arxiv.org

This paper introduces a new open source platform for end-to-end speech processing named
ESPnet. ESPnet mainly focuses on end-to-end automatic speech recognition (ASR), and …

Cited by 1539 · Related articles · All 15 versions

[PDF] arxiv.org

SpecAugment: A simple data augmentation method for automatic speech recognition

DS Park, W Chan, Y Zhang, CC Chiu, B Zoph… - arXiv preprint arXiv …, 2019 - arxiv.org

We present SpecAugment, a simple data augmentation method for speech recognition.
SpecAugment is applied directly to the feature inputs of a neural network (i.e., filter bank …

Cited by 3869 · Related articles · All 8 versions

[PDF] psu.edu

[PDF] Exploring convolutional neural network structures and optimization techniques for speech recognition.

O Abdel-Hamid, L Deng, D Yu - Interspeech, 2013 - Citeseer

Recently, convolutional neural networks (CNNs) have been shown to outperform the
standard fully connected deep neural networks within the hybrid deep neural …

Cited by 503 · Related articles · All 7 versions

[PDF] arxiv.org

FunCodec: A fundamental, reproducible and integrable open-source toolkit for neural speech codec

Z Du, S Zhang, K Hu, S Zheng - ICASSP 2024-2024 IEEE …, 2024 - ieeexplore.ieee.org

This paper presents FunCodec, a fundamental neural speech codec toolkit, which is an
extension of the open-source speech processing toolkit FunASR. FunCodec provides …

Cited by 16 · Related articles · All 3 versions
