A fast Griffin-Lim algorithm

[HTML] Brain-computer interface: applications to speech decoding and synthesis to augment communication

S Luo, Q Rabbani, NE Crone - Neurotherapeutics, 2022 - Elsevier

Damage or degeneration of motor pathways necessary for speech and other movements, as
in brainstem strokes or amyotrophic lateral sclerosis (ALS), can interfere with efficient …

Cited by 74 Related articles All 12 versions

[PDF] arxiv.org

ESPnet-TTS: Unified, reproducible, and integratable open source end-to-end text-to-speech toolkit

T Hayashi, R Yamamoto, K Inoue… - ICASSP 2020-2020 …, 2020 - ieeexplore.ieee.org

This paper introduces a new end-to-end text-to-speech (E2E-TTS) toolkit named ESPnet-
TTS, which is an extension of the open-source speech processing toolkit ESPnet. The toolkit …

Cited by 246 Related articles All 7 versions

[PDF] audiolabs-erlangen.de

[BOOK][B] Fundamentals of music processing: Audio, analysis, algorithms, applications

M Müller - 2015 - Springer

This textbook provides both profound technological knowledge and a comprehensive
treatment of essential topics in music processing and music information retrieval. Including …

Cited by 763 Related articles All 9 versions

[PDF] arxiv.org

Asteroid: the PyTorch-based audio source separation toolkit for researchers

M Pariente, S Cornell, J Cosentino… - arXiv preprint arXiv …, 2020 - arxiv.org

This paper describes Asteroid, the PyTorch-based audio source separation toolkit for
researchers. Inspired by the most successful neural source separation systems, it provides …

Cited by 171 Related articles All 11 versions

[HTML] mdpi.com

[HTML] Video and audio deepfake datasets and open issues in deepfake technology: being ahead of the curve

Z Akhtar, TL Pendyala, VS Athmakuri - Forensic Sciences, 2024 - mdpi.com

The revolutionary breakthroughs in Machine Learning (ML) and Artificial Intelligence (AI) are
extensively being harnessed across a diverse range of domains, e.g., forensic science …

Cited by 5 Related articles

[PDF] arxiv.org

Fast spectrogram inversion using multi-head convolutional neural networks

SÖ Arık, H Jun, G Diamos - IEEE Signal Processing Letters, 2018 - ieeexplore.ieee.org

We propose the multi-head convolutional neural network (MCNN) for waveform synthesis
from spectrograms. Nonlinear interpolation in MCNN is employed with transposed …

Cited by 143 Related articles All 4 versions

[PDF] arxiv.org

CFAD: A Chinese dataset for fake audio detection

H Ma, J Yi, C Wang, X Yan, J Tao, T Wang… - Speech …, 2024 - Elsevier

Fake audio detection is a growing concern and some relevant datasets have been designed
for research. However, there is no standard public Chinese dataset under complex …

Cited by 37 Related articles All 3 versions

Generation and detection of manipulated multimodal audiovisual content: Advances, trends and open challenges

H Liz-Lopez, M Keita, A Taleb-Ahmed, A Hadid… - Information …, 2024 - Elsevier

Generative deep learning techniques have invaded the public discourse recently. Despite
the advantages, the applications to disinformation are concerning as the counter-measures …

Cited by 21 Related articles All 4 versions

[PDF] researchgate.net

A context encoder for audio inpainting

A Marafioti, N Perraudin, N Holighaus… - … /ACM Transactions on …, 2019 - ieeexplore.ieee.org

In this article, we study the ability of deep neural networks (DNNs) to restore missing audio
content based on its context, i.e., inpaint audio gaps. We focus on a condition which has not …

Cited by 86 Related articles All 7 versions

[PDF] usenix.org

KENKU: Towards Efficient and Stealthy Black-box Adversarial Attacks against ASR Systems

X Wu, S Ma, C Shen, C Lin, Q Wang, Q Li… - 32nd USENIX Security …, 2023 - usenix.org

Prior researchers show that existing automatic speech recognition (ASR) systems are
vulnerable to adversarial examples. Most existing adversarial attacks against ASR systems …

Cited by 9 Related articles All 4 versions
