Generative adversarial network-based postfilter for STFT spectrograms

KB Bhangale, M Kothandaraman - Wireless Personal Communications, 2022 - Springer

Over the past decades, a particular focus is given to research on machine learning
techniques for speech processing applications. However, in the past few years, research …

Cited by 94 · Related articles · All 5 versions

Generative adversarial networks for speech processing: A review

A Wali, Z Alamgir, S Karim, A Fawaz, MB Ali… - Computer Speech & …, 2022 - Elsevier

Generative adversarial networks (GANs) have seen remarkable progress in recent years.
They are used as generative models for all kinds of data such as text, images, audio, music …

Cited by 63 · Related articles · All 2 versions

StarGAN-VC: Non-parallel many-to-many voice conversion using star generative adversarial networks

H Kameoka, T Kaneko, K Tanaka… - 2018 IEEE Spoken …, 2018 - ieeexplore.ieee.org

This paper proposes a method that allows non-parallel many-to-many voice conversion (VC)
by using a variant of a generative adversarial network (GAN) called StarGAN. Our method …

Cited by 504 · Related articles · All 5 versions

CycleGAN-VC2: Improved CycleGAN-based non-parallel voice conversion

T Kaneko, H Kameoka, K Tanaka… - ICASSP 2019-2019 …, 2019 - ieeexplore.ieee.org

Non-parallel voice conversion (VC) is a technique for learning the mapping from source to
target speech without relying on parallel data. This is an important task, but it has been …

Cited by 350 · Related articles · All 7 versions

CycleGAN-VC: Non-parallel voice conversion using cycle-consistent adversarial networks

T Kaneko, H Kameoka - 2018 26th European Signal …, 2018 - ieeexplore.ieee.org

We propose a non-parallel voice-conversion (VC) method that can learn a mapping from
source to target speech without relying on parallel data. The proposed method is particularly …

Cited by 371 · Related articles · All 8 versions

Parallel-data-free voice conversion using cycle-consistent adversarial networks

T Kaneko, H Kameoka - arXiv preprint arXiv:1711.11293, 2017 - arxiv.org

We propose a parallel-data-free voice-conversion (VC) method that can learn a mapping
from source to target speech without relying on parallel data. The proposed method is …

Cited by 277 · Related articles · All 3 versions

Time-frequency masking-based speech enhancement using generative adversarial network

MH Soni, N Shah, HA Patil - 2018 IEEE international …, 2018 - ieeexplore.ieee.org

The success of time-frequency (TF) mask-based approaches is dependent on the accuracy
of predicted mask given the noisy spectral features. The state-of-the-art methods in TF …

Cited by 258 · Related articles · All 6 versions

StarGAN-VC2: Rethinking conditional methods for StarGAN-based voice conversion

T Kaneko, H Kameoka, K Tanaka, N Hojo - arXiv preprint arXiv …, 2019 - arxiv.org

Non-parallel multi-domain voice conversion (VC) is a technique for learning mappings
among multiple domains without relying on parallel data. This is important but challenging …

Cited by 183 · Related articles · All 7 versions

AV-RIR: Audio-visual room impulse response estimation

A Ratnarajah, S Ghosh, S Kumar… - Proceedings of the …, 2024 - openaccess.thecvf.com

Accurate estimation of Room Impulse Response (RIR) which captures an
environment's acoustic properties is important for speech processing and AR/VR …

Cited by 10 · Related articles · All 4 versions

Sequence-to-Sequence Voice Conversion with Similarity Metric Learned Using Generative Adversarial Networks.

T Kaneko, H Kameoka, K Hiramatsu, K Kashino - Interspeech, 2017 - kecl.ntt.co.jp

We propose a training framework for sequence-to-sequence voice conversion (SVC). A well-
known problem regarding a conventional VC framework is that acoustic-feature sequences …

Cited by 138 · Related articles · All 3 versions
