High-quality speech coding with sample RNN

YC Wu, ID Gebru, D Marković… - ICASSP 2023-2023 …, 2023 - ieeexplore.ieee.org

A good audio codec for live applications such as telecommunication is characterized by
three key properties: (1) compression, i.e. the bitrate that is required to transmit the signal …

Cited by 33 | Related articles | All 3 versions

[PDF] arxiv.org

ViSQOL v3: An open source production ready objective speech and audio metric

M Chinen, FSC Lim, J Skoglund… - … on quality of …, 2020 - ieeexplore.ieee.org

Estimation of perceptual quality in audio and speech is possible using a variety of methods.
The combined v3 release of ViSQOL and ViSQOLAudio (for speech and audio …

Cited by 114 | Related articles | All 9 versions

Speech coding techniques and challenges: A comprehensive literature survey

M Anees - Multimedia Tools and Applications, 2024 - Springer

Speech coding is the process of compressing speech signals for transmission and storage
in communication systems. In recent years, speech coding has become increasingly …

Cited by 4 | Related articles

[PDF] arxiv.org

Generative speech coding with predictive variance regularization

WB Kleijn, A Storus, M Chinen, T Denton… - ICASSP 2021-2021 …, 2021 - ieeexplore.ieee.org

The recent emergence of machine-learning based generative models for speech suggests a
significant reduction in bit rate for speech codecs is possible. However, the performance of …

Cited by 64 | Related articles | All 2 versions

[PDF] springer.com

Review of methods for coding of speech signals

D O'Shaughnessy - EURASIP Journal on Audio, Speech, and Music …, 2023 - Springer

Speech is the most common form of human communication, and many conversations use
digital communication links. For efficient transmission, acoustic speech waveforms are …

Cited by 7 | Related articles | All 7 versions

[PDF] arxiv.org

A real-time wideband neural vocoder at 1.6 kb/s using LPCNet

JM Valin, J Skoglund - arXiv preprint arXiv:1903.12087, 2019 - arxiv.org

Neural speech synthesis algorithms are a promising new approach for coding speech at
very low bitrate. They have so far demonstrated quality that far exceeds traditional vocoders …

Cited by 89 | Related articles | All 14 versions

[PDF] arxiv.org

End-to-end neural speech coding for real-time communications

X Jiang, X Peng, C Zheng, H Xue… - ICASSP 2022-2022 …, 2022 - ieeexplore.ieee.org

Deep-learning based methods have shown their advantages in audio coding over traditional
ones but limited attention has been paid on real-time communications (RTC). This paper …

Cited by 21 | Related articles | All 4 versions

[PDF] arxiv.org

Boomerang: Local sampling on image manifolds using diffusion models

L Luzi, A Siahkoohi, PM Mayer… - arXiv preprint arXiv …, 2022 - arxiv.org

Diffusion models can be viewed as mapping points in a high-dimensional latent space onto
a low-dimensional learned manifold, typically an image manifold. The intermediate values …

Cited by 13 | Related articles | All 4 versions

[PDF] arxiv.org

Improving Opus low bit rate quality with neural speech synthesis

J Skoglund, JM Valin - arXiv preprint arXiv:1905.04628, 2019 - arxiv.org

The voice mode of the Opus audio coder can compress wideband speech at bit rates
ranging from 6 kb/s to 40 kb/s. However, Opus is at its core a waveform matching coder, and …

Cited by 43 | Related articles | All 12 versions

[PDF] arxiv.org

APCodec: A Neural Audio Codec with Parallel Amplitude and Phase Spectrum Encoding and Decoding

Y Ai, XH Jiang, YX Lu, HP Du, ZH Ling - arXiv preprint arXiv:2402.10533, 2024 - arxiv.org

This paper introduces a novel neural audio codec targeting high waveform sampling rates
and low bitrates named APCodec, which seamlessly integrates the strengths of parametric …

Cited by 3 | Related articles | All 2 versions
