Waveform based speech coding using nonlinear predictive techniques: a systematic review

GK Sheferaw, W Mwangi, M Kimwele… - International Journal of …, 2023 - Springer
Speech coding is a technique that compresses speech signals into a smaller digital form,
making it easier to transmit or store, while still maintaining the quality and intelligibility of the …

Latent-domain predictive neural speech coding

X Jiang, X Peng, H Xue, Y Zhang… - IEEE/ACM Transactions …, 2023 - ieeexplore.ieee.org
Neural audio/speech coding has recently demonstrated its capability to deliver high quality
at much lower bitrates than traditional methods. However, existing neural audio/speech …

Sub-8-bit quantization for on-device speech recognition: A regularization-free approach

K Zhen, M Radfar, H Nguyen, GP Strimel… - 2022 IEEE Spoken …, 2023 - ieeexplore.ieee.org
For on-device automatic speech recognition (ASR), quantization aware training (QAT) is
ubiquitous to achieve the trade-off between model predictive performance and efficiency …

Disentangled feature learning for real-time neural speech coding

X Jiang, X Peng, Y Zhang, Y Lu - ICASSP 2023-2023 IEEE …, 2023 - ieeexplore.ieee.org
Recently end-to-end neural audio/speech coding has shown its great potential to outperform
traditional signal analysis based audio codecs. This is mostly achieved by following the VQ …

Neural feature predictor and discriminative residual coding for low-bitrate speech coding

H Yang, W Lim, M Kim - ICASSP 2023-2023 IEEE International …, 2023 - ieeexplore.ieee.org
Low and ultra-low-bitrate neural speech codecs achieved unprecedented coding gain by
generating speech signals from compact features. This paper introduces additional coding …

End-to-end neural audio coding in the MDCT domain

H Lim, J Lee, BH Kim, I Jang… - ICASSP 2023-2023 IEEE …, 2023 - ieeexplore.ieee.org
Modern deep neural network (DNN)-based audio coding approaches utilize complicated
non-linear functions (e.g., convolutional neural networks and non-linear activations), which …

Gull: A Generative Multifunctional Audio Codec

Y Luo, J Yu, H Chen, R Gu, C Weng - arXiv preprint arXiv:2404.04947, 2024 - arxiv.org
We introduce Gull, a generative multifunctional audio codec. Gull is a general purpose
neural audio compression and decompression model which can be applied to a wide range …

Alias-and-separate: Wideband speech coding using sub-Nyquist sampling and speech separation

S Hwang, E Lee, I Jang, JW Shin - IEEE Signal Processing …, 2022 - ieeexplore.ieee.org
Decimation of a discrete-time signal below the Nyquist rate without applying an appropriate
lowpass filter results in a distortion called aliasing. If wideband speech sampled at 16 kHz is …

Simple and Efficient Quantization Techniques for Neural Speech Coding

A Brendel, N Pia, K Gupta, G Fuchs… - arXiv preprint arXiv …, 2024 - arxiv.org
Neural audio coding has emerged as a vivid research direction by promising good audio
quality at very low bitrates unachievable by classical coding techniques. Here, end-to-end …

Highly efficient audio coding with blind spectral recovery based on machine learning

JW Kim, SK Beack, W Lim… - IEEE Signal Processing …, 2022 - ieeexplore.ieee.org
This letter proposes a new method for audio coding that utilizes blind spectral recovery to
improve the coding efficiency without compromising performance. The proposed method …