Phasen: A phase-and-harmonics-aware speech enhancement network

D Yin, C Luo, Z Xiong, W Zeng - Proceedings of the AAAI Conference on …, 2020 - aaai.org
Time-frequency (TF) domain masking is a mainstream approach for single-channel speech
enhancement. Recently, focuses have been put to phase prediction in addition to amplitude …

Advances in phase-aware signal processing in speech communication

P Mowlaee, R Saeidi, Y Stylianou - Speech communication, 2016 - Elsevier
During the past three decades, the issue of processing spectral phase has been largely
neglected in speech applications. There is no doubt that the interest of speech processing …

PhaseNet: Discretized Phase Modeling with Deep Neural Networks for Audio Source Separation.

N Takahashi, P Agrawal, N Goswami, Y Mitsufuji - Interspeech, 2018 - isca-archive.org
Previous research on audio source separation based on deep neural networks (DNNs)
mainly focuses on estimating the magnitude spectrum of target sources and typically, phase …

Phase-aware deep speech enhancement: It's all about the frame length

T Peer, T Gerkmann - JASA Express Letters, 2022 - pubs.aip.org
Algorithmic latency in speech processing is dominated by the frame length used for Fourier
analysis, which in turn limits the achievable performance of magnitude-centric approaches …

Compact deep neural networks for real-time speech enhancement on resource-limited devices

FE Wahab, Z Ye, N Saleem, R Ullah - Speech Communication, 2024 - Elsevier
In real-time applications, the aim of speech enhancement (SE) is to achieve optimal
performance while ensuring computational efficiency and near-instant outputs. Many deep …

Deep Griffin–Lim iteration: Trainable iterative phase reconstruction using neural network

Y Masuyama, K Yatabe, Y Koizumi… - IEEE Journal of …, 2020 - ieeexplore.ieee.org
In this paper, we propose a phase reconstruction framework, named Deep Griffin-Lim
Iteration (DeGLI). Phase reconstruction is a fundamental technique for improving the quality …

An appraisal on speech and emotion recognition technologies based on machine learning

CA Jason, S Kumar - language, 2020 - academia.edu
In earlier days, people used speech as a means of communication or the way a listener is
conveyed by voice or expression. But the idea of machine learning and various methods are …

Single-channel speech enhancement with phase reconstruction based on phase distortion averaging

Y Wakabayashi, T Fukumori… - … on Audio, Speech …, 2018 - ieeexplore.ieee.org
Speech enhancement has been widely investigated for several decades, but by modifying
only the amplitude spectrum of a speech signal, ignoring the phase spectrum, which has …

Online phase reconstruction via DNN-based phase differences estimation

Y Masuyama, K Yatabe, K Nagatomo… - … /ACM Transactions on …, 2022 - ieeexplore.ieee.org
This paper presents a two-stage online phase reconstruction framework using causal deep
neural networks (DNNs). Phase reconstruction is a task of recovering phase of the short-time …

On DCT-based MMSE estimation of short time spectral amplitude for single-channel speech enhancement

S Shi, K Paliwal, A Busch - Applied Acoustics, 2023 - Elsevier
This paper proposes Discrete Cosine Transform (DCT) based speech
enhancement algorithms. These algorithms utilize minimum mean square error (MMSE) …