Hybrid Attention Time-Frequency Analysis Network for Single-Channel Speech Enhancement

Z Zhang, X Liang, R Xu, M Wang - ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2024 - ieeexplore.ieee.org
The time-frequency domain remains central to speech signal analysis. Enhancing the efficacy of neural network-based speech models demands a detailed multi-scale analysis of time-frequency features. This study presents the Hybrid Attention Time-Frequency Analysis Network (HATFANet), an innovative model that uses a dual-branch structure to concurrently estimate the ideal ratio mask and the enhanced complex spectrum. Each branch incorporates Hybrid Attention Blocks (HABs) that capture local, global, and inter-window attention for more effective deep feature extraction, employing reshaping techniques and gated multi-layer perceptrons to focus on different attention scales. The addition of residual channel attention and a window multi-head self-attention mechanism accentuates channel-attention features and intra-window attention. Our experiments verify the pivotal role of these HABs across varied attentional scales. HATFANet achieves state-of-the-art results on the Voice Bank + DEMAND dataset, recording a PESQ of 3.37, STOI of 95.8%, and SSNR of 10.15.
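As background for the mask-estimation branch: the abstract does not give HATFANet's exact mask formulation, but a minimal numpy sketch of the commonly used ideal ratio mask (IRM) definition, sqrt(|S|^2 / (|S|^2 + |N|^2)) over clean and noise spectra, looks like this. The function name and the epsilon stabilizer are illustrative, not from the paper.

```python
import numpy as np

def ideal_ratio_mask(clean_spec, noise_spec, eps=1e-8):
    """Ideal ratio mask from clean- and noise-spectrum magnitudes.

    Uses the standard definition sqrt(|S|^2 / (|S|^2 + |N|^2)),
    which lies in [0, 1] per time-frequency bin; `eps` avoids
    division by zero in silent bins.
    """
    s2 = np.abs(clean_spec) ** 2
    n2 = np.abs(noise_spec) ** 2
    return np.sqrt(s2 / (s2 + n2 + eps))

# Two illustrative bins: one speech-dominated, one noise-dominated.
clean = np.array([1.0, 0.1])
noise = np.array([0.1, 1.0])
mask = ideal_ratio_mask(clean, noise)
# Multiplying the noisy magnitude spectrum by this mask attenuates
# noise-dominated bins while passing speech-dominated ones.
```

A network trained to predict this mask from the noisy spectrum can then enhance unseen audio by elementwise multiplication; the complex-spectrum branch described in the abstract instead recovers phase information that a magnitude-only mask discards.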