Automatic speech recognition using advanced deep learning approaches: A survey

H Kheddar, M Hemis, Y Himeur - Information Fusion, 2024 - Elsevier
Recent advancements in deep learning (DL) have posed a significant challenge for
automatic speech recognition (ASR). ASR relies on extensive training datasets, including …

Decoupling-style monaural speech enhancement with a triple-branch cross-domain fusion network

W Chen, R Yu, Z Ye - Applied Acoustics, 2024 - Elsevier
Monaural speech enhancement aims to remove background noise from noisy speech
signals captured by a single microphone. In recent years, several cross-domain monaural …

Edge-guided two-stage feature matching for infrared and visible image registration in electric power scenes

C Xu, Q Li, Y Shen, C Chang, Y Zhou - Infrared Physics & Technology, 2024 - Elsevier
Collaborative processing of infrared and visible (IR–VS) images is essential in identifying
potential equipment failures during power inspections. However, due to significant …

A GPU-accelerated real-time human voice separation framework for mobile phones

G Chen, Y Zheng, Z Zhou, S He, W Yi - Journal of Systems Architecture, 2023 - Elsevier
Mobile speech communication can experience significant degradation in quality when users
are in a noisy acoustic environment. With the rapid development of artificial intelligence in …

IIFC-Net: A Monaural Speech Enhancement Network With High-Order Information Interaction and Feature Calibration

W Wei, Y Hu, H Huang, L He - IEEE Signal Processing Letters, 2023 - ieeexplore.ieee.org
Recently, many Transformer-style dual-path models have achieved impressive performance
for speech enhancement. However, their high parameters and computational complexity …

Unrestricted Global Phase Bias-Aware Single-Channel Speech Enhancement with Conformer-Based Metric GAN

S Zhang, Z Qiu, D Takeuchi, N Harada… - ICASSP 2024-2024 …, 2024 - ieeexplore.ieee.org
With the rapid development of neural networks in recent years, the ability of various
networks to enhance the magnitude spectrum of noisy speech in the single-channel speech …

A Lightweight Music Source Separation Model with Graph Convolution Network

M Zhu, L Wang, Y Hu - National Conference on Man-Machine Speech …, 2023 - Springer
With the rapid advancement of deep neural networks, there has been a significant
improvement in the performance of music source separation methods. However, most of …

SOUND EVENT LOCALIZATION AND DETECTION BASED ON OMNI-DIMENSIONAL DYNAMIC CONVOLUTION AND FEATURE PYRAMID ATTENTION …

M Ma, Y Hu, M Wang, W Fang, J Liu, Z Niu, X Fan - dcase.community
In this report, we present our method for Detection and Classification of Acoustic Scenes and
Events (DCASE) 2023 challenge task3: Sound Event Localization and Detection Evaluated …