Related articles - Academic Resource Search

A speech separation system in video sequence using dilated inception network and U-Net

G Dahy, MAA Refaey, R Alkhoribi, M Shoman - Egyptian Informatics Journal, 2022 - Elsevier

In this paper, an audio-visual model for separating a speech of the target speaker from a
combination of other speakers' speeches is proposed. It can be used in speech separation …

Cited by 3 · Related articles

[PDF] springer.com

Audio visual speech source separation via improved context dependent association model

A Kazemi, R Boostani, F Sobhanmanesh - EURASIP Journal on Advances …, 2014 - Springer

In this paper, we exploit the non-linear relation between a speech source and its associated
lip video as a source of extra information to propose an improved audio-visual speech …

Cited by 7 · Related articles · All 10 versions

[PDF] jst.go.jp

Speaker-Independent Audio-Visual Speech Separation Based on Transformer in Multi-Talker Environments

J Wang, Y Luo, W Yi, X Xie - IEICE TRANSACTIONS on Information …, 2022 - search.ieice.org

Speech separation is the task of extracting target speech while suppressing background
interference components. In applications like video telephones, visual information about the …

Cited by 2 · Related articles · All 5 versions

Time-domain audio-visual speech separation on low quality videos

Y Wu, C Li, J Bai, Z Wu, Y Qian - ICASSP 2022-2022 IEEE …, 2022 - ieeexplore.ieee.org

Incorporating visual information is a promising approach to improve the performance of
speech separation. Many related works have been conducted and provide inspiring results …

Cited by 10 · Related articles

[PDF] arxiv.org

Integration of deep learning with expectation maximization for spatial cue-based speech separation in reverberant conditions

S Gul, MS Khan, SW Shah - Applied Acoustics, 2021 - Elsevier

In this paper, we formulate a blind source separation (BSS) framework, which allows
integrating U-Net based deep learning source separation network with probabilistic spatial …

Cited by 11 · Related articles · All 3 versions

[PDF] researchgate.net

Multi-Stream Gated and Pyramidal Temporal Convolutional Neural Networks for Audio-Visual Speech Separation in Multi-Talker Environments.

Y Luo, J Wang, L Xu, L Yang - Interspeech, 2021 - researchgate.net

Speech separation is the task of extracting target speech from noisy mixture. In applications
like video telephones or video conferencing, lip movements of the target speaker are …

Cited by 7 · Related articles · All 5 versions

[PDF] ieee.org

Speech segregation in background noise based on deep learning

JB Awotunde, RO Ogundokun, FE Ayo… - IEEE Access, 2020 - ieeexplore.ieee.org

The most important way several people communicate is through speech. Speech is used to
convey other information such as speaker communication, emotion, and attitude. Therefore …

Cited by 17 · Related articles · All 4 versions

Multi-layer attention mechanism based speech separation model

M Li, T Lan, C Peng, Y Qian… - 2019 IEEE 19th …, 2019 - ieeexplore.ieee.org

Speech separation is the front-end of speech processing applications. Its purpose is to
separate the speech in a multi-speaker environment. The neural network methods show …

Cited by 4 · Related articles

[PDF] uea.ac.uk

Audio-visual speaker separation

F Khan - 2016 - ueaeprints.uea.ac.uk

Communication using speech is often an audio-visual experience. Listeners hear what is
being uttered by speakers and also see the corresponding facial movements and other …

Cited by 4 · Related articles · All 3 versions

[PDF] sciencedirect.com

Implementation of real-time speech separation model using time-domain audio separation network (TasNet) and dual-path recurrent neural network (DPRNN)

A Wijayakusuma, DR Gozali, A Widjaja… - Procedia Computer …, 2021 - Elsevier

The purpose of this research is to develop a model that is able to perform real-time speaker
independent multi-talker speech separation task in time-domain using Time-Domain Audio …

Cited by 7 · Related articles · All 2 versions
