Multi-scale group transformer for long sequence modeling in speech separation

Y Zhao, C Luo, ZJ Zha, W Zeng - Proceedings of the Twenty-Ninth …, 2021 - ijcai.org
… inapplicable to speech applications. To tackle this issue, we propose a novel variation of
Transformer, named multi-scale group Transformer (MSGT). The key ideas are group self-…

TransMask: A compact and fast speech separation model based on transformer

Z Zhang, B He, Z Zhang - … Conference on Acoustics, Speech …, 2021 - ieeexplore.ieee.org
… high separation quality, we propose a new transformer-based speech separation approach,
… The overall architecture of TransMask: Transformer is run over a group of RNNs in order to …

Exploring self-attention mechanisms for speech separation

C Subakan, M Ravanelli, S Cornell… - … on Audio, Speech …, 2023 - ieeexplore.ieee.org
… Abstract—Transformers have enabled impressive improvements in … in speech separation
with the WSJ0-2/3 Mix datasets. This paper studies in-depth Transformers for speech separation

Resource-efficient separation transformer

C Subakan, M Ravanelli, S Cornell, F Lepoutre… - arXiv preprint arXiv …, 2022 - arxiv.org
… This paper explores Transformer-based speech separation with … a novel small-footprint speech
separation model built upon the … group transformer for long sequence modeling in speech

Mossformer: Pushing the performance limit of monaural speech separation using gated single-head transformer with convolution-augmented joint self-attentions

S Zhao, B Ma - … International Conference on Acoustics, Speech …, 2023 - ieeexplore.ieee.org
… by building on standard Transformer with multi-head self-… Transformer models to learn local
feature patterns. In this work, we propose a novel Monaural speech separation TransFormer (…

Continuous speech separation with conformer

S Chen, Y Wu, Z Chen, J Wu, J Li… - … Acoustics, Speech …, 2021 - ieeexplore.ieee.org
Transformer based speech separation architecture was proposed in [12], achieving the state
of the art separation … It was also reported in [15] that incorporating Transformer into an end-to…

Ultra fast speech separation model with teacher student learning

S Chen, Y Wu, Z Chen, J Wu, T Yoshioka, S Liu… - arXiv preprint arXiv …, 2022 - arxiv.org
… In this paper, an ultra fast speech separation Transformer … speech separation on LibriCSS
dataset. Utilizing more unlabeled … to recover the clean speech, where a group of masks M(t, f)=[…

Semantic Grouping Network for Audio Source Separation

S Mo, Y Tian - arXiv preprint arXiv:2407.03736, 2024 - arxiv.org
… Then we adopt 6 self-attention transformer layers to extract … In addition, they leveraged
multiple grouping stages during … speech separation,” arXiv preprint arXiv:1804.03619, 2018. 1…

Mossformer2: Combining transformer and rnn-free recurrent network for enhanced time-domain monaural speech separation

S Zhao, Y Ma, C Ni, C Zhang, H Wang… - … Acoustics, Speech …, 2024 - ieeexplore.ieee.org
Our previously proposed MossFormer has achieved promising performance in monaural
speech separation. However, it predominantly adopts a self-attention-based MossFormer …

Resource-Efficient Separation Transformer

L Della Libera, C Subakan, M Ravanelli… - … Acoustics, Speech …, 2024 - ieeexplore.ieee.org
… speech separation … Transformer-based speech separation with a reduced computational
cost. Our main contribution is the development of the Resource-Efficient Separation Transformer