A Large-Scale Evaluation of Speech Foundation Models

S Yang, HJ Chang, Z Huang, AT Liu… - … on Audio, Speech …, 2024 - ieeexplore.ieee.org
The foundation model paradigm leverages a shared foundation model to achieve state-of-
the-art (SOTA) performance for various tasks, requiring minimal downstream-specific data …
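
This entry describes the shared-upstream paradigm the benchmark evaluates: one pretrained speech model serves many tasks, with only small task-specific heads trained on top. Below is a minimal sketch of that setup in PyTorch; the module names (`FrozenUpstream`, `WeightedSumHead`) and the learnable layer-weighting are illustrative assumptions, not the paper's exact evaluation protocol.

```python
# Minimal sketch of the shared-foundation-model paradigm: one frozen upstream
# encoder feeds several lightweight, task-specific heads (only the heads train).
import torch
import torch.nn as nn


class FrozenUpstream(nn.Module):
    """Stand-in for a pretrained speech encoder returning per-layer features."""

    def __init__(self, num_layers: int = 12, dim: int = 768):
        super().__init__()
        self.layers = nn.ModuleList([nn.Linear(dim, dim) for _ in range(num_layers)])
        for p in self.parameters():            # frozen: no gradients for the upstream
            p.requires_grad_(False)

    def forward(self, feats: torch.Tensor) -> list[torch.Tensor]:
        outs, h = [], feats
        for layer in self.layers:
            h = torch.tanh(layer(h))
            outs.append(h)
        return outs                            # hidden states from every layer


class WeightedSumHead(nn.Module):
    """Lightweight downstream head: learnable layer weights + linear classifier."""

    def __init__(self, num_layers: int, dim: int, num_classes: int):
        super().__init__()
        self.layer_weights = nn.Parameter(torch.zeros(num_layers))
        self.classifier = nn.Linear(dim, num_classes)

    def forward(self, layer_feats: list[torch.Tensor]) -> torch.Tensor:
        w = torch.softmax(self.layer_weights, dim=0)
        mixed = sum(wi * f for wi, f in zip(w, layer_feats))
        return self.classifier(mixed.mean(dim=1))   # pool over time, then classify


upstream = FrozenUpstream()
heads = {"keyword_spotting": WeightedSumHead(12, 768, 10),
         "speaker_id": WeightedSumHead(12, 768, 50)}
frames = torch.randn(2, 100, 768)                   # (batch, time, feature) stand-in input
for task, head in heads.items():
    print(task, head(upstream(frames)).shape)       # one shared upstream, many heads
```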

M-BEST-RQ: A Multi-Channel Speech Foundation Model for Smart Glasses

Y Yang, D Raj, J Lin, N Moritz, J Jia, G Keren… - arXiv preprint arXiv …, 2024 - arxiv.org
The growing popularity of multi-channel wearable devices, such as smart glasses, has led to
a surge of applications such as targeted speech recognition and enhanced hearing …

SpatialEmb: Extract and Encode Spatial Information for 1-Stage Multi-Channel Multi-Speaker ASR on Arbitrary Microphone Arrays

Y Shao, Y Xu, S Khudanpur… - 2024 IEEE Spoken …, 2024 - ieeexplore.ieee.org
Spatial information is a critical clue for multi-channel multi-speaker target speech recognition.
Most state-of-the-art multi-channel Automatic Speech Recognition (ASR) systems extract …
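
The snippet breaks off where it describes how such systems extract spatial cues. One common choice is inter-channel phase difference (IPD) features computed from per-channel STFTs; the sketch below illustrates that general idea only and is not SpatialEmb's specific encoding (the reference-channel pairing and STFT parameters are assumptions).

```python
# Illustrative sketch: inter-channel phase difference (IPD) features, a common
# way to expose spatial cues to a multi-channel ASR model. General idea only,
# not the SpatialEmb encoding from the paper.
import torch


def ipd_features(multichannel_wave: torch.Tensor,
                 n_fft: int = 512, hop: int = 160,
                 ref_channel: int = 0) -> torch.Tensor:
    """multichannel_wave: (channels, samples) -> IPD: (channels-1, freq, frames)."""
    window = torch.hann_window(n_fft)
    # Per-channel complex STFT: (channels, freq, frames)
    spec = torch.stft(multichannel_wave, n_fft=n_fft, hop_length=hop,
                      window=window, return_complex=True)
    phase = torch.angle(spec)
    ref = phase[ref_channel]
    others = torch.cat([phase[:ref_channel], phase[ref_channel + 1:]], dim=0)
    # Phase difference of every channel against the reference, wrapped to [-pi, pi]
    ipd = others - ref.unsqueeze(0)
    return torch.atan2(torch.sin(ipd), torch.cos(ipd))


# Example: a 4-channel, 1-second recording at 16 kHz (random data as a stand-in)
wave = torch.randn(4, 16000)
print(ipd_features(wave).shape)   # -> torch.Size([3, 257, 101])
```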

Multi-Channel Multi-Speaker ASR Using Target Speaker's Solo Segment

Y Shao, SX Zhang, Y Xu, M Yu, D Yu, D Povey… - arXiv preprint arXiv …, 2024 - arxiv.org
In the field of multi-channel, multi-speaker Automatic Speech Recognition (ASR), the task of
discerning and accurately transcribing a target speaker's speech within background noise …
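
The title indicates that the target speaker is characterized from a segment where they speak alone. A common way to use such a segment is to map it to a speaker embedding that conditions the ASR encoder; the sketch below shows that generic conditioning pattern in PyTorch and is not this paper's architecture (the single-channel front-end, module names, and additive fusion are assumptions).

```python
# Generic sketch of target-speaker conditioning: an embedding computed from the
# target speaker's solo segment biases the ASR encoder toward that speaker.
# Illustrates the conditioning pattern only, not this paper's model.
import torch
import torch.nn as nn


class SoloSegmentEncoder(nn.Module):
    """Maps the target speaker's solo segment to a fixed-size speaker embedding."""

    def __init__(self, feat_dim: int = 80, emb_dim: int = 256):
        super().__init__()
        self.proj = nn.Linear(feat_dim, emb_dim)

    def forward(self, solo_feats: torch.Tensor) -> torch.Tensor:
        # (batch, frames, feat_dim) -> mean-pooled embedding (batch, emb_dim)
        return self.proj(solo_feats).mean(dim=1)


class TargetSpeakerASREncoder(nn.Module):
    """Mixture encoder whose hidden states are biased by the speaker embedding."""

    def __init__(self, feat_dim: int = 80, hidden: int = 256, vocab: int = 500):
        super().__init__()
        self.frontend = nn.Linear(feat_dim, hidden)
        self.bias = nn.Linear(256, hidden)       # projects the speaker embedding
        self.encoder = nn.GRU(hidden, hidden, batch_first=True)
        self.output = nn.Linear(hidden, vocab)   # e.g. frame-level logits over vocab

    def forward(self, mixture_feats: torch.Tensor, spk_emb: torch.Tensor) -> torch.Tensor:
        h = self.frontend(mixture_feats) + self.bias(spk_emb).unsqueeze(1)
        h, _ = self.encoder(h)
        return self.output(h)                    # (batch, frames, vocab)


solo = torch.randn(1, 120, 80)        # solo-segment features of the target speaker
mixture = torch.randn(1, 400, 80)     # overlapped multi-speaker mixture features
spk_emb = SoloSegmentEncoder()(solo)
logits = TargetSpeakerASREncoder()(mixture, spk_emb)
print(logits.shape)                   # -> torch.Size([1, 400, 500])
```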