Authors
Pejman Mowlaee, Rahim Saeidi, Mads Græsbøll Christensen, Zheng-Hua Tan, Tomi Kinnunen, Pasi Franti, Søren Holdt Jensen
Publication date
2012/7/13
Journal
IEEE Transactions on Audio, Speech, and Language Processing
Volume
20
Issue
9
Pages
2586-2601
Publisher
IEEE
Description
In this paper, we present a novel system for joint speaker identification and speech separation. For speaker identification a single-channel speaker identification algorithm is proposed which provides an estimate of signal-to-signal ratio (SSR) as a by-product. For speech separation, we propose a sinusoidal model-based algorithm. The speech separation algorithm consists of a double-talk/single-talk detector followed by a minimum mean square error estimator of sinusoidal parameters for finding optimal codevectors from pre-trained speaker codebooks. In evaluating the proposed system, we start from a situation where we have prior information of codebook indices, speaker identities and SSR-level, and then, by relaxing these assumptions one by one, we demonstrate the efficiency of the proposed fully blind system. In contrast to previous studies that mostly focus on automatic speech recognition (ASR) accuracy …
Total citations
[Bar chart: citations per year, 2012 to 2024]
Scholar articles
P Mowlaee, R Saeidi, MG Christensen, ZH Tan… - IEEE Transactions on Audio, Speech, and Language …, 2012