Authors
Soonil Kwon, Shrikanth S Narayanan
Publication date
2002/9/16
Conference
INTERSPEECH
Pages
2537-2540
Description
Speaker change detection is a key prerequisite for speaker tracking and speaker adaptation. It detects the points where the speaker identity changes in a multi-speaker audio stream. We first extract the speech segments from an audio stream using segmentation and classification techniques. Using the extracted speech segments, the proposed weighted metric-based technique detects the speaker change points. The new weights are derived from Fisher Linear Discriminant Analysis and, when used with Mel cepstrum feature vectors, have the effect of subband processing. Experiments were performed on the HUB-4 Broadcast News Evaluation English Test Material (1999) and a movie audio track. Results showed that our technique gave about a 37.7% improvement over Euclidean distance on the broadcast news data and about 27.1% on the movie data; relative to Mahalanobis distance, the improvements were 37.7% and 25.3% for broadcast news and movie data, respectively.
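The weighted metric-based idea described above can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names (`fisher_weights`, `weighted_distance`, `detect_changes`), the window size, the threshold, and the per-dimension (diagonal) Fisher ratio are all assumptions made for illustration. The abstract does not say how the weights are computed from Fisher Linear Discriminant Analysis, so the sketch simply weights each feature dimension by how well it separates two labeled sets of frames, then slides two adjacent windows over the stream and flags local peaks of the weighted distance.

```python
import numpy as np

def fisher_weights(a, b, eps=1e-12):
    """Hypothetical per-dimension Fisher ratio, normalized to sum to 1.

    a, b: arrays of shape (frames, dims) from two different speakers.
    This is a diagonal simplification, assumed for illustration.
    """
    between = (a.mean(axis=0) - b.mean(axis=0)) ** 2
    within = a.var(axis=0) + b.var(axis=0) + eps
    ratio = between / within
    return ratio / ratio.sum()

def weighted_distance(left, right, weights):
    """Weighted Euclidean distance between the mean vectors of two windows."""
    diff = left.mean(axis=0) - right.mean(axis=0)
    return float(np.sqrt(np.sum(weights * diff ** 2)))

def detect_changes(frames, weights, win=50, threshold=1.0):
    """Return frame indices where the distance peaks above the threshold."""
    n = len(frames)
    scores = np.zeros(n)
    for t in range(win, n - win):
        scores[t] = weighted_distance(frames[t - win:t], frames[t:t + win], weights)
    changes = []
    for t in range(win, n - win):
        lo, hi = max(0, t - win), min(n, t + win + 1)
        # Keep t only if it is the highest score in its neighborhood.
        if scores[t] >= threshold and scores[t] == scores[lo:hi].max():
            changes.append(t)
    return changes, scores

# Synthetic stand-in for cepstral features: two "speakers" that differ in dim 0.
rng = np.random.default_rng(0)
speaker_a = rng.normal(0.0, 1.0, size=(200, 4))
speaker_b = rng.normal(0.0, 1.0, size=(200, 4))
speaker_b[:, 0] += 5.0
stream = np.vstack([speaker_a, speaker_b])

weights = fisher_weights(speaker_a, speaker_b)
changes, scores = detect_changes(stream, weights, win=50, threshold=2.0)
print(changes)
```

In this toy setup the weights concentrate on the one dimension that separates the two synthetic speakers, so noise in the other dimensions contributes little to the distance. With real Mel cepstrum features the weights, window length, and threshold would need to be estimated from data.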
Total citations
[Chart: citations per year, 2003-2022]