UBM-based real-time speaker segmentation for broadcasting news

Author

Wu, TingYao ; Lu, Lie ; Chen, Ke ; Zhang, Hong-Jiang

Author_Institution

Peking Univ., Beijing, China

Volume

2

fYear

2003

fDate

6-10 April 2003

Abstract

This paper addresses real-time speaker change detection in broadcast news, assuming no prior knowledge of the speakers. The proposed speaker segmentation is a "coarse to refine" process with two stages: pre-segmentation and refinement. In the pre-segmentation stage, a new approach based on the Gaussian mixture model-universal background model (GMM-UBM) categorizes feature vectors into three sets (a reliable speaker-related set, a doubtful speaker-related set, and an unreliable speaker-related set) in order to strengthen the influence of the reliable speaker-related feature vectors. Potential speaker change boundaries are then detected using a novel distance measure. In the refinement stage, incremental speaker adaptation (ISA), which suits the real-time requirement, is proposed to obtain reasonably precise speaker models, so that the potential speaker change boundaries can be confirmed and refined. Experimental results demonstrate that the approach yields satisfactory performance.
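
The abstract does not state the rule used to sort feature vectors into the three sets, so the following is only an illustrative sketch under stated assumptions, not the authors' method. It assumes a diagonal-covariance GMM acting as the UBM and assigns each frame by the posterior of its best-matching mixture component. The function names and the thresholds `hi` and `lo` are hypothetical.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def component_posteriors(x, weights, means, variances):
    """Posterior probability of each UBM component given frame x."""
    logs = [
        math.log(w) + log_gauss_diag(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    top = max(logs)  # subtract the max for numerical stability
    exps = [math.exp(value - top) for value in logs]
    total = sum(exps)
    return [e / total for e in exps]

def categorize(frames, weights, means, variances, hi=0.9, lo=0.6):
    """Split frame indices into three sets by peak component posterior.

    ASSUMPTION: the thresholds hi/lo and the peak-posterior criterion
    are placeholders chosen for illustration only.
    """
    sets = {"reliable": [], "doubtful": [], "unreliable": []}
    for index, frame in enumerate(frames):
        peak = max(component_posteriors(frame, weights, means, variances))
        if peak >= hi:
            sets["reliable"].append(index)
        elif peak >= lo:
            sets["doubtful"].append(index)
        else:
            sets["unreliable"].append(index)
    return sets

# Toy one-dimensional UBM with two equally weighted components.
toy_weights = [0.5, 0.5]
toy_means = [[0.0], [4.0]]
toy_variances = [[1.0], [1.0]]
toy_frames = [[0.0], [2.0], [1.5], [4.0]]
toy_sets = categorize(toy_frames, toy_weights, toy_means, toy_variances)
```

In this toy setup, frames that sit on a component mean land in the reliable set, the frame midway between the two means lands in the unreliable set, and the frame at 1.5 falls in between. A real system would use multi-dimensional acoustic features and a UBM trained on speech from many speakers.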

Keywords

Gaussian distribution; broadcasting; feature extraction; speaker recognition; GMM-UBM; Gaussian mixture model-universal background model; broadcast news; change boundary detection; coarse to refine process; distance measure; doubtful speaker-related set; feature vectors; incremental speaker adaptation; performance; pre-segmentation; real-time speaker segmentation; refinement; reliable speaker-related set; speaker change detection; unreliable speaker-related set; Asia; Broadcasting; Costs; Indexing; Instruction sets; Iterative algorithms; Iterative methods; Real time systems; Speech; Streaming media

fLanguage

English

Publisher

IEEE

Conference_Titel

2003 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '03), Proceedings

ISSN

1520-6149

Print_ISBN

0-7803-7663-3

Type

conf

DOI

10.1109/ICASSP.2003.1202327

Filename

1202327