Regional Information Center for Science and Technology (RICEST) - Automatic Speaker Clustering Using a Voice Characteristic Reference Space and Maximum Purity Estimation

DocumentCode :

780194

Title :

Automatic Speaker Clustering Using a Voice Characteristic Reference Space and Maximum Purity Estimation

Author :

Tsai, Wei-Ho ; Cheng, Shih-Sian ; Wang, Hsin-Min

Author_Institution :

Department of Electronic Engineering, National Taipei University of Technology

Volume :

Issue :

fYear :

2007

fDate :

5/1/2007

Firstpage :

1461

Lastpage :

1474

Abstract :

This paper investigates the problem of automatically grouping unknown speech utterances by their associated speakers. Determining which utterances should be grouped together requires measuring the voice similarities between utterances. Because most existing methods measure inter-utterance similarity directly on spectrum-based features, the resulting clusters may correspond to various acoustic classes rather than to speakers. This study remedies this shortcoming by projecting utterances onto a reference space trained to cover the generic voice characteristics underlying the whole utterance collection. The resulting projection vectors reflect the voice-similarity relationships among all the utterances and are therefore more robust against interference from nonspeaker factors. A clustering method based on maximum purity estimation is then proposed, with the aim of maximizing the similarities between utterances within all the clusters. This method employs a genetic algorithm to determine the cluster to which each utterance should be assigned, which overcomes a limitation of conventional hierarchical clustering, namely that the final result can only reach a local optimum. In addition, the proposed clustering method adapts a Bayesian information criterion to determine how many clusters should be created.
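As a rough illustration of the idea of scoring a cluster assignment by the similarity of the utterances it groups together, the following minimal Python sketch compares two candidate assignments. It is not the estimator defined in the paper: the toy projection vectors, the use of cosine similarity, and the function names `cosine_similarity_matrix` and `mean_within_cluster_similarity` are all assumptions made for illustration only.

```python
import numpy as np


def cosine_similarity_matrix(vectors):
    """Pairwise cosine similarity between row vectors (assumed similarity measure)."""
    v = np.asarray(vectors, dtype=float)
    unit = v / np.linalg.norm(v, axis=1, keepdims=True)
    return unit @ unit.T


def mean_within_cluster_similarity(sim, labels):
    """Average similarity over all pairs of distinct utterances sharing a cluster.

    Hypothetical stand-in for a purity-like objective; higher is better.
    """
    labels = np.asarray(labels)
    same_cluster = labels[:, None] == labels[None, :]
    pairs = np.triu(same_cluster, k=1)  # count each unordered pair once
    if not pairs.any():
        return 0.0
    return float(sim[pairs].mean())


# Toy "projection vectors" for four utterances (invented data):
# the first two resemble one voice, the last two another.
projections = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.9]]
sim = cosine_similarity_matrix(projections)

good_score = mean_within_cluster_similarity(sim, [0, 0, 1, 1])
bad_score = mean_within_cluster_similarity(sim, [0, 1, 0, 1])
print(good_score > bad_score)
```

A search procedure such as the genetic algorithm mentioned in the abstract would explore many candidate label vectors and keep those with higher scores. How the paper defines the score and selects the number of clusters is not reproduced here.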

Keywords :

Bayes methods; genetic algorithms; speaker recognition; Bayesian information criterion; automatic speaker clustering; genetic algorithm; maximum purity estimation; nonspeaker factors; spectrum-based features; speech utterances; voice characteristic reference space; Acoustic measurements; Clustering methods; Indexing; Interference; Loudspeakers; Robustness; Speech processing; Streaming media; speaker clustering

fLanguage :

English

Journal_Title :

Audio, Speech, and Language Processing, IEEE Transactions on

Publisher :

IEEE

ISSN :

1558-7916

Type :

jour

DOI :

10.1109/TASL.2007.894525

Filename :

4156220

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=780194