DocumentCode :
1535700
Title :
Statistical Utterance Comparison for Speaker Clustering Using Factor Analysis
Author :
Jeon, Woojay ; Ma, Changxue ; Macho, Dusan
Author_Institution :
Samsung Electron., Suwon, South Korea
Volume :
20
Issue :
9
fYear :
2012
Firstpage :
2482
Lastpage :
2491
Abstract :
We propose a novel method of measuring the similarity between two or more speech utterances for speaker clustering, based on probability theory and factor analysis. The similarity function is formulated as the probability that the utterances originated from the same speaker, and uses statistical eigenvoice and eigenchannel models to incorporate physical knowledge of interspeaker and intraspeaker variabilities, allowing the similarity function to be trainable and robust. The comparison function can be efficiently computed using a compact set of sufficient statistics for each speech utterance, allowing the acoustic features to be discarded. We begin using only eigenvoices, and then show how the eigenchannels can be incorporated into the equation to result in an identical form but with a different set of sufficient statistics. We test the proposed model in a speaker clustering task using the CALLHOME telephone conversation corpus and show that it performs better than two other well-known similarity measures: the Cross-Likelihood Ratio (CLR) and Generalized Likelihood Ratio (GLR).
Keywords :
eigenvalues and eigenfunctions; pattern clustering; probability; speech processing; CALLHOME telephone conversation corpus; CLR; GLR; cross-likelihood ratio; eigenchannel models; factor analysis; generalized likelihood ratio; interspeaker variability; intraspeaker variability; probability theory; similarity function; speaker clustering; speech utterances; statistical eigenvoice; statistical utterance; Closed-form solutions; Covariance matrix; Equations; Mathematical model; Speech; TV; Vectors; Factor analysis; speaker clustering; speaker diarization; utterance comparison;
fLanguage :
English
Journal_Title :
Audio, Speech, and Language Processing, IEEE Transactions on
Publisher :
IEEE
ISSN :
1558-7916
Type :
jour
DOI :
10.1109/TASL.2012.2204050
Filename :
6214577