DocumentCode :
3442423
Title :
Unsupervised speaker indexing using speaker model selection based on Bayesian information criterion
Author :
Nishida, Masanori ; Kawahara, Tatsuya
Author_Institution :
PRESTO, Japan Sci. & Technol. Corp., Kyoto, Japan
Volume :
1
fYear :
2003
fDate :
6-10 April 2003
Abstract :
This paper addresses unsupervised speaker indexing for discussion audio archives. In discussions, the speaker changes frequently, so utterances are very short and their durations vary widely, which causes significant problems when applying conventional methods such as model adaptation and variance-BIC (Bayesian information criterion) methods. We propose a flexible framework that uses the BIC to select an optimal speaker model (GMM or VQ) according to the duration of utterances. When the speech segment is short, the simple and robust VQ-based method is expected to be chosen, while a GMM can be reliably trained for long segments. On a discussion archive with a total duration of 10 hours, the proposed method achieves higher indexing performance than conventional methods.
Keywords :
Bayes methods; Markov processes; signal classification; speaker recognition; speech processing; unsupervised learning; vector quantisation; GMM speaker model; VQ speaker model; automatic speech recognition; discussion audio archives; speaker model selection; speech segment; unsupervised speaker indexing; utterance duration; variance Bayesian information criterion; Adaptation model; Bayesian methods; Broadcasting; Gaussian distribution; Indexing; Informatics; Loudspeakers; Speech; Testing; Voice mail
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03). 2003 IEEE International Conference on
ISSN :
1520-6149
Print_ISBN :
0-7803-7663-3
Type :
conf
DOI :
10.1109/ICASSP.2003.1198744
Filename :
1198744