DocumentCode
144920
Title
An extension to Fisher Linear Semi-Discriminant analysis for Speaker Diarization
Author
Montazzolli, Sergio ; Adami, Andrea ; Barone, Dante
Author_Institution
Inst. de Inf./PPGC, UFRGS, Porto Alegre, Brazil
fYear
2014
fDate
17-20 Aug. 2014
Firstpage
1
Lastpage
5
Abstract
Fisher Linear Semi-Discriminant Analysis is used in speaker diarization to project acoustic features into a discriminant, lower-dimensional space. Because the analysis estimates its scatter matrices from short segments, the projection could be improved by using longer segments (i.e., more information). Since a change of speaker is more likely to occur during periods of non-speech, we propose using speech segments delimited by the boundaries estimated by a voice activity detection method based on Hidden Markov Models. Using datasets from the NIST Speaker Recognition Evaluations, we show that the estimated segments provide better scatter matrices for the analysis. The results show a relative improvement of 21% in the Speaker Error Time on the Switchboard corpus used in the evaluations.
Keywords
S-matrix theory; estimation theory; hidden Markov models; speaker recognition; acoustic features projection; Fisher linear semidiscriminant analysis; scatter matrices; speaker diarization; speaker error time; speech segments; voice activity detection method; Mel frequency cepstral coefficient; Nickel; Speech; Speech processing; Vectors
fLanguage
English
Publisher
IEEE
Conference_Titel
Telecommunications Symposium (ITS), 2014 International
Conference_Location
Sao Paulo
Type
conf
DOI
10.1109/ITS.2014.6947969
Filename
6947969