DocumentCode :
2173446
Title :
Speaker diarization of meetings based on speaker role n-gram models
Author :
Valente, Fabio ; Vijayasenan, Deepu ; Motlicek, Petr
Author_Institution :
Idiap Res. Inst., Martigny, Switzerland
fYear :
2011
fDate :
22-27 May 2011
Firstpage :
4416
Lastpage :
4419
Abstract :
Speaker diarization of meeting recordings is generally based on acoustic information alone, ignoring the fact that meetings are instances of conversations. Several recent works have shown that the sequence of speakers in a conversation and their roles are related and statistically predictable. This paper proposes a speaker role n-gram model to capture the probability of conversation patterns and investigates its use as prior information in a state-of-the-art diarization system. Experiments are run on the AMI corpus, which is annotated in terms of roles. The proposed technique reduces the diarization speaker error by 19% when the roles are known and by 17% when they are estimated. Furthermore, the paper investigates how the n-gram models generalize to different settings, such as those of the Rich Transcription campaigns. Experiments on 17 meetings reveal that the speaker error can be reduced by 12% in this case as well, indicating that the n-gram models can generalize across corpora.
Keywords :
speaker recognition; rich transcription campaigns; speaker diarization; speaker error; speaker role n-gram models; state-of-the-art diarization system; Data models; Decoding; Hidden Markov models; Mel frequency cepstral coefficient; Speech; Viterbi algorithm; Speaker Roles; Speaker diarization; Viterbi decoding; meeting recordings; multi-party conversations;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on
Conference_Location :
Prague
ISSN :
1520-6149
Print_ISBN :
978-1-4577-0538-0
Electronic_ISBN :
1520-6149
Type :
conf
DOI :
10.1109/ICASSP.2011.5947333
Filename :
5947333