DocumentCode :
3165414
Title :
Determining the number of speakers in a meeting using microphone array features
Author :
Zwyssig, Erich ; Renals, Steve ; Lincoln, Mike
Author_Institution :
Centre for Speech Technology Research, University of Edinburgh, Edinburgh, UK
fYear :
2012
fDate :
25-30 March 2012
Firstpage :
4765
Lastpage :
4768
Abstract :
The accuracy of speaker diarisation in meetings relies heavily on determining the correct number of speakers. In this paper we present a novel algorithm based on time difference of arrival (TDOA) features that aims to find the correct number of active speakers in a meeting and thus aid the speaker segmentation and clustering process. With our proposed method, the microphone array TDOA values and the known geometry of the array are used to calculate a speaker matrix, from which we determine the correct number of active speakers with the aid of the Bayesian information criterion (BIC). In addition, we analyse several well-known voice activity detection (VAD) algorithms and verify their suitability for meeting recordings. Experiments were performed using the NIST RT06, RT07 and RT09 data sets, and resulted in reduced error rates compared with BIC-based approaches.
Keywords :
Bayes methods; geometry; matrix algebra; microphone arrays; pattern clustering; speaker recognition; time-of-arrival estimation; BIC-based approaches; Bayesian information criterion; NIST RT06 data sets; RT07 data sets; RT09 data sets; VAD algorithms; active speakers; clustering process; meeting recordings; microphone array TDOA values; reduced error rates; speaker diarisation; speaker matrix; speaker segmentation; time difference of arrival features; voice activity detection algorithms; Arrays; Bayesian methods; Density estimation robust algorithm; Error analysis; Microphones; NIST; Speech; BIC; Speaker diarisation in meetings; microphone array; speech segmentation and clustering; time difference of arrival (TDOA); voice activity detection (VAD)
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference on
Conference_Location :
Kyoto
ISSN :
1520-6149
Print_ISBN :
978-1-4673-0045-2
Electronic_ISBN :
1520-6149
Type :
conf
DOI :
10.1109/ICASSP.2012.6288984
Filename :
6288984