Title :
New attempts in sound diarization
Author :
Costin, Ciprian ; Costin, Mihaela
Author_Institution :
Dept. of Comput. Sci., Al. I. Cuza Univ., Iasi, Romania
Date :
July 29 2009-Aug. 1 2009
Abstract :
The paper discusses a new hybrid method for sound diarization, the process of segmenting an audio file into chunks that each represent a unique source and clustering the resulting segments into groups that represent the same source. The most recent results focus mainly on identifying voices in telephone recordings. In the hybrid method proposed here, clustering is applied first, using an agglomerative approach to construct the speaker models. Subsequently, once sufficient data has been gathered, special models are built using speaker factors. This approach outperforms the classical one because the low-level Bayesian Information Criterion (BIC) clustering scheme performs poorly on complex models, whereas speaker factors achieve very good precision. Speaker diarization improves speaker verification for multi-speaker audio (summed-channel telephone data, single-microphone interview data), is very important for speech recognition, and improves the readability of an automatic transcription by structuring the audio stream into speaker turns and, in some cases, by providing the identity of the speakers. Sound diarization offers information that can be of interest for multimedia document indexing, human-computer interaction, robotics, security systems, etc.
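The first, agglomerative stage mentioned in the abstract can be illustrated with a minimal sketch. It assumes each cluster is modelled by a single full-covariance Gaussian and uses the common delta-BIC merge test; the abstract does not confirm that this is the authors' exact formulation, and the speaker-factor stage is not covered. The names `delta_bic` and `agglomerate` and the penalty weight `lam` are hypothetical, chosen for illustration only.

```python
import numpy as np


def delta_bic(x, y, lam=1.0):
    """Delta-BIC for merging two sets of feature frames (rows = frames).

    Assumption: each cluster is one full-covariance Gaussian.
    A negative value means a single joint model is preferred (merge).
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    d = x.shape[1]

    def logdet(frames):
        # Maximum-likelihood covariance, lightly regularised for stability.
        cov = np.cov(frames, rowvar=False, bias=True) + 1e-6 * np.eye(d)
        return np.linalg.slogdet(cov)[1]

    # Penalty for the extra mean and covariance parameters of a second model.
    penalty = 0.5 * (d + 0.5 * d * (d + 1)) * np.log(n)
    gain = 0.5 * (n * logdet(np.vstack([x, y])) - n1 * logdet(x) - n2 * logdet(y))
    return gain - lam * penalty


def agglomerate(segments, lam=1.0):
    """Greedy bottom-up clustering: merge the closest pair while delta-BIC < 0.

    Returns a list of groups, each a list of original segment indices.
    """
    clusters = [np.asarray(s, dtype=float) for s in segments]
    groups = [[i] for i in range(len(clusters))]
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                score = delta_bic(clusters[i], clusters[j], lam)
                if best is None or score < best[0]:
                    best = (score, i, j)
        score, i, j = best
        if score >= 0:
            break  # no remaining pair is better explained by one model
        clusters[i] = np.vstack([clusters[i], clusters[j]])
        groups[i] = groups[i] + groups[j]
        del clusters[j]
        del groups[j]
    return groups
```

In this sketch, `segments` would be a list of arrays of acoustic feature frames (for example, one array per initial audio segment). In a hybrid scheme of the kind the abstract describes, the groups returned here would only be a starting point for the richer speaker-factor models built once enough data per cluster is available.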
Keywords :
pattern clustering; speaker recognition; agglomerative approach; multimedia document indexing; sound diarization; speaker diarization; speaker verification; speech recognition; telephonic recording; voice identification; Audio recording; Bayesian methods; Human robot interaction; Indexing; Loudspeakers; Microphones; Multimedia systems; Speech recognition; Streaming media; Telephony;
Conference_Title :
Soft Computing Applications, 2009. SOFA '09. 3rd International Workshop on
Conference_Location :
Arad
Print_ISBN :
978-1-4244-5054-1
Electronic_ISBN :
978-1-4244-5056-5
DOI :
10.1109/SOFA.2009.5254874