DocumentCode :
2409616
Title :
Multimedia Content Segmentation Based on Speaker Recognition
Author :
Babu, Jasine ; Pathari, Vinod
Author_Institution :
Motorola India Pvt. Ltd., Bangalore
fYear :
2007
fDate :
22-24 Feb. 2007
Firstpage :
16
Lastpage :
19
Abstract :
Many recent works attempt to index multimedia data based on characteristics such as speaker identity and emotional content. In this work, speaker segmentation is performed on movies to extract the shots in which the target actor is speaking. This is a case of speaker identification on conversational speech under noisy conditions. The work is organized into two phases: an audio classification phase, which removes non-speech content, followed by a speaker recognition phase. Along with the speaker models, Gaussian mixture models are constructed for sound effects such as fight sequences and drum beats to refine the removal of non-speech sounds. Results demonstrate the effectiveness of this deviation from conventional methods.
Keywords :
Gaussian processes; audio signal processing; multimedia communication; speaker recognition; Gaussian mixture model; multimedia content segmentation; Acoustic noise; Data mining; Face detection; Indexing; Information retrieval; Loudspeakers; Motion pictures; Multimedia databases; Speech processing
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Signal Processing, Communications and Networking, 2007. ICSCN '07. International Conference on
Conference_Location :
Chennai
Print_ISBN :
1-4244-0997-7
Electronic_ISBN :
1-4244-0997-7
Type :
conf
DOI :
10.1109/ICSCN.2007.350672
Filename :
4156575