Title :
An audio classification and speech recognition system for video content analysis
Author :
Feng, Huamin ; Jiang, Chao ; Yang, Xinghua
Author_Institution :
Beijing Electron. Sci. & Technol. Inst., Beijing, China
Abstract :
Audio can provide useful information for video content analysis. This paper proposes an audio classification and speech recognition system for video content analysis. First, the audio data is extracted from the video stream. Second, the audio frames are classified into silence, speech, and music using rules and a Support Vector Machine (SVM) algorithm. Finally, an automatic speech recognition (ASR) system is applied for speech-to-text conversion. Experimental results on the CCTV_NEWS data of TRECVID show that our approach is effective.
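The abstract does not say which rules separate silence from the other classes. As a rough illustration only, the sketch below assumes a short-time energy threshold, a common choice for this step. The function names, frame length, hop size, and threshold value are all hypothetical and are not taken from the paper. Frames labelled "non-silence" would then go on to the SVM stage for the speech/music decision, which is not shown here.

```python
import math

def frame_signal(samples, frame_len, hop):
    """Split samples into overlapping frames; a trailing partial frame is dropped."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]

def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)

def label_frames(samples, frame_len=400, hop=160, energy_threshold=1e-4):
    """Label each frame by an energy rule (assumed, not the paper's rule)."""
    labels = []
    for frame in frame_signal(samples, frame_len, hop):
        if short_time_energy(frame) < energy_threshold:
            labels.append("silence")
        else:
            # Non-silent frames would be passed to the SVM (speech vs. music).
            labels.append("non-silence")
    return labels

# Synthetic input at 16 kHz: 0.1 s of zeros followed by 0.1 s of a 440 Hz tone.
sample_rate = 16000
quiet = [0.0] * 1600
tone = [0.5 * math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(1600)]
labels = label_frames(quiet + tone)
print(labels)
```

With 16 kHz audio, the default values correspond to 25 ms frames with a 10 ms hop. In practice the threshold would need tuning to the recording level of the material.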
Keywords :
audio signal processing; feature extraction; signal classification; speech recognition; support vector machines; video streaming; CCTV_NEWS; SVM algorithm; TRECVID; audio classification; automatic speech recognition system; speech-to-text conversion; support vector machine; video content analysis; video stream; Accuracy; Mel frequency cepstral coefficient; Speech; Streaming media; audio classification and segmentation
Conference_Title :
Multimedia Technology (ICMT), 2011 International Conference on
Conference_Location :
Hangzhou
Print_ISBN :
978-1-61284-771-9
DOI :
10.1109/ICMT.2011.6002093