DocumentCode :
3061438
Title :
Audio-Visual Speech Recognition with Weighted KNN-based Classification in Mandarin Database
Author :
Pao, Tsang-Long ; Liao, Wen-Yuan ; Chen, Yu-Te
Author_Institution :
Tatung Univ., Taipei
Volume :
1
fYear :
2007
fDate :
26-28 Nov. 2007
Firstpage :
39
Lastpage :
42
Abstract :
Automatic speech recognition (ASR) by machine has been a goal and an active research area for the past several decades. In recent years, there has been growing research interest in overcoming certain audio-only recognition problems. Motivated by the multimodal nature of speech, the visual feature is considered to bring in information that does not exist in the acoustic signal and to enable improved system performance over audio-only methods. We first introduce the method for extracting the visual features of the lips. In this paper, we compare three different weighting functions in weighted KNN-based classifiers to recognize the ten digits, 0 to 9, from Mandarin audio-visual speech. The classifiers studied include traditional KNN, weighted KNN, and weighted D-KNN. We also create a new audio-visual database in English and Mandarin. We test the proposed system on this database and report some experimental results.
Keywords :
audio-visual systems; natural languages; pattern classification; speech recognition; English language; Mandarin language; audio-visual speech recognition; automatic speech recognition; weighted KNN-based classification; Audio databases; Automatic speech recognition; Data mining; Feature extraction; Nearest neighbor searches; Pattern recognition; Spatial databases; Speech recognition; Training data; Visual databases;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Intelligent Information Hiding and Multimedia Signal Processing, 2007. IIHMSP 2007. Third International Conference on
Conference_Location :
Kaohsiung
Print_ISBN :
978-0-7695-2994-1
Type :
conf
DOI :
10.1109/IIHMSP.2007.4457488
Filename :
4457488