Regional Information Center for Science and Technology - Audio-Visual Speech Recognition with Weighted KNN-based Classification in Mandarin Database

DocumentCode :

3061438

Title :

Audio-Visual Speech Recognition with Weighted KNN-based Classification in Mandarin Database

Author :

Pao, Tsang-Long ; Liao, Wen-Yuan ; Chen, Yu-Te

Author_Institution :

Tatung Univ., Taipei

Volume :

fYear :

2007

fDate :

26-28 Nov. 2007

Firstpage :

Lastpage :

Abstract :

Automatic speech recognition (ASR) by machine has been a goal and an attractive research area for the past several decades. In recent years, there has been growing research interest in overcoming certain audio-only recognition problems. Motivated by the multimodal nature of speech, the visual feature is considered to bring in information that does not exist in the acoustic signal and to enable improved system performance over audio-only methods. We first introduce the method for extracting the visual feature of the lip. In this paper, we compare three different weighting functions in weighted KNN-based classifiers to recognize the ten digits, 0 to 9, from Mandarin audio-visual speech. The classifiers studied include traditional KNN, weighted KNN, and weighted D-KNN. We also create a new audio-visual database in English and Mandarin. We test the proposed system on this database and report some experimental results.
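
For readers unfamiliar with the technique named in the abstract, the following is a minimal sketch of a distance-weighted KNN vote. The abstract does not state which three weighting functions the authors compare, so the inverse-distance weight used here is an assumption chosen for illustration, and the function name `weighted_knn_predict` and the toy feature vectors are hypothetical. It is not the authors' implementation.

```python
import math
from collections import defaultdict


def weighted_knn_predict(train, query, k=3):
    """Classify `query` by a distance-weighted vote among its k nearest neighbors.

    train: list of (feature_vector, label) pairs.
    The inverse-distance weight below is an illustrative assumption,
    not one of the weighting functions from the paper.
    """
    # Compute the Euclidean distance from the query to every training sample
    # and keep the k closest ones.
    neighbors = sorted(
        ((math.dist(features, query), label) for features, label in train),
        key=lambda pair: pair[0],
    )[:k]

    # Each neighbor votes for its label with weight 1 / distance, so
    # closer samples count more than distant ones.
    votes = defaultdict(float)
    for distance, label in neighbors:
        votes[label] += 1.0 / (distance + 1e-9)  # small constant avoids division by zero

    # Return the label with the largest total weight.
    return max(votes, key=votes.get)


# Toy example: one very close sample of digit 1, two farther samples of digit 0.
toy_train = [((0.1, 0.0), 1), ((2.0, 0.0), 0), ((0.0, 2.0), 0)]
prediction = weighted_knn_predict(toy_train, (0.0, 0.0), k=3)
```

In this toy case an unweighted majority vote over the three neighbors would pick digit 0, while the weighted vote favors the single much closer sample of digit 1, which is the kind of difference a comparison of weighting functions would measure.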

Keywords :

audio-visual systems; natural languages; pattern classification; speech recognition; English language; Mandarin language; audio-visual speech recognition; automatic speech recognition; weighted KNN-based classification; Audio databases; Automatic speech recognition; Data mining; Feature extraction; Nearest neighbor searches; Pattern recognition; Spatial databases; Speech recognition; Training data; Visual databases;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Intelligent Information Hiding and Multimedia Signal Processing, 2007. IIHMSP 2007. Third International Conference on

Conference_Location :

Kaohsiung

Print_ISBN :

978-0-7695-2994-1

Type :

conf

DOI :

10.1109/IIHMSP.2007.4457488

Filename :

4457488

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3061438