Regional Information Center for Science and Technology - Classification of voiceless speech using facial muscle activity and vision based techniques

DocumentCode :

2532178

Title :

Classification of voiceless speech using facial muscle activity and vision based techniques

Author :

Yau, Wai Chee ; Arjunan, Sridhar Poosapadi ; Kumar, Dinesh Kant

Author_Institution :

Sch. of Electr. & Comput. Eng., RMIT Univ., Melbourne, VIC

fYear :

2008

fDate :

19-21 Nov. 2008

Firstpage :

Lastpage :

Abstract :

This paper presents a silent speech recognition technique based on facial muscle activity and video, without evaluating any voice signals. The research examines the use of the facial surface electromyogram (SEMG) to identify unvoiced vowels and a vision-based technique to classify unvoiced consonants. The moving root mean square (RMS) of the SEMG signals of four facial muscles is used to segment the signals and to identify the start and end of a silently spoken vowel. Visual features are extracted from the mouth video of a speaker silently uttering consonants, using motion segmentation and image moment techniques. The SEMG features and visual features are classified using feedforward multilayer perceptron (MLP) neural networks. The preliminary results demonstrate that the proposed technique yields a high recognition rate for the classification of unvoiced vowels using SEMG features. Similarly, promising results are obtained in the identification of consonants using visual features. The results demonstrate that the system is easy to train for a new user and suggest that, once trained for a user, such a system works reliably for simple, voiceless speech-based commands in a human-computer interface.
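
The abstract describes segmenting SEMG signals with a moving RMS to find the start and end of a silently spoken vowel. The following is a minimal sketch of that general idea, not the authors' implementation: the record does not give the window length, the threshold, or the exact segmentation rule, so the function names `moving_rms` and `find_active_segment`, the trailing window, and the fixed threshold are all illustrative assumptions.

```python
import math


def moving_rms(signal, window):
    """Moving RMS over a trailing window of `window` samples.

    Assumption: a trailing (causal) window; the first few outputs use
    however many samples are available so far.
    """
    out = []
    running = 0.0  # running sum of squares inside the current window
    for i, x in enumerate(signal):
        running += x * x
        if i >= window:
            running -= signal[i - window] ** 2
        n = min(i + 1, window)
        # Clamp at zero to guard against tiny negative rounding errors.
        out.append(math.sqrt(max(running, 0.0) / n))
    return out


def find_active_segment(signal, window, threshold):
    """Return (start, end) indices of the first and last samples whose
    moving RMS exceeds `threshold`, or None if no sample does.

    Assumption: one utterance per recording and a fixed, hand-picked
    threshold; both are hypothetical choices for illustration.
    """
    rms = moving_rms(signal, window)
    active = [i for i, value in enumerate(rms) if value > threshold]
    if not active:
        return None
    return active[0], active[-1]


if __name__ == "__main__":
    # Synthetic single-channel example: silence, a burst of activity, silence.
    burst = [(-1.0) ** i for i in range(100)]
    demo = [0.0] * 200 + burst + [0.0] * 200
    print(find_active_segment(demo, window=20, threshold=0.5))
```

Because the window here is trailing, the detected start and end lag the true activity by up to one window length. With four muscle channels, as in the abstract, one option would be to apply the same detection per channel and combine the results, though the record does not say how the authors combine them.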

Keywords :

electromyography; face recognition; feature extraction; human computer interaction; image classification; image motion analysis; mean square error methods; medical image processing; multilayer perceptrons; muscle; speech recognition; MLP neural networks; facial muscle activity; facial surface electromyogram; feature extraction; feedforward multilayer perceptron; human computer interface; root mean square; signal segmentation; speech recognition technique; vision based techniques; voiceless speech classification; Computer vision; Facial muscles; Feature extraction; Image segmentation; Motion segmentation; Mouth; Multilayer perceptrons; Root mean square; Signal processing; Speech recognition;

fLanguage :

English

Publisher :

IEEE

Conference_Titel :

TENCON 2008 - 2008 IEEE Region 10 Conference

Conference_Location :

Hyderabad

Print_ISBN :

978-1-4244-2408-5

Electronic_ISBN :

978-1-4244-2409-2

Type :

conf

DOI :

10.1109/TENCON.2008.4766822

Filename :

4766822

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2532178