DocumentCode :
2363019
Title :
From science fiction to science fact: A Smart-House interface using speech technology and a photo-realistic avatar
Author :
Moir, T.J. ; Filho, G.L.
Author_Institution :
Sch. of Eng. & Adv. Technol., Massey Univ., Auckland
fYear :
2008
fDate :
2-4 Dec. 2008
Firstpage :
327
Lastpage :
333
Abstract :
This paper explores the problems of speech recognition in a (sometimes) noisy environment. An adaptive acoustic beamformer based on the Griffiths-Jim method is proposed. It forms a "hot-spot": speech is received inside a geometrically defined boundary and rejected outside it. This will be shown to give a certain amount of noise immunity and to improve the signal-to-noise ratio for the second stage, the speech recognition engine. The recognition engine used has a limited vocabulary, which gives rise to an excellent hit-rate and requires less training than an unlimited vocabulary. A limited vocabulary is sufficient for a good many applications in which devices such as lighting, TV and radio are switched in a Boolean (on/off) form. In addition to speech recognition, good-quality speech synthesis is necessary to feed back information about the house to the end-user. This technology has improved vastly within the last decade, and it will be shown that a head-and-shoulders avatar that is both photo-realistic and has an appealing personality vastly enhances the experience of a speech interface. The paper explores these technologies and investigates the convergence of many of them in the current Massey smart-office.
Keywords :
avatars; home computing; speech recognition; speech synthesis; user interfaces; vocabulary; Boolean form; Griffiths-Jim method; Massey smart-office; adaptive acoustic beamformer; geometric defined boundary; head avatar; hot-spot; noise immunity; photo-realistic avatar; shoulders avatar; signal-to-noise ratio; smart-house interface; speech recognition engine; speech synthesis; vocabulary; Acoustic noise; Avatars; Engines; Signal to noise ratio; Speech enhancement; Speech recognition; Speech synthesis; TV; Vocabulary; Working environment noise;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Mechatronics and Machine Vision in Practice, 2008. M2VIP 2008. 15th International Conference on
Conference_Location :
Auckland
Print_ISBN :
978-1-4244-3779-5
Electronic_ISBN :
978-0-473-13532-4
Type :
conf
DOI :
10.1109/MMVIP.2008.4749555
Filename :
4749555