DocumentCode :
1377064
Title :
A voice trigger system using keyword and speaker recognition for mobile devices
Author :
Lee, Hyeopwoo ; Chang, Sukmoon ; Yook, Dongsuk ; Kim, Yongserk
Author_Institution :
Dept. of Comput. & Commun. Eng., Korea Univ., Seoul, South Korea
Volume :
55
Issue :
4
fYear :
2009
fDate :
11/1/2009
Firstpage :
2377
Lastpage :
2384
Abstract :
Voice activity detection plays an important role in an efficient voice interface between humans and mobile devices, since it can be used as a trigger to activate the automatic speech recognition module of a mobile device. If the input speech signal can be recognized as a predefined magic word coming from a legitimate user, it can be utilized as a trigger. In this paper, we propose a voice trigger system using a keyword-dependent speaker recognition technique. To trigger a mobile device properly while consuming little computational power, the voice trigger must perform keyword recognition as well as speaker recognition without using computationally demanding speech recognizers. To solve this problem, we propose a template-based method and a hidden Markov model (HMM) based method for the voice trigger. Experiments using a Korean word corpus show that the template-based method ran 4.1 times faster than the HMM-based method, whereas the HMM-based method achieved a 27.8% relative reduction in recognition error compared with the template-based method. The proposed methods are complementary and can be used selectively depending on the device of interest.
Keywords :
hidden Markov models; mobile handsets; natural language processing; quantisation (signal); speaker recognition; Korean word corpus; automatic speech recognition; dynamic time warping; hidden Markov model; keyword recognition; mobile devices; speaker recognition; vector quantization; voice activity detection; voice trigger system; Automatic speech recognition; Hidden Markov models; Humans; Mobile communication; Power supplies; Signal detection; Signal processing; Speaker recognition; Speech processing; Speech recognition; Voice trigger; keyword recognition; speaker recognition; dynamic time warping; vector quantization; Gaussian mixture model; hidden Markov model
fLanguage :
English
Journal_Title :
Consumer Electronics, IEEE Transactions on
Publisher :
IEEE
ISSN :
0098-3063
Type :
jour
DOI :
10.1109/TCE.2009.5373813
Filename :
5373813