Regional Information Center for Science and Technology (RICEST) - A voice trigger system using keyword and speaker recognition for mobile devices

DocumentCode :

1377064

Title :

A voice trigger system using keyword and speaker recognition for mobile devices

Author :

Lee, Hyeopwoo ; Chang, Sukmoon ; Yook, Dongsuk ; Kim, Yongserk

Author_Institution :

Dept. of Comput. & Commun. Eng., Korea Univ., Seoul, South Korea

Volume :

Issue :

fYear :

2009

fDate :

11/1/2009

Firstpage :

2377

Lastpage :

2384

Abstract :

Voice activity detection plays an important role in an efficient voice interface between humans and mobile devices, since it can be used as a trigger to activate the automatic speech recognition module of a mobile device. If the input speech signal can be recognized as a predefined magic word coming from a legitimate user, it can be utilized as a trigger. In this paper, we propose a voice trigger system using a keyword-dependent speaker recognition technique. To trigger a mobile device with low computational power consumption, the voice trigger must perform both keyword recognition and speaker recognition without using computationally demanding speech recognizers. To solve this problem, we propose a template-based method and a hidden Markov model (HMM) based method for the voice trigger. Experiments using a Korean word corpus show that the template-based method ran 4.1 times faster than the HMM-based method. However, the HMM-based method achieved a 27.8% relative reduction in recognition error compared with the template-based method. The proposed methods are complementary and can be used selectively depending on the device of interest.
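As an illustration only: the keywords of this record list dynamic time warping, so a template-based trigger can be pictured as comparing the feature sequence of an incoming utterance against a stored template of the magic word spoken by the legitimate user. The Python sketch below is not the authors' implementation. The function names `dtw_distance` and `is_trigger`, the Euclidean frame distance, the length normalisation, and the threshold value are all assumptions made for this example.

```python
import math


def dtw_distance(a, b):
    """Dynamic time warping cost between two sequences of feature vectors.

    Each sequence is a list of equal-length lists of floats (one per frame).
    The Euclidean frame distance is an assumption of this sketch.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best accumulated cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # skip a frame of b
                                 cost[i][j - 1],      # skip a frame of a
                                 cost[i - 1][j - 1])  # match both frames
    return cost[n][m]


def is_trigger(utterance, template, threshold):
    """Hypothetical decision rule: accept when the length-normalised
    DTW cost to the enrolled template is at or below the threshold."""
    score = dtw_distance(utterance, template) / (len(utterance) + len(template))
    return score <= threshold


# Toy one-dimensional "features"; real systems would use spectral features.
template = [[0.0], [1.0], [2.0]]
same_word_slower = [[0.0], [0.0], [1.0], [1.0], [2.0], [2.0]]
different_input = [[5.0], [5.0], [5.0]]

print(is_trigger(same_word_slower, template, threshold=0.5))
print(is_trigger(different_input, template, threshold=0.5))
```

Because DTW stretches the time axis, the slower rendition of the same pattern aligns with the template at zero cost, while the unrelated input does not. An HMM-based variant would replace the stored template with a trained statistical model, which is consistent with the accuracy and speed trade-off the abstract reports.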

Keywords :

hidden Markov models; mobile handsets; natural language processing; quantisation (signal); speaker recognition; Korean word corpus; automatic speech recognition; dynamic time warping; hidden Markov model; keyword recognition; mobile devices; vector quantization; voice activity detection; voice trigger system; humans; mobile communication; power supplies; signal detection; signal processing; speech processing; speech recognition; voice trigger; Gaussian mixture model

fLanguage :

English

Journal_Title :

Consumer Electronics, IEEE Transactions on

Publisher :

IEEE

ISSN :

0098-3063

Type :

jour

DOI :

10.1109/TCE.2009.5373813

Filename :

5373813

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=1377064