DocumentCode :
3135331
Title :
Wake-up-word detection by estimating formants from spatial eigenspace information
Author :
Hu, Jwu-Sheng ; Lee, Ming-Tang ; Xiao, Yun-Xuan
Author_Institution :
Nat. Chiao Tung Univ., Hsinchu, Taiwan
fYear :
2012
fDate :
5-8 Aug. 2012
Firstpage :
2019
Lastpage :
2024
Abstract :
Wake-up-word (WUW) detection aims to detect a single word or phrase while rejecting all other words and sounds. For distant human-robot interaction (HRI), both the location of the target speaker and a unique command are required to activate the robot. In this paper, a multi-channel speech interface is introduced not only to estimate the unknown locations of the sound sources but also to strengthen the speech features used for WUW detection. A ring-shaped microphone array collects the speech signal. The spatial eigenspace information obtained by multiple signal classification (MUSIC) is used to estimate location-dependent formants and the direction of the target speaker. The formants estimated within a fixed time duration are grouped and evaluated using formant likelihood functions. A cascaded detector then makes the final decision. Experimental results demonstrate the usefulness of the proposed approach under several noisy conditions, including cases of simultaneous competing speech.
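The abstract names MUSIC on a ring-shaped array as the source of the direction estimate. As a rough illustration only, the following is a minimal sketch of textbook narrowband MUSIC for a uniform circular array, not the authors' wideband formant-estimation method. The array geometry (8 microphones, 0.1 m radius), the 1 kHz analysis frequency, the simulated source at 60 degrees, and the function names `uca_steering` and `music_spectrum` are all assumptions made for this example.

```python
import numpy as np

def uca_steering(angles_rad, n_mics, radius_m, freq_hz, c=343.0):
    """Far-field steering vectors of a uniform circular array (azimuth only)."""
    mic_angles = 2 * np.pi * np.arange(n_mics) / n_mics
    k = 2 * np.pi * freq_hz / c  # wavenumber
    phase = k * radius_m * np.cos(angles_rad[:, None] - mic_angles[None, :])
    return np.exp(1j * phase)  # shape: (n_angles, n_mics)

def music_spectrum(snapshots, n_sources, steering):
    """MUSIC pseudo-spectrum for snapshots of shape (n_mics, n_snapshots)."""
    n_mics, n_snap = snapshots.shape
    cov = snapshots @ snapshots.conj().T / n_snap  # spatial covariance
    _, eigvecs = np.linalg.eigh(cov)               # eigenvalues ascending
    noise_subspace = eigvecs[:, : n_mics - n_sources]
    proj = steering.conj() @ noise_subspace        # a^H E_n for each angle
    power = np.sum(np.abs(proj) ** 2, axis=1)
    return 1.0 / np.maximum(power, 1e-12)

# Simulated single source (assumed values, for illustration only).
rng = np.random.default_rng(0)
n_mics, radius_m, freq_hz, n_snap = 8, 0.1, 1000.0, 200
true_deg = 60.0
a_true = uca_steering(np.deg2rad(np.array([true_deg])), n_mics, radius_m, freq_hz)[0]
source = rng.standard_normal(n_snap) + 1j * rng.standard_normal(n_snap)
noise = 0.1 * (rng.standard_normal((n_mics, n_snap))
               + 1j * rng.standard_normal((n_mics, n_snap)))
snapshots = a_true[:, None] * source[None, :] + noise

# Scan a 1-degree azimuth grid and pick the pseudo-spectrum peak.
grid_deg = np.arange(360.0)
steering = uca_steering(np.deg2rad(grid_deg), n_mics, radius_m, freq_hz)
spectrum = music_spectrum(snapshots, n_sources=1, steering=steering)
est_deg = float(grid_deg[np.argmax(spectrum)])
print("estimated azimuth (deg):", est_deg)
```

The peak of the pseudo-spectrum marks the candidate direction whose steering vector is most nearly orthogonal to the noise subspace. A speech front end would repeat this per frequency bin, which is where location-dependent spectral information could be read off.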
Keywords :
eigenvalues and eigenfunctions; estimation theory; human computer interaction; microphone arrays; signal classification; speech processing; speech-based user interfaces; HRI; MUSIC; WUW detection; cascaded detector; estimated formants; human-robot interaction; likelihood functions; location dependent formants estimation; multichannel speech interface; multiple signal classification; noisy conditions; phrase detection; ring-shape microphone array; simultaneous competing speeches; single word detection; sound sources; spatial eigenspace information; speech feature; speech signal; target speaker direction; unknown location estimation; wake-up-word detection; Arrays; Detectors; Direction of arrival estimation; Estimation; Multiple signal classification; Signal to noise ratio; Speech;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Mechatronics and Automation (ICMA), 2012 International Conference on
Conference_Location :
Chengdu
Print_ISBN :
978-1-4673-1275-2
Type :
conf
DOI :
10.1109/ICMA.2012.6285132
Filename :
6285132