Title :
Keyword-specific normalization based keyword spotting for spontaneous speech
Author :
Weifeng Li ; Qingmin Liao
Author_Institution :
Dept. of Electron. Eng., Tsinghua Univ., Shenzhen, China
Abstract :
This paper presents a novel architecture for keyword spotting in spontaneous speech, in which the keyword model is trained from a small number of acoustic examples provided by a user. The word-spotting architecture scores patch feature vector sequences extracted with sliding windows, and applies keyword-specific normalization and threshold setting. Dynamic time warping (DTW) based template matching and Gaussian mixture models (GMMs) are proposed to model the keyword, and another GMM is proposed to model the non-keywords. Our keyword spotting experiments demonstrate the effectiveness of the proposed methods. More specifically, the proposed GMM log-likelihood ratio based method achieves an absolute improvement of about 17% in recall rate over the baseline system.
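As a rough illustration of the sliding-window DTW template matching mentioned in the abstract, the following minimal sketch scores each window of an utterance against a single keyword template. It is not the authors' implementation: the function names, the `window`, `hop`, and `threshold` parameters, and the length normalization by `n + m` are all assumptions made for this example, and the paper's keyword-specific normalization and GMM-based scoring are not reproduced because the abstract does not specify them.

```python
import math

def dtw_distance(template, patch):
    """DTW distance between two sequences of feature vectors.

    The division by (n + m) is an assumed length normalization,
    not necessarily the one used in the paper.
    """
    n, m = len(template), len(patch)
    inf = float("inf")
    # cost[i][j] = best accumulated cost aligning template[:i] with patch[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(template[i - 1], patch[j - 1])  # Euclidean frame distance
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m] / (n + m)

def spot_keyword(template, utterance, window, hop, threshold):
    """Return start indices of windows whose DTW distance falls below threshold.

    window, hop and threshold are hypothetical parameters for this sketch.
    """
    hits = []
    for start in range(0, len(utterance) - window + 1, hop):
        patch = utterance[start:start + window]
        if dtw_distance(template, patch) < threshold:
            hits.append(start)
    return hits

# Toy one-dimensional "features": the template occurs at index 2 of the utterance.
template = [(0.0,), (1.0,), (2.0,)]
utterance = [(5.0,), (5.0,), (0.0,), (1.0,), (2.0,), (5.0,)]
hits = spot_keyword(template, utterance, window=3, hop=1, threshold=0.1)
```

In a real system the frames would be acoustic feature vectors rather than toy scalars, and the threshold would be set per keyword, as the abstract indicates.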
Keywords :
Bayes methods; Gaussian processes; feature extraction; hidden Markov models; pattern matching; speech processing; speech recognition; Bayesian information criterion; DTW; GMM log-likelihood ratio based method; Gaussian mixture models; dynamic time warping based template matching; keyword model; keyword-specific normalization based keyword spotting; phonetic hidden Markov model; scoring patch feature vector sequence extraction; sliding windows; speech utterance; spontaneous speech; threshold setting; word-spotting architecture; Acoustics; Data models; Hidden Markov models; Speech; Training; Training data; Vectors; Bayesian Information Criterion; Gaussian mixture model; Keyword spotting; dynamic time warping; sliding window;
Conference_Title :
Chinese Spoken Language Processing (ISCSLP), 2012 8th International Symposium on
Conference_Location :
Kowloon
Print_ISBN :
978-1-4673-2506-6
Electronic_ISBN :
978-1-4673-2505-9
DOI :
10.1109/ISCSLP.2012.6423490