DocumentCode
3244237
Title
Accurate hidden Markov models for non-audible murmur (NAM) recognition based on iterative supervised adaptation
Author
Heracleous, Panikos ; Nakajima, Yoshiki ; Lee, Akinobu ; Saruwatari, Hiroshi ; Shikano, Kiyohiro
Author_Institution
Graduate Sch. of Inf. Sci., Nara Inst. of Sci. & Technol., Japan
fYear
2003
fDate
30 Nov.-3 Dec. 2003
Firstpage
73
Lastpage
76
Abstract
In previous work, we introduced a special device, the Non-Audible Murmur (NAM) microphone, which can detect very quietly uttered speech (murmur) that cannot be heard by listeners near the talker. Experimental results showed the effectiveness of the device in NAM recognition. Using normal-speech monophone hidden Markov models (HMMs) retrained with NAM data from a specific speaker, we could recognize NAM with high accuracy. Although the results were very promising, a serious problem is the HMM retraining, which requires a large amount of training data. In this paper, we introduce a new method for NAM recognition that requires only a small amount of NAM data for training. The proposed method is based on supervised adaptation. The main difference from other adaptation approaches is that, instead of single-iteration adaptation, we use iterative adaptation (iterative supervised MLLR). Experiments demonstrate the effectiveness of the proposed method. Using normal-speech clean initial models and only 350 adaptation NAM utterances, we achieved a recognition accuracy of 88.62%, which is a very promising result. Therefore, with a small amount of adaptation data, we were able to create accurate speaker-specific HMMs. We also present experimental results that show the effects of the number of iterations, the amount of adaptation data, and the number of regression tree classes.
Keywords
hidden Markov models; iterative methods; regression analysis; speech recognition; HMM retraining; NAM recognition; hidden Markov models; iterative supervised MLLR; iterative supervised adaptation; nonaudible murmur recognition; normal-speech clean initial models; recognition accuracy; regression tree classes; Head; Hidden Markov models; Iterative methods; Maximum likelihood linear regression; Microphones; Privacy; Regression tree analysis; Speech recognition; Training data; Working environment noise;
fLanguage
English
Publisher
IEEE
Conference_Titel
Automatic Speech Recognition and Understanding, 2003. ASRU '03. 2003 IEEE Workshop on
Print_ISBN
0-7803-7980-2
Type
conf
DOI
10.1109/ASRU.2003.1318406
Filename
1318406