• DocumentCode
    3244237
  • Title

    Accurate hidden Markov models for non-audible murmur (NAM) recognition based on iterative supervised adaptation

  • Author

    Heracleous, Panikos ; Nakajima, Yoshiki ; Lee, Akinobu ; Saruwatari, Hiroshi ; Shikano, Kiyohiro

  • Author_Institution
    Graduate Sch. of Inf. Sci., Nara Inst. of Sci. & Technol., Japan
  • fYear
    2003
  • fDate
    30 Nov.-3 Dec. 2003
  • Firstpage
    73
  • Lastpage
    76
  • Abstract
    In previous works, we introduced a special device (Non-Audible Murmur (NAM) microphone) able to detect very quietly uttered speech (murmur), which cannot be heard by listeners near the talker. Experimental results showed the efficiency of the device in NAM recognition. Using normal-speech monophone hidden Markov models (HMM) retrained with NAM data from a specific speaker, we could recognize NAM with high accuracy. Although the results were very promising, a serious problem is the HMM retraining, which requires a large amount of training data. In this paper, we introduce a new method for NAM recognition, which requires only a small amount of NAM data for training. The proposed method is based on supervised adaptation. The main difference from other adaptation approaches lies in the fact that instead of single-iteration adaptation, we use iterative adaptation (iterative supervised MLLR). Experiments prove the efficiency of the proposed method. Using normal-speech clean initial models and only 350 adaptation NAM utterances, we achieved a recognition accuracy of 88.62%, which is a very promising result. Therefore, with a small amount of adaptation data, we were able to create accurate individual HMM. We also introduce results of experiments, which show the effects of the number of iterations, the amount of adaptation data, and the regression tree classes.
  • Keywords
    hidden Markov models; iterative methods; regression analysis; speech recognition; HMM retraining; NAM recognition; iterative supervised MLLR; iterative supervised adaptation; nonaudible murmur recognition; normal-speech clean initial models; recognition accuracy; regression tree classes; Head; Maximum likelihood linear regression; Microphones; Privacy; Regression tree analysis; Training data; Working environment noise
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Automatic Speech Recognition and Understanding, 2003. ASRU '03. 2003 IEEE Workshop on
  • Print_ISBN
    0-7803-7980-2
  • Type

    conf

  • DOI
    10.1109/ASRU.2003.1318406
  • Filename
    1318406