• DocumentCode
    3661595
  • Title
    VTLN Based Approaches for Speech Recognition with Very Limited Training Speakers
  • Author
    Sung Min Ban;Bo Kyung Choi;Young Ho Choi;Hyung Soon Kim
  • Author_Institution
    Dept. of Electron. Eng., Pusan Nat. Univ., Busan, South Korea
  • fYear
    2014
  • Firstpage
    285
  • Lastpage
    288
  • Abstract
    In this paper, two vocal tract length normalization (VTLN) approaches are examined for handling the acoustic mismatch between speakers in automatic speech recognition, in the special case where training data are available from only a small number of speakers. The first is the conventional VTLN approach, in which both training and test utterances are frequency warped using a maximum likelihood (ML) based warping factor estimation scheme in order to normalize speaker characteristics. The second builds a virtually speaker-independent (SI) acoustic model from artificially generated multi-speaker data, obtained by applying VTLN-based frequency warping to the training utterances of the limited speakers. To compare the two approaches, Korean isolated word recognition experiments are performed with a small amount of training data from a limited number of speakers. The experimental results show that, with very limited training speakers, the virtually SI acoustic model approach outperforms both the conventional VTLN approach and the baseline system.
  • Keywords
    "Acoustics","Silicon","Speech recognition","Mathematical model","Hidden Markov models","Speech","Training"
  • Publisher
    IEEE
  • Conference_Titel
    Intelligent Systems, Modelling and Simulation (ISMS), 2014 5th International Conference on
  • ISSN
    2166-0662
  • Type
    conf
  • DOI
    10.1109/ISMS.2014.55
  • Filename
    7280922
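
The two approaches named in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption, since the abstract does not give these details: the piecewise-linear warping function, the 8 kHz Nyquist frequency, the 0.85 breakpoint ratio, the 0.88 to 1.12 warping factor grid, and the hypothetical helper names `warp_frequency`, `warp_factor_grid`, `select_warp_factor`, and `virtual_speaker_set`. It is not the authors' implementation.

```python
def warp_frequency(f, alpha, f_nyquist=8000.0, f_break_ratio=0.85):
    """Piecewise-linear VTLN warp (an assumed, commonly used form).

    Frequencies below a breakpoint are scaled by alpha; above it, a
    second linear segment maps f_nyquist onto itself so the warped
    axis covers the same band as the original.
    """
    if not 0.0 <= f <= f_nyquist:
        raise ValueError("frequency outside [0, f_nyquist]")
    # Shrink the breakpoint for alpha > 1 so alpha * f0 stays below Nyquist.
    f0 = f_break_ratio * f_nyquist * min(1.0, 1.0 / alpha)
    if f <= f0:
        return alpha * f
    # Segment joining (f0, alpha * f0) to (f_nyquist, f_nyquist).
    slope = (f_nyquist - alpha * f0) / (f_nyquist - f0)
    return alpha * f0 + slope * (f - f0)


def warp_factor_grid(lo=0.88, hi=1.12, step=0.02):
    """Candidate warping factors (assumed range and step)."""
    n = int(round((hi - lo) / step))
    return [round(lo + i * step, 2) for i in range(n + 1)]


def select_warp_factor(score, grid):
    """Conventional VTLN: pick the factor with the highest score.

    `score` is a hypothetical callable returning the acoustic-model
    log-likelihood of a speaker's data warped with a given factor.
    """
    return max(grid, key=score)


def virtual_speaker_set(utterance_ids, grid):
    """Virtually SI training set: pair every utterance with every factor,
    so each warped copy acts as data from an artificial speaker."""
    return [(utt, alpha) for utt in utterance_ids for alpha in grid]
```

In this sketch, the conventional approach calls `select_warp_factor` once per speaker for both training and test data, while the virtually SI approach expands the training set with `virtual_speaker_set` and trains a single model on all warped copies.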