• DocumentCode
    311014
  • Title

    Inter-digit HMM: connected digit recognition using the Macrophone corpus

  • Author

    Kao, Yu-Hung ; Netsch, Lorin

  • Author_Institution
    Texas Instrum. Inc., Dallas, TX, USA
  • Volume
    3
  • fYear
    1997
  • fDate
    21-24 Apr 1997
  • Firstpage
    1739
  • Abstract
    Continuous digit recognition over the telephone channel is a key technology for many telecommunications applications such as voice dialing, automatic banking, and credit card number entry. Speech recognizers usually achieve high performance by modeling the acoustics in hidden Markov models (HMMs) using large numbers of multivariate Gaussian mixtures with assumed diagonal covariance in order to model the variability of different speakers and channel conditions. We present a system that uses a single-mixture, 16-feature Gaussian distribution with an assumed identity covariance to achieve a 1.0% word error rate and a 5.7% sentence error rate on the Macrophone corpus. We found that inter-digit modeling, discriminant training, and per-utterance adaptation can each contribute about a 30% reduction in the error rate. Using this approach, we can realize a system with relatively low memory requirements.
  • Keywords
    Gaussian distribution; acoustic signal processing; hidden Markov models; speech processing; speech recognition; telecommunication channels; telephony; Gaussian distributions; Macrophone corpus; acoustics modeling; automatic banking; channel conditions; connected digit recognition; credit card number entry; diagonal covariance; discriminant training; error rate reduction; hidden Markov models; identity covariance; interdigit HMM; interdigit modeling; multivariate Gaussian mixtures; per-utterance adaptation; sentence error rate; speech recognizers; telecommunications applications; telephone channel; voice dialing; word error rate; Acoustics; Automatic speech recognition; Banking; Credit cards; Error analysis; Gaussian distribution; Hidden Markov models; Loudspeakers; Speech recognition; Telephony;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing, 1997. ICASSP-97., 1997 IEEE International Conference on
  • Conference_Location
    Munich
  • ISSN
    1520-6149
  • Print_ISBN
    0-8186-7919-0
  • Type

    conf

  • DOI
    10.1109/ICASSP.1997.598860
  • Filename
    598860