• DocumentCode
    294646
  • Title
    Duration modeling in large vocabulary speech recognition
  • Author
    Anastasakos, Anastasios; Schwartz, Richard; Shu, Han
  • Author_Institution
    Northeastern Univ., Boston, MA, USA
  • Volume
    1
  • fYear
    1995
  • fDate
    9-12 May 1995
  • Firstpage
    628
  • Abstract
    This paper presents a study of different methods for phoneme duration modeling in large vocabulary speech recognition. We investigate the use of phoneme duration and the effect of context, speaking rate, and lexical stress on the duration of phoneme segments in a large vocabulary speech recognition system. The duration models are used in a postprocessing phase of BYBLOS, our baseline HMM-based recognition system, to rescore the N-Best hypotheses. We describe experiments with the 5K-word ARPA Wall Street Journal (WSJ) corpus. The results show that integrating duration models that take context and speaking rate into account can improve the word accuracy of the baseline recognition system.
  • Keywords
    hidden Markov models; speech recognition; ARPA Wall Street Journal corpus; BYBLOS; N-Best hypotheses; baseline HMM-based recognition system; context; duration modeling; large vocabulary speech recognition; lexical stress; phoneme; postprocessing phase; speaking rate; word accuracy; Context modeling; Costs; Density functional theory; Employment; Hidden Markov models; Signal analysis; Speech analysis; Speech recognition; Stress; Vocabulary
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    1995 International Conference on Acoustics, Speech, and Signal Processing (ICASSP-95)
  • Conference_Location
    Detroit, MI
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-2431-5
  • Type
    conf
  • DOI
    10.1109/ICASSP.1995.479676
  • Filename
    479676