• DocumentCode
    775102
  • Title
    A stochastic segment model for phoneme-based continuous speech recognition
  • Author
    Ostendorf, Mari ; Roukos, Salim
  • Author_Institution
    Dept. of Electr., Comput. & Syst. Eng., Boston Univ., MA, USA
  • Volume
    37
  • Issue
    12
  • fYear
    1989
  • fDate
    12/1/1989 12:00:00 AM
  • Firstpage
    1857
  • Lastpage
    1869
  • Abstract
    The authors introduce a novel approach to modeling variable-duration phonemes, called the stochastic segment model. A phoneme X is observed as a variable-length sequence of frames, where each frame is represented by a parameter vector and the length of the sequence is random. The stochastic segment model consists of (1) a time warping of the variable-length segment X into a fixed-length segment Y, called a resampled segment, and (2) a joint density function of the parameters of X, which in this study is a Gaussian density. The segment model represents spectral/temporal structure over the entire phoneme. The model also allows the incorporation in Y of acoustic-phonetic features derived from X, in addition to the usual spectral features that have been used in hidden Markov modeling and dynamic time warping approaches to speech recognition. The authors describe the stochastic segment model, the recognition algorithm, and an iterative training algorithm for estimating segment models from continuous speech. They present several results using segment models in two speaker-dependent recognition tasks and compare the performance of the stochastic segment model to that of hidden Markov models.
  • Keywords
    speech recognition; stochastic processes; Gaussian density; dynamic time warping; fixed-length segment; hidden Markov modeling; iterative training algorithm; joint density function; parameter vector; phoneme-based continuous speech recognition; stochastic segment model; variable-duration phonemes; variable-length segment; Acoustics; Density functional theory; Dictionaries; Hidden Markov models; Iterative algorithms; Loudspeakers; Robustness; Speech recognition; Stochastic processes; Vocabulary
  • fLanguage
    English
  • Journal_Title
    Acoustics, Speech and Signal Processing, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    0096-3518
  • Type
    jour
  • DOI
    10.1109/29.45533
  • Filename
    45533
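
The two components named in the abstract (a time warping of a variable-length segment X into a fixed-length segment Y, followed by a Gaussian density) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the record: the function names `resample_segment` and `gaussian_log_density` are hypothetical, the warping is a simple linear-in-time frame selection, and the Gaussian is evaluated on the flattened resampled segment.

```python
import numpy as np


def resample_segment(x: np.ndarray, m: int) -> np.ndarray:
    """Warp a (k, d) segment of k frames into a fixed-length (m, d) segment.

    Assumption: linear-in-time sampling, taking the frame that lies at the
    centre of each of m equal-width intervals over the original segment.
    """
    k = x.shape[0]
    idx = np.floor((np.arange(m) + 0.5) * k / m).astype(int)
    return x[idx]


def gaussian_log_density(y: np.ndarray, mean: np.ndarray, cov: np.ndarray) -> float:
    """Log of a multivariate Gaussian density evaluated at the flattened segment y."""
    v = y.reshape(-1) - mean
    n = v.size
    _, logdet = np.linalg.slogdet(cov)
    # Solve cov @ z = v instead of forming the inverse explicitly.
    quad = v @ np.linalg.solve(cov, v)
    return float(-0.5 * (n * np.log(2.0 * np.pi) + logdet + quad))


# Hypothetical usage: a 7-frame segment with 2 features per frame, warped to 3 frames.
x = np.arange(14.0).reshape(7, 2)
y = resample_segment(x, 3)
score = gaussian_log_density(y, mean=np.zeros(6), cov=np.eye(6) * 100.0)
```

In a recogniser built along these lines, each phoneme would have its own mean and covariance, and a hypothesised segment would be scored against every phoneme model; that step, the search over segmentations, and the iterative training mentioned in the abstract are not shown.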