  • DocumentCode
    417282
  • Title
    Hidden spectral peak trajectory model for phone classification
  • Author
    Lai, Yiu-Pong ; Siu, Man-Hung
  • Author_Institution
    Dept. of Electr. & Electron. Eng., Hong Kong Univ. of Sci. & Technol., China
  • Volume
    1
  • fYear
    2004
  • fDate
    17-21 May 2004
  • Abstract
    It is well known that spectrogram readers can classify phones from their spectral-time characteristics, such as the formants. In this paper, we present a novel acoustic model for phone classification based on the implicit estimation of the spectral peak trajectory as a polynomial function of time. By making use of the known relationship between the spectral peak information and the cepstral coefficients, cepstral-based phone trajectories are built as functions of the hidden spectral trajectories. This captures the intuitive formant trajectories in the spectral domain while allowing speech modeling to be done in the more familiar cepstral domain. We have evaluated this hidden spectral peak trajectory model on both vowel classification and phone classification tasks. With a simple single-Gaussian model, the hidden spectral peak trajectory model outperforms the HMM on both tasks. The new method can also be combined with an HMM, and this combination performs better than a more complex HMM with a similar number of parameters.
  • Keywords
    Gaussian distribution; cepstral analysis; hidden Markov models; image classification; parameter estimation; speech processing; speech recognition; HMM model; acoustic model; cepstral-based phone trajectories; formant trajectories; hidden spectral peak trajectory model; implicit estimation; phone classification; polynomial time function; single Gaussian model; spectral domain; spectrogram readers; speech modeling; vowel classification; Acoustical engineering; Cepstral analysis; Hidden Markov models; Humans; Polynomials; Resonance; Shape; Spectrogram; Speech recognition; Visualization
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing, 2004. Proceedings. (ICASSP '04). IEEE International Conference on
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-8484-9
  • Type
    conf
  • DOI
    10.1109/ICASSP.2004.1326134
  • Filename
    1326134