DocumentCode :
417282
Title :
Hidden spectral peak trajectory model for phone classification
Author :
Lai, Yiu-Pong ; Siu, Man-Hung
Author_Institution :
Dept. of Electr. & Electron. Eng., Hong Kong Univ. of Sci. & Technol., China
Volume :
1
fYear :
2004
fDate :
17-21 May 2004
Abstract :
It is well known that spectrogram readers can classify different phones from their spectral-time characteristics, such as the formants. In this paper we present a novel acoustic model for phone classification based on the implicit estimation of the spectral peak trajectory as a polynomial function of time. By making use of the known relationship between the spectral peak information and the cepstral coefficients, cepstral-based phone trajectories are built as functions of the hidden spectral trajectories. This captures the intuitive formant trajectories in the spectral domain while allowing speech modeling to be done in the more familiar cepstral domain. We have evaluated this hidden spectral peak trajectory model on both vowel classification and phone classification tasks. With a simple single-Gaussian model, the hidden spectral peak trajectory model outperforms the HMM on both tasks. The new method can also be combined with the HMM. This combination performs better than a more complex HMM with a similar number of parameters.
Keywords :
Gaussian distribution; cepstral analysis; hidden Markov models; image classification; parameter estimation; speech processing; speech recognition; HMM model; acoustic model; cepstral-based phone trajectories; formant trajectories; hidden spectral peak trajectory model; implicit estimation; phone classification; polynomial time function; single Gaussian model; spectral domain; spectrogram readers; speech modeling; speech recognition; vowel classification; Acoustical engineering; Cepstral analysis; Hidden Markov models; Humans; Polynomials; Resonance; Shape; Spectrogram; Speech recognition; Visualization;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Acoustics, Speech, and Signal Processing, 2004. Proceedings. (ICASSP '04). IEEE International Conference on
ISSN :
1520-6149
Print_ISBN :
0-7803-8484-9
Type :
conf
DOI :
10.1109/ICASSP.2004.1326134
Filename :
1326134