Title :
A comparative study of five spectral representations for speaker-independent phonetic recognition
Author :
Creekmore, Joseph W. ; Fanty, Mark ; Cole, Ronald A.
Author_Institution :
MITRE Corp., McLean, VA, USA
Abstract :
The authors describe a comparative study of five spectral representations for speaker-independent phonetic recognition using the TIMIT database. A feedforward network was trained to classify 20-ms frames of speech as one of 39 phonetic classes derived from the TIMIT database. The five representations investigated are the discrete Fourier transform, three representations based on conventional linear predictive coding (LPC), and the cepstral coefficients derived from perceptual linear predictive (PLP) analysis. The PLP cepstral coefficients outperformed the other representations on the task of assigning the correct phonetic label to individual time frames. It is shown that phonetic context can be exploited by providing spectral information before and after the frame to be classified. The effect of the training set size and distribution is also examined.
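The use of phonetic context mentioned in the abstract can be illustrated with a minimal sketch. It assumes each frame is already a list of spectral coefficients and builds one classifier input per frame by concatenating neighbouring frames. The function name `stack_context`, the `left` and `right` window sizes, and the choice to repeat edge frames at utterance boundaries are all hypothetical; the abstract does not specify the window width or the boundary handling the authors used.

```python
def stack_context(frames, left=1, right=1):
    """Concatenate each frame with its neighbours to form a context window.

    frames: list of per-frame coefficient lists (e.g. PLP cepstra), in time order.
    left/right: number of frames of context before/after (assumed values).
    Edge frames are repeated at utterance boundaries (an assumption).
    """
    n = len(frames)
    stacked = []
    for t in range(n):
        row = []
        for offset in range(-left, right + 1):
            # Clamp the index so boundary frames reuse the first/last frame.
            idx = min(max(t + offset, 0), n - 1)
            row.extend(frames[idx])
        stacked.append(row)
    return stacked


# Toy example: 4 frames with 3 coefficients each.
frames = [[0, 1, 2], [3, 4, 5], [6, 7, 8], [9, 10, 11]]
inputs = stack_context(frames, left=1, right=1)
print(len(inputs), len(inputs[0]))  # 4 9
```

Each row of `inputs` would then be one input vector for a frame-level classifier such as a feedforward network; widening `left` and `right` trades a larger input layer for more context.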
Keywords :
fast Fourier transforms; neural nets; spectral analysis; speech analysis and processing; speech recognition; 20 ms; TIMIT database; cepstral coefficients; comparative study; discrete Fourier transform; feedforward network; linear predictive coding; neural nets; perceptual linear predictive analysis; phonetic classes; phonetic context; speaker-independent phonetic recognition; spectral representations; speech frames; training set size; Cepstral analysis; Computational modeling; Discrete Fourier transforms; Fourier transforms; Linear predictive coding; Signal processing; Signal representations; Speech analysis; Speech processing; Speech recognition;
Conference_Title :
Signals, Systems and Computers, 1991. 1991 Conference Record of the Twenty-Fifth Asilomar Conference on
Conference_Location :
Pacific Grove, CA
Print_ISBN :
0-8186-2470-1
DOI :
10.1109/ACSSC.1991.186467