DocumentCode :
3340299
Title :
Evaluation of mel-LPC cepstrum in a large vocabulary continuous speech recognition
Author :
Matsumoto, Hiroshi ; Moroto, Masanori
Author_Institution :
Fac. of Eng., Shinshu Univ., Nagano, Japan
Volume :
1
fYear :
2001
fDate :
2001
Firstpage :
117
Abstract :
This paper presents a simple and efficient time-domain technique for estimating an all-pole model on the mel-frequency scale (mel-LPC), and compares the recognition performance of the mel-LPC cepstrum with that of both the standard LPC mel-cepstrum and the MFCC (mel-frequency cepstral coefficients) using the Japanese dictation system Julius with a 20,000-word vocabulary. First, the optimal value of the frequency warping factor is examined in terms of monosyllable accuracy. With the optimal warping factors, the mel-LPC cepstrum attains word accuracies of 93.0% for male speakers and 93.1% for female speakers, which are 2.1% and 1.7% higher than those of the LPC mel-cepstrum, respectively. Furthermore, this performance is slightly superior to that of the MFCC.
Keywords :
cepstral analysis; linear predictive coding; speech coding; speech recognition; time-domain analysis; Japanese dictation system; Julius; MFCC; all-pole model; female speakers; frequency warping factor; large vocabulary continuous speech recognition; male speakers; mel frequency scale; mel-LPC cepstrum; monosyllable accuracy; recognition performance; time domain technique; word accuracies; Automatic speech recognition; Cepstral analysis; Cepstrum; Frequency conversion; Linear predictive coding; Mel frequency cepstral coefficient; Psychoacoustic models; Spectral analysis; Speech recognition; Vocabulary;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics, Speech, and Signal Processing, 2001. Proceedings. (ICASSP '01). 2001 IEEE International Conference on
Conference_Location :
Salt Lake City, UT
ISSN :
1520-6149
Print_ISBN :
0-7803-7041-4
Type :
conf
DOI :
10.1109/ICASSP.2001.940781
Filename :
940781