Effect of different sampling rates and feature vector sizes on speech recognition performance

Author

Sanderson, C. ; Paliwal, Kuldip K.

Author_Institution

Sch. of Microelectron. Eng., Griffith Univ., Brisbane, Qld., Australia

Volume

1

fYear

1997

fDate

4-4 Dec. 1997

Firstpage

161

Abstract

We conduct a systematic study to evaluate the effect of the sampling rate and feature vector size on the performance of a hidden Markov model (HMM) based speech recognizer. We investigate the use of the following two types of features: linear prediction (LP) derived cepstral coefficients (LPCC) and Mel frequency cepstral coefficients (MFCC). We demonstrate that for the LPCC front-end, the optimum sampling rate and feature vector size are 12 kHz and 14, respectively. We also show that for different sampling rates, the accuracy peaks at different sizes of the feature vector. For the MFCC front-end, the optimum feature vector size and sampling rate are 14 and 14 kHz, respectively.

Keywords

cepstral analysis; hidden Markov models; prediction theory; signal sampling; speech recognition; 12 kHz; 14 kHz; HMM; Mel frequency cepstral coefficients; feature vector size; hidden Markov model; linear prediction cepstral coefficients; sampling rates; speech recognition performance; Australia; Cepstral analysis; Databases; Hidden Markov models; Mel frequency cepstral coefficient; Microelectronics; Sampling methods; Speech analysis; Speech recognition; Testing;

fLanguage

English

Publisher

IEEE

Conference_Title

TENCON '97. IEEE Region 10 Annual Conference. Speech and Image Technologies for Computing and Telecommunications., Proceedings of IEEE

Conference_Location

Brisbane, Qld., Australia

Print_ISBN

0-7803-4365-4

Type

conf

DOI

10.1109/TENCON.1997.647282

Filename

647282