DocumentCode :
3391153
Title :
A speech feature based on Bark frequency warping - the non-uniform linear prediction (NLP) cepstrum
Author :
Kim, Yoon ; Smith, Julius O., III
Author_Institution :
CCRMA, Stanford Univ., CA, USA
fYear :
1999
fDate :
1999
Firstpage :
131
Lastpage :
134
Abstract :
We propose a new method of obtaining features from speech signals for robust analysis and recognition: the non-uniform linear prediction (NLP) cepstrum. The objective is to derive a representation that suppresses speaker-dependent characteristics while preserving the linguistic quality of speech segments. The analysis is based on two principles. First, Bark frequency warping is performed on the LP spectrum to emulate the auditory spectrum. While widely used methods such as mel-frequency and PLP analysis use the FFT spectrum as their basis for warping, the NLP analysis uses the LP-based vocal-tract spectrum with glottal effects removed. Second, all-pole modeling (LP) is used before and after the warping. The pre-warp LP is used to first obtain the vocal-tract spectrum, while the post-warp LP is performed to obtain a smoothed, two-peak model of the warped spectrum. Experiments were conducted to test the effectiveness of the proposed feature in two cases: identification/discrimination of vowels uttered by multiple speakers using linear discriminant analysis (LDA), and frame-based vowel recognition with a statistical model. In both cases, the NLP analysis was shown to be an effective tool for speaker-independent speech analysis/recognition applications.
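The pipeline outlined in the abstract (pre-warp LP, Bark warping of the LP spectrum, post-warp LP, cepstrum) might be sketched roughly as follows. This is an illustrative reading of the abstract, not the authors' implementation. The function names, the model orders, the Hamming window, the Zwicker-style Bark approximation, and the use of linear interpolation for the warping are all assumptions, and the abstract's glottal-effect removal step is not modeled. The default post-warp order of 4 is chosen only because two spectral peaks correspond to two conjugate pole pairs.

```python
import numpy as np

def levinson(r, order):
    """Levinson-Durbin: autocorrelation r[0..order] -> A(z) = 1 + sum a_k z^-k."""
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err                      # reflection coefficient
        a[1:i + 1] = a[1:i + 1] + k * a[i - 1::-1]
        err *= 1.0 - k * k                  # updated prediction error power
    return a, err

def lp_power_spectrum(a, gain, n_freq):
    """All-pole power spectrum gain / |A(e^jw)|^2 on n_freq bins from 0 to pi."""
    A = np.fft.rfft(a, 2 * (n_freq - 1))
    return gain / np.abs(A) ** 2

def hz_to_bark(f):
    """Zwicker-style Bark approximation (an assumed choice of warping curve)."""
    return 13.0 * np.arctan(0.00076 * f) + 3.5 * np.arctan((f / 7500.0) ** 2)

def lp_to_cepstrum(a, n_ceps):
    """Cepstrum c_1..c_n of 1/A(z) via the standard LP-to-cepstrum recursion."""
    p = len(a) - 1
    c = np.zeros(n_ceps + 1)
    for n in range(1, n_ceps + 1):
        acc = sum(k * c[k] * a[n - k] for k in range(max(1, n - p), n))
        c[n] = -(a[n] if n <= p else 0.0) - acc / n
    return c[1:]

def nlp_cepstrum_sketch(frame, fs, pre_order=12, post_order=4,
                        n_ceps=8, n_freq=257):
    """Hypothetical NLP-cepstrum-like feature for one speech frame."""
    # Pre-warp LP: all-pole fit to the windowed frame.
    x = frame * np.hamming(len(frame))
    r = np.correlate(x, x, mode="full")[len(x) - 1:len(x) + pre_order]
    a_pre, g_pre = levinson(r, pre_order)
    spec = lp_power_spectrum(a_pre, g_pre, n_freq)

    # Bark warping: resample the LP spectrum on a grid uniform in Bark.
    bark = hz_to_bark(np.linspace(0.0, fs / 2.0, n_freq))
    grid = np.linspace(bark[0], bark[-1], n_freq)
    warped = np.interp(grid, bark, spec)

    # Post-warp LP: autocorrelation of the warped spectrum, then low-order fit.
    r_w = np.fft.irfft(warped)[:post_order + 1]
    a_post, _ = levinson(r_w, post_order)
    return lp_to_cepstrum(a_post, n_ceps)
```

Under these assumptions, calling `nlp_cepstrum_sketch(frame, 16000)` on a frame of a few hundred samples returns an 8-element feature vector. Vectors of this kind could then be passed to LDA or a frame-based statistical recognizer, as in the experiments the abstract mentions.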
Keywords :
hidden Markov models; poles and zeros; prediction theory; spectral analysis; speech intelligibility; speech processing; speech recognition; statistical analysis; transforms; Bark frequency warping; FFT spectrum; LP spectrum; LP-based vocal-tract spectrum; NLP analysis; PLP analysis; all-pole modeling; auditory spectrum; experiments; frame-based HMM vowel recognition; frame-based vowel recognition; linear discriminant analysis; linguistic quality; mel-frequency; non-uniform linear prediction cepstrum; post-warp LP; pre-warp LP; smoothed two-peak model; speaker-dependent characteristics suppression; speaker-independent speech analysis; speaker-independent speech recognition; speech feature; speech segments; speech signals; statistical model; vowel identification/discrimination; Auditory system; Cepstrum; Ear; Frequency; Humans; Interpolation; Linear discriminant analysis; Power harmonic filters; Speech analysis; Speech recognition;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Applications of Signal Processing to Audio and Acoustics, 1999 IEEE Workshop on
Conference_Location :
New Paltz, NY
Print_ISBN :
0-7803-5612-8
Type :
conf
DOI :
10.1109/ASPAA.1999.810867
Filename :
810867