Regional Information Center for Science and Technology (RICeST) - A speech feature based on Bark frequency warping-the non-uniform linear prediction (NLP) cepstrum

DocumentCode :

3391153

Title :

A speech feature based on Bark frequency warping-the non-uniform linear prediction (NLP) cepstrum

Author :

Kim, Yoon ; Smith, Julius O., III

Author_Institution :

CCRMA, Stanford Univ., CA, USA

fYear :

1999

fDate :

1999

Firstpage :

131

Lastpage :

134

Abstract :

We propose a new method of obtaining features from speech signals for robust analysis and recognition: the non-uniform linear prediction (NLP) cepstrum. The objective is to derive a representation that suppresses speaker-dependent characteristics while preserving the linguistic quality of speech segments. The analysis is based on two principles. First, Bark frequency warping is performed on the LP spectrum to emulate the auditory spectrum. While widely used methods such as mel-frequency and PLP analysis use the FFT spectrum as their basis for warping, the NLP analysis uses the LP-based vocal-tract spectrum with glottal effects removed. Second, all-pole modeling (LP) is used before and after the warping. The pre-warp LP is used to first obtain the vocal-tract spectrum, while the post-warp LP is performed to obtain a smoothed, two-peak model of the warped spectrum. Experiments were conducted to test the effectiveness of the proposed feature in two settings: identification/discrimination of vowels uttered by multiple speakers using linear discriminant analysis (LDA), and frame-based vowel recognition with a statistical model. In both cases, the NLP analysis was shown to be an effective tool for speaker-independent speech analysis/recognition applications.
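
The pipeline outlined in the abstract (pre-warp LP, Bark warping of the LP spectrum, post-warp LP, cepstrum) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the model orders, the FFT size, the Hamming window, the Schroeder-style Bark approximation `6 * asinh(f / 600)`, and the linear interpolation used for warping are all assumptions chosen for illustration, and the glottal-effect removal mentioned in the abstract is not modeled.

```python
import numpy as np

def levinson(r, order):
    """Levinson-Durbin recursion on autocorrelation r[0..order].
    Returns (a, err) for A(z) = 1 + a[1] z^-1 + ... + a[order] z^-order."""
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        k = -np.dot(a[:i], r[i:0:-1]) / err          # reflection coefficient
        a[:i + 1] = a[:i + 1] + k * a[:i + 1][::-1]  # order-update of A(z)
        err *= 1.0 - k * k                           # prediction error power
    return a, err

def lpc_to_cepstrum(a, n_ceps):
    """Cepstrum c[1..n_ceps] of the all-pole model 1/A(z)."""
    p = len(a) - 1
    c = np.zeros(n_ceps + 1)
    for n in range(1, n_ceps + 1):
        acc = sum(k * c[k] * a[n - k] for k in range(max(1, n - p), n))
        c[n] = -(a[n] if n <= p else 0.0) - acc / n
    return c[1:]

def nlp_cepstrum_sketch(frame, fs, pre_order=12, post_order=5,
                        n_ceps=8, n_fft=512):
    """Hypothetical helper: all orders and sizes are illustrative assumptions."""
    # 1) Pre-warp LP (autocorrelation method) on a windowed frame.
    x = frame * np.hamming(len(frame))
    r = np.correlate(x, x, mode="full")[len(x) - 1:len(x) + pre_order]
    a1, e1 = levinson(r, pre_order)

    # 2) LP power spectrum on a uniform grid from 0 to fs/2.
    power = e1 / np.abs(np.fft.rfft(a1, n_fft)) ** 2
    freqs = np.linspace(0.0, fs / 2.0, len(power))

    # 3) Bark warping: resample the spectrum on a grid uniform in Bark.
    #    Assumed Bark approximation: z = 6 * asinh(f / 600).
    z_max = 6.0 * np.arcsinh(freqs[-1] / 600.0)
    z_grid = np.linspace(0.0, z_max, len(power))
    f_warp = 600.0 * np.sinh(z_grid / 6.0)
    warped = np.interp(f_warp, freqs, power)

    # 4) Post-warp LP: autocorrelation is the inverse FFT of the power spectrum.
    r2 = np.fft.irfft(warped)[:post_order + 1]
    a2, _ = levinson(r2, post_order)

    # 5) Cepstrum of the low-order model of the warped spectrum.
    return lpc_to_cepstrum(a2, n_ceps)

# Usage on a synthetic 30 ms frame (two sinusoids plus a little noise).
fs = 16000
t = np.arange(480) / fs
rng = np.random.default_rng(0)
frame = (np.sin(2 * np.pi * 700 * t) + 0.5 * np.sin(2 * np.pi * 1200 * t)
         + 0.01 * rng.standard_normal(t.size))
features = nlp_cepstrum_sketch(frame, fs)
```

The low `post_order` is what would give the smoothed, few-peak model of the warped spectrum; the value used in the paper is not stated in this record, so 5 is only a placeholder.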

Keywords :

hidden Markov models; poles and zeros; prediction theory; spectral analysis; speech intelligibility; speech processing; speech recognition; statistical analysis; transforms; Bark frequency warping; FFT spectrum; LP spectrum; LP-based vocal-tract spectrum; NLP analysis; PLP analysis; all-pole modeling; auditory spectrum; experiments; frame-based HMM vowel recognition; frame-based vowel recognition; linear discriminant analysis; linguistic quality; mel-frequency; non-uniform linear prediction cepstrum; post-warp LP; pre-warp LP; smoothed two-peak model; speaker-dependent characteristics suppression; speaker-independent speech analysis; speaker-independent speech recognition; speech feature; speech segments; speech signals; statistical model; vowel identification/discrimination; Auditory system; Cepstrum; Ear; Frequency; Humans; Interpolation; Linear discriminant analysis; Power harmonic filters; Speech analysis; Speech recognition;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Applications of Signal Processing to Audio and Acoustics, 1999 IEEE Workshop on

Conference_Location :

New Paltz, NY

Print_ISBN :

0-7803-5612-8

Type :

conf

DOI :

10.1109/ASPAA.1999.810867

Filename :

810867

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3391153