Frequency-warping and speaker-normalization

Author

Umesh, S. ; Cohen, L. ; Nelson, D.

Author_Institution

Dept. of Electr. Eng., Indian Inst. of Technol., Kanpur, India

Volume

2

fYear

1997

fDate

21-24 Apr 1997

Firstpage

983

Abstract

We have proposed the use of scale-cepstral coefficients as features in speech recognition. We have developed a corresponding frequency-warping function such that, in the warped domain, the formant envelopes of different speakers are approximately translated versions of one another for any given vowel. These methods were motivated by a desire to achieve speaker normalization. In this paper, we point out interesting parallels between the various steps in computing the scale-cepstrum and those observed in computing features based on physiological models of the auditory system or on psychoacoustic experiments. A better understanding of the need for the various signal-processing steps may therefore be useful and may lead to the development of more robust recognizers.

Keywords

acoustic signal processing; cepstral analysis; hearing; physiological models; speech processing; speech recognition; auditory system; formant envelopes; frequency warping function; psychoacoustic experiments; robust speech recognizers; scale cepstral coefficients; scale cepstrum; signal processing; speaker normalization; speech features; vowel; warped domain; Acoustic scattering; Concurrent computing; Educational institutions; Fourier transforms; Frequency dependence; Psychoacoustic models; Psychology; Robustness

fLanguage

English

Publisher

IEEE

Conference_Titel

Acoustics, Speech, and Signal Processing, 1997. ICASSP-97., 1997 IEEE International Conference on

Conference_Location

Munich

ISSN

1520-6149

Print_ISBN

0-8186-7919-0

Type

conf

DOI

10.1109/ICASSP.1997.596104

Filename

596104