DocumentCode
310561
Title
Frequency-warping and speaker-normalization
Author
Umesh, S. ; Cohen, L. ; Nelson, D.
Author_Institution
Dept. of Electr. Eng., Indian Inst. of Technol., Kanpur, India
Volume
2
fYear
1997
fDate
21-24 Apr 1997
Firstpage
983
Abstract
We have proposed the use of scale-cepstral coefficients as features in speech recognition. We have developed a corresponding frequency-warping function such that, in the warped domain, the formant envelopes of different speakers are approximately translated versions of one another for any given vowel. These methods were motivated by a desire to achieve speaker-normalization. In this paper, we point out interesting parallels between the various steps in computing the scale-cepstrum and those observed in computing features based on physiological models of the auditory system or on psychoacoustic experiments. A better understanding of the need for the various signal-processing steps may therefore be useful, and may result in the development of more robust recognizers.
Keywords
acoustic signal processing; cepstral analysis; hearing; physiological models; speech processing; speech recognition; auditory system; formant envelopes; frequency warping function; physiological models; psychoacoustic experiments; robust speech recognizers; scale cepstral coefficients; scale cepstrum; signal processing; speaker normalization; speech features; speech recognition; vowel; warped domain; Acoustic scattering; Auditory system; Concurrent computing; Educational institutions; Fourier transforms; Frequency dependence; Psychoacoustic models; Psychology; Robustness; Speech recognition;
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech, and Signal Processing, 1997. ICASSP-97., 1997 IEEE International Conference on
Conference_Location
Munich
ISSN
1520-6149
Print_ISBN
0-8186-7919-0
Type
conf
DOI
10.1109/ICASSP.1997.596104
Filename
596104