Title :
Exploiting temporal change of pitch in formant estimation
Author :
Wang, Tianyu T. ; Quatieri, Thomas F.
Author_Institution :
Lincoln Lab., Speech & Hearing Biosci. & Technol. Program, MIT, Hanscom AFB, MA
fDate :
March 31 2008-April 4 2008
Abstract :
This paper considers the problem of obtaining an accurate spectral representation of speech formant structure when the voicing source exhibits a high fundamental frequency. Our work is inspired by auditory perception and physiological modeling studies implicating the use of temporal changes in speech by humans. Specifically, we develop and assess signal processing schemes aimed at exploiting temporal change of pitch as a basis for formant estimation. Our methods are cast in a generalized framework of two-dimensional processing of speech and show quantitative improvements under certain conditions over representations derived from traditional and homomorphic linear prediction. We conclude by highlighting potential benefits of our framework for speaker recognition, with preliminary results indicating closure of the performance gender gap on subsets of the TIMIT corpus.
Keywords :
signal representation; speech processing; 2D speech processing; auditory perception; formant estimation; physiological modeling; pitch; signal processing; spectral representation; speech formant structure; temporal change; Auditory system; Frequency conversion; Frequency estimation; Humans; Laboratories; Power harmonic filters; Speaker recognition; Speech processing; Time frequency analysis; Two dimensional displays; auditory modeling; effects of pitch; formant estimation; source-filter model; speaker recognition;
Conference_Title :
Acoustics, Speech and Signal Processing, 2008. ICASSP 2008. IEEE International Conference on
Conference_Location :
Las Vegas, NV
Print_ISBN :
978-1-4244-1483-3
Electronic_ISBN :
1520-6149
DOI :
10.1109/ICASSP.2008.4518513