DocumentCode :
661270
Title :
Vocal tract length estimation for voiced and whispered speech using gammachirp filterbank
Author :
Irino, Toshio ; Okamoto, Eiji ; Nisimura, Ryuichi ; Kawahara, H.
Author_Institution :
Fac. of Syst. Eng., Wakayama Univ., Wakayama, Japan
fYear :
2013
fDate :
Oct. 29 2013-Nov. 1 2013
Firstpage :
1
Lastpage :
4
Abstract :
In this paper, we demonstrate an auditory spectrogram based on a dynamic compressive gammachirp filterbank (GCFB) that enables accurate and robust estimation of vocal tract length (VTL) for both voiced and whispered speech. Normalized VTLs of 21 speakers were derived by least-squares analysis of their VTL ratios (for all permutations, 21P2 = 21 × 20 = 420), which were estimated by minimizing spectral distances in the auditory spectrograms. The calculation was restricted to a selected frequency range, and the range from 500 to 5000 Hz proved most reasonable for both speech modes. The method based on GCFB performed better than the one based on the mel-frequency filterbank (MFFB). The estimated VTLs were compared with VTL data measured by MRI to confirm their reliability.
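The least-squares step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each pairwise estimate approximates VTL_i / VTL_j, works in the log domain so that ratios become differences, and normalizes the result to a geometric mean of 1. The function name `normalized_vtl_from_ratios`, the dictionary layout, and the four example VTL values are hypothetical, chosen only for illustration.

```python
import itertools
import numpy as np


def normalized_vtl_from_ratios(ratios):
    """Recover normalized VTLs from pairwise ratio estimates.

    ratios: dict mapping an ordered speaker pair (i, j) to an estimate of
    VTL_i / VTL_j (assumed input format, not taken from the paper).
    Returns an array of VTLs scaled so that their geometric mean is 1.
    """
    n = 1 + max(max(i, j) for i, j in ratios)
    A = np.zeros((len(ratios), n))
    b = np.zeros(len(ratios))
    for row, ((i, j), r) in enumerate(ratios.items()):
        # log(VTL_i / VTL_j) = x_i - x_j, with x = log(VTL)
        A[row, i] = 1.0
        A[row, j] = -1.0
        b[row] = np.log(r)
    # The system only fixes differences of x; lstsq returns the
    # minimum-norm solution, and we re-center explicitly for clarity.
    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    x -= x.mean()
    return np.exp(x)


# Synthetic check with four made-up VTL values (in cm) and exact ratios.
true_vtl = np.array([14.0, 15.5, 17.0, 16.2])
pairs = list(itertools.permutations(range(len(true_vtl)), 2))
ratios = {(i, j): true_vtl[i] / true_vtl[j] for i, j in pairs}
est = normalized_vtl_from_ratios(ratios)
```

With noise-free ratios the estimate matches the true values up to a common scale factor; with noisy ratios the same call returns the least-squares compromise across all ordered pairs. For 21 speakers the pair list has 21 × 20 = 420 entries, which matches the count quoted in the abstract.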
Keywords :
least squares approximations; spectrometers; speech processing; auditory spectrograms; dynamic compressive gammachirp filterbank; least squared analysis; spectral distances; vocal tract length estimation; voiced speech; whispered speech; Estimation error; Frequency estimation; Magnetic resonance imaging; Spectrogram; Speech; Vectors;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Signal and Information Processing Association Annual Summit and Conference (APSIPA), 2013 Asia-Pacific
Conference_Location :
Kaohsiung
Type :
conf
DOI :
10.1109/APSIPA.2013.6694131
Filename :
6694131