DocumentCode
730727
Title
Joint estimation of vocal tract and nasal tract area functions from speech waveforms via auto-regression moving-average modeling and a pole assignment method
Author
Shang-Hsuan Peng ; Chao-Wen Li ; Yi-Wen Liu
Author_Institution
Dept. Electr. Eng., Nat. Tsing Hua Univ., Hsinchu, Taiwan
fYear
2015
fDate
19-24 April 2015
Firstpage
4644
Lastpage
4648
Abstract
Nasal resonance is used in certain languages to differentiate word meanings. The joint filtering effect of the vocal tract and the nasal tract can be modeled by the auto-regression moving-average (ARMA) approach. However, unlike in all-pole (i.e., AR) modeling, it has been difficult to derive the equivalent vocal-tract area function directly from an ARMA model because the relation between the model coefficients and the vocal-tract geometry is nonlinear. In this paper, we propose a method to decompose an ARMA model approximately into α/C(z) + β/D(z); in our context, 1/C(z) and 1/D(z) represent the filtering effects of the oral and the nasal tract, respectively. Once the decomposition is performed, equivalent oral-tract and nasal-tract area functions can be obtained by converting C(z) and D(z) to their respective lattice representations. The proposed method was applied to non-nasalized and nasalized vowels produced by three speakers, and it was found that the ratio r = β/α tends to be higher in nasalized vowels than in their non-nasalized counterparts. The vocal-tract area function estimated by the present approach was also fairly stable for sustained vowels.
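The two algebraic steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation and does not cover the pole assignment step itself. It only shows (a) that α/C(z) + β/D(z) has numerator αD(z) + βC(z) and denominator C(z)D(z), and (b) the standard step-down recursion that turns an all-pole polynomial into lattice (reflection) coefficients. All function names are hypothetical. The area rule A[m+1] = A[m](1 - k)/(1 + k) is one common lossless-tube convention and is an assumption here; its sign and the direction of the tube ordering vary between texts.

```python
def poly_mul(p, q):
    """Multiply two polynomials in z^-1 given as coefficient lists."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out


def combine_branches(alpha, c, beta, d):
    """Return (numerator, denominator) of alpha/C(z) + beta/D(z)."""
    n = max(len(c), len(d))
    c_pad = list(c) + [0.0] * (n - len(c))
    d_pad = list(d) + [0.0] * (n - len(d))
    # Numerator is alpha*D(z) + beta*C(z); denominator is C(z)*D(z).
    num = [alpha * dv + beta * cv for cv, dv in zip(c_pad, d_pad)]
    return num, poly_mul(c, d)


def step_up(ks):
    """Build [1, a1, ..., ap] from reflection coefficients k1..kp."""
    a = [1.0]
    for k in ks:
        prev = a + [0.0]
        last = len(prev) - 1
        a = [prev[i] + k * prev[last - i] for i in range(len(prev))]
    return a


def step_down(a):
    """Recover reflection coefficients k1..kp from [a0, a1, ..., ap]."""
    a = [v / a[0] for v in a]  # normalize so the leading coefficient is 1
    ks = []
    for m in range(len(a) - 1, 0, -1):
        k = a[m]
        if abs(k) >= 1.0:
            raise ValueError("polynomial is not minimum phase")
        ks.append(k)
        # Inverse of the step-up update, giving the order m-1 polynomial.
        a = [(a[i] - k * a[m - i]) / (1.0 - k * k) for i in range(m)]
    ks.reverse()
    return ks


def area_function(ks, first_area=1.0):
    """Tube areas under the ASSUMED rule A[m+1] = A[m] * (1 - k) / (1 + k)."""
    areas = [first_area]
    for k in ks:
        areas.append(areas[-1] * (1.0 - k) / (1.0 + k))
    return areas
```

For example, `step_down(step_up([0.5, -0.3, 0.2]))` recovers the original reflection coefficients up to rounding. Applying `step_down` followed by `area_function` to C(z) and to D(z) separately would give one relative area profile per branch, in the spirit of the oral and nasal area functions described above.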
Keywords
approximation theory; autoregressive moving average processes; computational geometry; filtering theory; lattice theory; pole assignment; speaker recognition; speech coding; ARMA model; autoregression moving-average modeling; equivalent oral-tract area functions; joint filtering effect; joint nasal tract area function estimation; joint vocal tract area function estimation; lattice representation; model coefficients; nasal resonance; nasalized vowels; nonnasalized vowels; pole assignment method; speech coding; speech waveforms; vocal-tract geometry; word meaning differentiation; Acoustics; Atmospheric modeling; Electron tubes; Estimation; Lattices; Speech; Transfer functions; ARMA modeling; Speech; nasalization; vocal-tract area function;
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on
Conference_Location
South Brisbane, QLD
Type
conf
DOI
10.1109/ICASSP.2015.7178851
Filename
7178851