Title :
Adaptive Kalman Filtering and Smoothing for Tracking Vocal Tract Resonances Using a Continuous-Valued Hidden Dynamic Model
Author :
Deng, Li ; Lee, Leo J. ; Attias, Hagai ; Acero, Alex
Author_Institution :
Microsoft Res., Redmond, WA
fDate :
2006
Abstract :
A novel Kalman filtering/smoothing algorithm is presented for efficient and accurate estimation of vocal tract resonances or formants, which are natural frequencies and bandwidths of the resonator from larynx to lips, in fluent speech. The algorithm uses a hidden dynamic model, with a state-space formulation, where the resonance frequency and bandwidth values are treated as continuous-valued hidden state variables. The observation equation of the model is constructed by an analytical predictive function from the resonance frequencies and bandwidths to LPC cepstra as the observation vectors. This nonlinear function is adaptively linearized, and a residual or bias term, which is adaptively trained, is added to the nonlinear function to represent the iteratively reduced piecewise linear approximation error. Details of the piecewise linearization design process are described. An iterative tracking algorithm is presented, which embeds both the adaptive residual training and piecewise linearization design in the Kalman filtering/smoothing framework. Experiments on estimating resonances in Switchboard speech data show accurate estimation results. In particular, the effectiveness of the adaptive residual training is demonstrated. Our approach provides a solution to the traditional "hidden formant problem," and produces meaningful results even during consonantal closures when the supra-laryngeal source may cause no spectral prominences in speech acoustics.
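The observation model summarized in the abstract can be illustrated with a minimal sketch. It assumes the standard all-pole relation between K resonances (frequencies f_k, bandwidths b_k, sampling rate fs) and the LPC cepstrum, C(n) = (2/n) * sum_k exp(-pi*n*b_k/fs) * cos(2*pi*n*f_k/fs). The function names, the random-walk state model, the 8 kHz sampling rate, and the numerical Jacobian are all illustrative assumptions. In particular, the local first-order linearization used here stands in for the paper's piecewise linearization design, and the optional `residual` argument only marks where a trained bias term would enter; its training is not shown.

```python
import numpy as np

def vtr_to_lpc_cepstra(freqs_hz, bws_hz, n_ceps=12, fs=8000.0):
    """Predict LPC cepstra C(1..n_ceps) from resonance frequencies and bandwidths."""
    f = np.asarray(freqs_hz, dtype=float)
    b = np.asarray(bws_hz, dtype=float)
    n = np.arange(1, n_ceps + 1)[:, None]          # cepstral orders, as a column
    terms = np.exp(-np.pi * n * b / fs) * np.cos(2.0 * np.pi * n * f / fs)
    return (2.0 / n[:, 0]) * terms.sum(axis=1)     # sum over the K resonances

def linearize(x, n_ceps=12, fs=8000.0, eps=1.0):
    """Prediction and forward-difference Jacobian at state x = [f_1..f_K, b_1..b_K]."""
    x = np.asarray(x, dtype=float)
    k = x.size // 2
    base = vtr_to_lpc_cepstra(x[:k], x[k:], n_ceps, fs)
    jac = np.empty((n_ceps, x.size))
    for i in range(x.size):
        xp = x.copy()
        xp[i] += eps                               # perturb one state element (Hz)
        jac[:, i] = (vtr_to_lpc_cepstra(xp[:k], xp[k:], n_ceps, fs) - base) / eps
    return base, jac

def kalman_step(x, P, obs, Q, R, n_ceps=12, fs=8000.0, residual=None):
    """One predict/update step with a random-walk state model (an assumption)."""
    x_pred = np.asarray(x, dtype=float)            # random walk: predicted mean unchanged
    P_pred = P + Q
    pred, H = linearize(x_pred, n_ceps, fs)
    if residual is not None:
        pred = pred + residual                     # bias term added to the prediction
    S = H @ P_pred @ H.T + R                       # innovation covariance
    K = np.linalg.solve(S, H @ P_pred).T           # gain P_pred H^T S^-1 (S, P_pred symmetric)
    x_new = x_pred + K @ (obs - pred)
    P_new = (np.eye(x_pred.size) - K @ H) @ P_pred
    return x_new, P_new
```

Running `kalman_step` once per frame over a sequence of LPC cepstral vectors yields a filtered resonance track; a backward smoothing pass and the adaptive residual training described in the abstract would be layered on top of this and are omitted here.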
Keywords :
Kalman filters; adaptive filters; approximation theory; iterative methods; smoothing methods; speech processing; LPC cepstra; adaptive Kalman filtering; adaptive residual training; continuous-valued hidden dynamic model; iterative tracking algorithm; piecewise linear approximation error; piecewise linearization design; speech acoustics; state-space formulation; supra-laryngeal source; vocal tract resonances smoothing; vocal tract resonances tracking; Bandwidth; Filtering algorithms; Frequency estimation; Resonance; Resonant frequency; Resonator filters; Speech; Adaptive piecewise linearization; adaptive residual parameter learning; continuous dynamics; formant analysis; hidden dynamic model; nonlinear prediction; state-space model; vocal tract resonance
Journal_Title :
Audio, Speech, and Language Processing, IEEE Transactions on
DOI :
10.1109/TASL.2006.876724