Title :
Cross-modal prediction in audio-visual communication
Author :
Rao, Rohini R. ; Chen, Tsuhan
Author_Institution :
Georgia Inst. of Technol., Atlanta, GA, USA
Abstract :
We present a novel means for predicting the shape of a person's mouth from the corresponding speech signal and explore applications of this prediction to video coding. The prediction is accomplished by modeling the probability distribution of the audiovisual features by a Gaussian mixture density. The optimal estimate for the visual features given the acoustic features can then be computed using this probability distribution. The ability to predict a person's mouth shape from the corresponding audio leads to a number of interesting joint audio-video coding strategies. In the cross-modal predictive coding system described, a model-based video coder compares measured visual parameters with predicted visual parameters, and sends the difference between the two to the receiver. Since the decoder also receives the acoustic data, it can form the prediction and then reconstruct the original parameters by adding the transmitted error signal.
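The prediction and coding loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes scalar acoustic and visual features, a hand-picked two-component mixture, and hypothetical names (`COMPONENTS`, `predict_visual`, `encode`, `decode`). The estimate used is the conditional mean of the visual feature given the acoustic feature under a joint Gaussian mixture, which is the minimum mean-square-error estimate for that model.

```python
import math

# Hypothetical two-component joint mixture over scalar features
# (a = acoustic, v = visual). All numbers are illustrative only.
COMPONENTS = [
    {"w": 0.5, "mu_a": 0.0, "mu_v": 1.0, "var_a": 1.0, "cov_av": 0.5},
    {"w": 0.5, "mu_a": 4.0, "mu_v": 3.0, "var_a": 1.0, "cov_av": -0.5},
]


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def predict_visual(a, components=COMPONENTS):
    """Conditional mean E[v | a] under the joint Gaussian mixture."""
    # Unnormalized posterior probability of each component given a.
    likes = [c["w"] * gaussian_pdf(a, c["mu_a"], c["var_a"]) for c in components]
    total = sum(likes)
    estimate = 0.0
    for c, like in zip(components, likes):
        # Per-component linear regression of v on a.
        cond_mean = c["mu_v"] + c["cov_av"] / c["var_a"] * (a - c["mu_a"])
        estimate += (like / total) * cond_mean
    return estimate


def encode(a, v_measured):
    """Encoder side: send only the prediction error."""
    return v_measured - predict_visual(a)


def decode(a, residual):
    """Decoder side: rebuild the visual parameter from audio plus residual."""
    return predict_visual(a) + residual
```

Because encoder and decoder both form the same prediction from the acoustic feature, `decode(a, encode(a, v))` recovers `v` up to floating-point rounding. Real features would be vectors, with the per-component regression using covariance matrices in place of the scalar ratio.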
Keywords :
Gaussian distribution; acoustic signal processing; audio coding; audio-visual systems; correlation methods; decoding; parameter estimation; prediction theory; speech processing; video coding; Gaussian mixture density; acoustic data; acoustic features; audio-visual communication; audio-visual correlation; audiovisual features; cross-modal prediction; cross-modal predictive coding system; decoder; joint audio-video coding; measured visual parameters; model based video coder; mouth shape prediction; nonlinear estimation; optimal estimate; parameters reconstruction; predicted visual parameters; probability distribution; receiver; speech signal; transmitted error signal; video coding; visual features; Acoustic measurements; Decoding; Distributed computing; Mouth; Predictive coding; Predictive models; Probability distribution; Shape; Speech coding; Video coding;
Conference_Title :
Acoustics, Speech, and Signal Processing, 1996. ICASSP-96. Conference Proceedings., 1996 IEEE International Conference on
Conference_Location :
Atlanta, GA
Print_ISBN :
0-7803-3192-3
DOI :
10.1109/ICASSP.1996.545722