DocumentCode
417674
Title
Characterization and extraction of mouth opening parameters available for audiovisual speech enhancement
Author
Berthommier, Frédéric
Author_Institution
Inst. de la Commun. Parlee, Inst. Nat. Polytech. de Grenoble, France
Volume
3
fYear
2004
fDate
17-21 May 2004
Abstract
The strong association between audio subband envelope parameters and video parameters extracted using the full DCT (discrete cosine transform) can be exploited for audiovisual speech enhancement, because a statistical model predicts the amplitude variations well. Since the video parameter space is highly multidimensional, the causality of this association must be clarified. First, a new method of retro-marking is proposed in order to build a function that transforms DCT parameters into explicit ABS mouth opening parameters. Second, a reduction to single-parameter spaces is performed by selecting the best parameters. We show in two noisy conditions that the degradation of the enhancement performance due to the transformation and to the reduction is moderate.
Keywords
audio signal processing; audio-visual systems; discrete cosine transforms; feature extraction; speech enhancement; video signal processing; DCT; audio envelope parameters; audio subband envelope parameters; audiovisual speech enhancement; discrete cosine transform; mouth opening parameter characterization; mouth opening parameter extraction; retro-marking; video parameters; Audio databases; Discrete cosine transforms; Filter bank; Image databases; Mouth; Noise level; Spatial databases; Speech enhancement; Speech recognition; Testing;
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech, and Signal Processing, 2004. Proceedings. (ICASSP '04). IEEE International Conference on
ISSN
1520-6149
Print_ISBN
0-7803-8484-9
Type
conf
DOI
10.1109/ICASSP.2004.1326663
Filename
1326663