• DocumentCode
    698382
  • Title

    Lip feature extraction based on audio-visual correlation

  • Author

    Sargin, M.E. ; Erzin, E. ; Yemez, Y. ; Tekalp, A.M.

  • Author_Institution
    Multimedia Vision & Graphics Lab., Koc Univ., Istanbul, Turkey
  • fYear
    2005
  • fDate
    4-8 Sept. 2005
  • Firstpage
    1
  • Lastpage
    4
  • Abstract
    This paper investigates which lip feature has the highest correlation with audio features. The audio features are the Mel Frequency Cepstral Coefficients (MFCC) of the audio signal. Three lip features are considered as visual lip information: the 2D DCT coefficients of the intensity-based image and the optical flow vectors within the lip region, and the distances between pre-defined points on the lip contour, which carry the lip shape information. We present two techniques, based on class-conditional probability analysis and canonical correlation analysis, to estimate and compare the correlation between the audio features and each lip feature. The lip feature with the highest correlation to the audio features is identified among these three. Isolating lip features that are highly correlated with the audio signal is useful for audio-visual speech recognition, audio-visual lip synchronization, and estimation of lip shapes from the audio signal for visual synthesis.
  • Keywords
    audio signal processing; correlation methods; discrete cosine transforms; feature extraction; image sequences; probability; speech recognition; vectors; 2D DCT coefficient; MFCC; audio signal; audio-visual correlation; audio-visual lip synchronization; audio-visual speech recognition; canonical correlation analysis; lip feature extraction; lip feature isolation; lip shape estimation; lip shape information; mel frequency cepstral coefficient; optical flow vector; probability analysis; visual lip information; visual synthesis; Correlation; Discrete cosine transforms; Estimation; Feature extraction; Optical imaging; Vectors; Visualization;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Signal Processing Conference, 2005 13th European
  • Conference_Location
    Antalya
  • Print_ISBN
    978-160-4238-21-1
  • Type

    conf

  • Filename
    7077967
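
The canonical correlation analysis named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the data below are random stand-ins (a 13-dimensional array in place of MFCC vectors and two hypothetical 4-dimensional lip features), and the function name `canonical_correlations` and the `reg` parameter are assumptions made for this example. It only shows how a feature that depends on the audio yields larger canonical correlations than an unrelated one.

```python
import numpy as np


def canonical_correlations(X, Y, reg=1e-8):
    """Return the canonical correlations between X (n x p) and Y (n x q).

    Rows are time-aligned frames. `reg` is a small ridge term (an
    assumption of this sketch) that keeps the covariances invertible.
    """
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]

    # Sample covariance and cross-covariance matrices.
    Sxx = X.T @ X / (n - 1) + reg * np.eye(X.shape[1])
    Syy = Y.T @ Y / (n - 1) + reg * np.eye(Y.shape[1])
    Sxy = X.T @ Y / (n - 1)

    # Whiten both sides with Cholesky factors: M = Lx^{-1} Sxy Ly^{-T}.
    Lx = np.linalg.cholesky(Sxx)
    Ly = np.linalg.cholesky(Syy)
    M = np.linalg.solve(Lx, Sxy)
    M = np.linalg.solve(Ly, M.T).T

    # The singular values of M are the canonical correlations.
    s = np.linalg.svd(M, compute_uv=False)
    return np.clip(s, 0.0, 1.0)


# Synthetic stand-ins, not real speech data.
rng = np.random.default_rng(0)
n_frames = 500
audio = rng.standard_normal((n_frames, 13))        # placeholder for MFCC vectors

# Hypothetical lip feature that depends linearly on the audio, plus noise.
W = rng.standard_normal((13, 4))
lip_related = audio @ W + 0.1 * rng.standard_normal((n_frames, 4))

# Hypothetical lip feature that is independent of the audio.
lip_noise = rng.standard_normal((n_frames, 4))

rho_related = canonical_correlations(audio, lip_related)
rho_noise = canonical_correlations(audio, lip_noise)

print("related feature:", np.round(rho_related, 3))
print("unrelated feature:", np.round(rho_noise, 3))
```

Comparing the leading canonical correlation (or the sum of all of them) across candidate lip features gives one way to rank them by their correlation with the audio; which summary statistic the paper uses is not stated in this record.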