• DocumentCode
    706303
  • Title
    Two novel visual voice activity detectors based on appearance models and retinal filtering
  • Author
    Aubrey, Andrew; Rivet, Bertrand; Hicks, Yulia; Girin, Laurent; Chambers, Jonathon; Jutten, Christian
  • Author_Institution
    Centre of Digital Signal Process., Cardiff Univ., Cardiff, UK
  • fYear
    2007
  • fDate
    3-7 Sept. 2007
  • Firstpage
    2409
  • Lastpage
    2413
  • Abstract
    In this paper we present two novel methods for visual voice activity detection (V-VAD) which exploit the bimodality of speech (i.e. the coherence between a speaker's lips and the resulting speech). The first method uses appearance parameters of a speaker's lips, obtained from an active appearance model (AAM). An HMM then dynamically models the change in appearance over time. The second method uses a retinal filter on the region of the lips to extract the required parameter. Each method is applied in turn to a single-speaker corpus and used to classify voice activity as speech or non-speech. The efficiency of each method is evaluated individually using receiver operating characteristics, and their respective performances are then compared and discussed. Both methods achieve a high correct silence detection rate for a small false detection rate.
  • Keywords
    filtering theory; hidden Markov models; object detection; speech processing; AAM; HMM; V-VAD; active appearance model; hidden Markov models; parameter extraction; receiver operating characteristics; retinal filtering; single speaker lips; small false detection rate; speech bimodality; visual voice activity detection; voice activity classification; Active appearance model; Hidden Markov models; Lips; Retina; Shape; Speech; Visualization;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Signal Processing Conference, 2007 15th European
  • Conference_Location
    Poznan
  • Print_ISBN
    978-839-2134-04-6
  • Type
    conf
  • Filename
    7099240