• DocumentCode
    394753
  • Title
    Audiovisual-based adaptive speaker identification
  • Author
    Li, Ying ; Narayanan, Shrikanth ; Kuo, C.-C. Jay
  • Author_Institution
    Dept. of Electr. Eng., Univ. of Southern California, Los Angeles, CA, USA
  • Volume
    5
  • fYear
    2003
  • fDate
    6-10 April 2003
  • Abstract
    An adaptive speaker identification system is presented in this paper, which aims to recognize speakers in feature films by exploiting both audio and visual cues. Specifically, the audio source is first analyzed to identify speakers using a likelihood-based approach. Meanwhile, the visual source is parsed to recognize talking faces using face detection/recognition and mouth tracking techniques. These two information sources are then integrated under a probabilistic framework for improved system performance. Moreover, to account for speakers' voice variations along time, we update their acoustic models on the fly by adapting to their newly contributed speech data. An average of 80% identification accuracy has been achieved on two test movies. This shows a promising future for the proposed audiovisual-based adaptive speaker identification approach.
  • Keywords
    adaptive signal processing; audio signal processing; face recognition; maximum likelihood estimation; probability; speaker recognition; video signal processing; acoustic model updating; adaptive speaker identification; audio cues; audio source analysis; audiovisual-based speaker identification; face detection/recognition; feature films; identification accuracy; likelihood-based approach; mouth tracking; probabilistic framework; system performance; talking faces; visual cues; visual source parsing; voice variations; Adaptive systems; Databases; Face detection; Face recognition; Loudspeakers; Motion pictures; Mouth; Speech; System performance; Training data
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03). 2003 IEEE International Conference on
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-7663-3
  • Type
    conf
  • DOI
    10.1109/ICASSP.2003.1200095
  • Filename
    1200095
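
The abstract outlines three steps: likelihood-based audio scoring, probabilistic fusion with visual talking-face evidence, and on-the-fly acoustic model adaptation. The record gives no formulas, so the following is only a minimal sketch under stated assumptions, not the authors' method: each speaker is modeled by a single diagonal Gaussian, fusion is a weighted log-linear combination, and adaptation is a MAP-style mean interpolation. The names `log_likelihood`, `fuse`, `adapt_mean`, `weight`, and `relevance`, as well as the speaker labels and all numbers, are hypothetical.

```python
import math


def log_likelihood(frames, mean, var):
    """Sum of diagonal-Gaussian log-densities over all feature frames."""
    total = 0.0
    for frame in frames:
        for x, m, v in zip(frame, mean, var):
            total += -0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
    return total


def fuse(audio_loglik, visual_prob, weight=0.5):
    """Assumed log-linear fusion of audio and visual scores.

    Returns normalized posteriors over speakers. `weight` balances the
    audio term against the visual term and is an illustrative parameter.
    """
    scores = {
        s: weight * audio_loglik[s]
        + (1 - weight) * math.log(max(visual_prob[s], 1e-12))
        for s in audio_loglik
    }
    top = max(scores.values())  # subtract the max for numerical stability
    expd = {s: math.exp(v - top) for s, v in scores.items()}
    norm = sum(expd.values())
    return {s: e / norm for s, e in expd.items()}


def adapt_mean(mean, frames, relevance=4.0):
    """Assumed MAP-style update: interpolate old mean with the segment mean."""
    n = len(frames)
    seg_mean = [sum(col) / n for col in zip(*frames)]
    alpha = n / (n + relevance)  # more new data -> trust the segment more
    return [alpha * s + (1 - alpha) * m for s, m in zip(seg_mean, mean)]


# Hypothetical speaker models: (mean, variance) per feature dimension.
models = {
    "alice": ([0.0, 0.0], [1.0, 1.0]),
    "bob": ([3.0, 3.0], [1.0, 1.0]),
}
# Hypothetical feature frames from one speech segment.
segment = [[0.2, -0.1], [0.4, 0.3], [0.1, 0.0], [0.3, 0.2]]
# Hypothetical talking-face probabilities from the visual channel.
visual = {"alice": 0.7, "bob": 0.3}

audio = {s: log_likelihood(segment, m, v) for s, (m, v) in models.items()}
posterior = fuse(audio, visual)
best = max(posterior, key=posterior.get)

# Adapt only the identified speaker's model with the new segment.
new_mean = adapt_mean(models[best][0], segment)
models[best] = (new_mean, models[best][1])
```

Updating only the top-ranked speaker's model keeps the sketch short; a real system would likely gate the update on a confidence threshold so that misidentified segments do not corrupt a model.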