• DocumentCode
    2239054
  • Title

    Automatic Speech Activity Detection, Source Localization, and Speech Recognition on the CHIL Seminar Corpus

  • Author

    Macho, Dusan ; Padrell, Jaume ; Abad, Alberto ; Nadeu, Climent ; Hernando, Javier ; McDonough, John ; Wolfel, Matthias ; Klee, Ulrich ; Omologo, Maurizio ; Brutti, Alessio ; Svaizer, Piergiorgio ; Potamianos, Gerasimos ; Chu, Stephen M.

  • Author_Institution
    TALP Res. Center, Univ. Politecnica de Catalunya, Barcelona
  • fYear
    2005
  • fDate
    6-6 July 2005
  • Firstpage
    876
  • Lastpage
    879
  • Abstract
    To realize the long-term goal of ubiquitous computing, technological advances in multi-channel acoustic analysis are needed in order to solve several basic problems, including speaker localization and tracking, speech activity detection (SAD), and distant-talking automatic speech recognition (ASR). The European Commission integrated project CHIL, "Computers in the Human Interaction Loop", aims to make significant advances in these three technologies. In this work, we report the results of our initial automatic source localization, speech activity detection, and speech recognition experiments on the CHIL seminar corpus, which consists of spontaneous speech collected by both near- and far-field microphones. In addition to the audio sensors, the seminars were also recorded by calibrated video cameras. This simultaneous audio-visual data capture enables the realistic evaluation of component technologies as was never possible with earlier databases.
  • Keywords
    audio signal processing; calibration; human computer interaction; microphone arrays; speaker recognition; video signal processing; CHIL seminar corpus; European Commission integrated project; audio sensor; audio-visual data capture; automatic SAD; automatic speech recognition; distant-talking ASR; far-field microphone; human-computer interaction loop; near-field microphone; speaker localization; speech activity detection; tracking model; video camera calibration; Acoustic signal detection; Automatic speech recognition; Humans; Loudspeakers; Microphones; Pervasive computing; Seminars; Speech analysis; Speech recognition; Ubiquitous computing;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Multimedia and Expo, 2005. ICME 2005. IEEE International Conference on
  • Conference_Location
    Amsterdam
  • Print_ISBN
    0-7803-9331-7
  • Type

    conf

  • DOI
    10.1109/ICME.2005.1521563
  • Filename
    1521563