• DocumentCode
    394264
  • Title
    The ICSI Meeting Corpus
  • Author
    Janin, Adam ; Baron, Don ; Edwards, Jane ; Ellis, Dan ; Gelbart, David ; Morgan, Nelson ; Peskin, Barbara ; Pfau, Thilo ; Shriberg, Elizabeth ; Stolcke, Andreas ; Wooters, Chuck
  • Author_Institution
    Int. Comput. Sci. Inst., Berkeley, CA, USA
  • Volume
    1
  • fYear
    2003
  • fDate
    6-10 April 2003
  • Abstract
    We have collected a corpus of data from natural meetings that occurred at the International Computer Science Institute (ICSI) in Berkeley, California over the last three years. The corpus contains audio recorded simultaneously from head-worn and table-top microphones, word-level transcripts of meetings, and various metadata on participants, meetings, and hardware. Such a corpus supports work in automatic speech recognition, noise robustness, dialog modeling, prosody, rich transcription, information retrieval, and more. We present details on the contents of the corpus, as well as rationales for the decisions that led to its configuration. The corpus was delivered to the Linguistic Data Consortium (LDC).
  • Keywords
    audio recording; microphones; speech processing; speech recognition; Berkeley; California; ICSI; ICSI Meeting Corpus; International Computer Science Institute; Linguistic Data Consortium; audio recordings; automatic speech recognition; data corpus; dialog modeling; head-worn microphones; information retrieval; noise robustness; prosody; table-top microphones; transcription
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03). 2003 IEEE International Conference on
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-7663-3
  • Type
    conf
  • DOI
    10.1109/ICASSP.2003.1198793
  • Filename
    1198793