  • DocumentCode
    2951363
  • Title
    Approaches and applications of audio diarization
  • Author
    Reynolds, D.A. ; Torres-Carrasquillo, P.
  • Author_Institution
    Lincoln Lab., MIT, Lexington, MA, USA
  • Volume
    5
  • fYear
    2005
  • fDate
    18-23 March 2005
  • Abstract
    Audio diarization is the process of annotating an input audio channel with information that attributes (possibly overlapping) temporal regions of signal energy to their specific sources. These sources can include particular speakers, music, background noise sources, and other signal source/channel characteristics. Diarization has utility in making automatic transcripts more readable and in searching and indexing audio archives. In this paper, we provide an overview of current audio diarization approaches and discuss performance and potential applications. We outline the general framework of diarization systems and present the performance of current systems as measured in the DARPA EARS Rich Transcription Fall 2004 (RT-04F) speaker diarization evaluation. Lastly, we look at future challenges and directions for diarization research.
  • Keywords
    audio signal processing; signal classification; speech processing; audio archive indexing; audio archive searching; audio channel annotation; audio diarization; audio source categorization; automatic transcripts; background noise sources; meta-data; music; signal energy temporal region source determination; speaker speaker; speech detection; Acoustic noise; Audio recording; Bandwidth; Broadcasting; Data mining; Indexing; NIST; Speech analysis; Speech enhancement; Speech recognition
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing, 2005. Proceedings. (ICASSP '05). IEEE International Conference on
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-8874-7
  • Type
    conf
  • DOI
    10.1109/ICASSP.2005.1416463
  • Filename
    1416463