• DocumentCode
    1720928
  • Title

    A DOA Based Speaker Diarization System for Real Meetings

  • Author

    Araki, Shoko ; Fujimoto, Masakiyo ; Ishizuka, Kentaro ; Sawada, Hiroshi ; Makino, Shoji

  • Author_Institution
    NTT Commun. Sci. Labs., NTT Corp., Kyoto
  • fYear
    2008
  • Firstpage
    29
  • Lastpage
    32
  • Abstract
    This paper presents a speaker diarization system that estimates who spoke when in a meeting. The proposed system combines a noise-robust voice activity detector (VAD), a direction of arrival (DOA) estimator, and a DOA classifier. Our previous system used the generalized cross correlation method with the phase transform (GCC-PHAT) for DOA estimation. Because GCC-PHAT can estimate only one DOA per frame, it was difficult to handle speaker overlaps. This paper addresses this issue by employing a DOA at each time-frequency slot (TFDOA), and reports how it improves diarization performance for real meetings and conversations recorded in a room with a reverberation time of 350 ms.
  • Keywords
    direction-of-arrival estimation; speaker recognition; time-frequency analysis; direction of arrival estimation; generalized cross correlation method; phase transform; real meeting; speaker diarization system; time 350 ms; time-frequency slot DOA estimation; voice activity detector; Correlation; Detectors; Direction of arrival estimation; Laboratories; Microphones; Noise robustness; Phase estimation; Reverberation; Speech; Time frequency analysis; diarization; direction of arrival; voice activity detector;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Hands-Free Speech Communication and Microphone Arrays, 2008. HSCMA 2008
  • Conference_Location
    Trento
  • Print_ISBN
    978-1-4244-2337-8
  • Electronic_ISBN
    978-1-4244-2338-5
  • Type

    conf

  • DOI
    10.1109/HSCMA.2008.4538680
  • Filename
    4538680
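
The abstract notes that GCC-PHAT yields just one DOA per frame. The following is a minimal sketch of why, not the authors' implementation: for a single microphone pair, the phase-transform-weighted cross-correlation is searched for one peak, and that single delay is mapped to one angle. The function names `gcc_phat_delay` and `delay_to_doa`, the far-field two-microphone geometry, the microphone spacing, and the speed of sound of 343 m/s are all assumptions introduced here for illustration; none of them come from the record.

```python
import numpy as np


def gcc_phat_delay(x1, x2, fs, max_tau=None):
    """Estimate the delay of x2 relative to x1, in seconds, using GCC-PHAT.

    Hypothetical helper for illustration; a positive result means x2 lags x1.
    """
    n = len(x1) + len(x2)                 # zero-pad to avoid circular wrap-around
    X1 = np.fft.rfft(x1, n=n)
    X2 = np.fft.rfft(x2, n=n)
    cross = X2 * np.conj(X1)              # cross-power spectrum
    cross /= np.abs(cross) + 1e-12        # phase transform: keep phase only
    cc = np.fft.irfft(cross, n=n)         # generalized cross-correlation

    max_shift = n // 2
    if max_tau is not None:
        # Restrict the search to physically possible delays.
        max_shift = max(1, min(int(fs * max_tau), max_shift))

    # Reorder so that index 0 corresponds to lag -max_shift.
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    shift = int(np.argmax(cc)) - max_shift  # one peak, hence one delay per frame
    return shift / fs


def delay_to_doa(tau, mic_distance, c=343.0):
    """Map a delay to a far-field angle in degrees (assumed two-mic geometry)."""
    s = np.clip(c * tau / mic_distance, -1.0, 1.0)
    return float(np.degrees(np.arcsin(s)))


if __name__ == "__main__":
    fs = 16000                             # assumed sampling rate
    mic_distance = 0.1                     # assumed spacing in metres
    rng = np.random.default_rng(0)
    frame = rng.standard_normal(1024)
    delayed = np.concatenate((np.zeros(3), frame[:-3]))  # 3-sample lag
    tau = gcc_phat_delay(frame, delayed, fs, max_tau=mic_distance / 343.0)
    print("delay (samples):", tau * fs)
    print("DOA (degrees):", delay_to_doa(tau, mic_distance))
```

Because the peak search returns a single lag, two simultaneously active speakers compete for the same frame-level estimate. This is the limitation the abstract says the time-frequency-slot approach (TFDOA) is meant to address.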