Title :
Online meeting recognizer with multichannel speaker diarization
Author :
Araki, Shoko ; Hori, Takaaki ; Fujimoto, Masakiyo ; Watanabe, Shinji ; Yoshioka, Takuya ; Nakatani, Tomohiro ; Nakamura, Atsushi
Author_Institution :
NTT Commun. Sci. Labs., NTT Corp., Kyoto, Japan
Abstract :
We present our newly developed real-time conversation analyzer for group meetings. The goal of the system is to automatically estimate “who speaks when and what” in an online manner. In our system, “who speaks when” information is first obtained by estimating the directions of arrival (DOAs) of the signals. Then, “who speaks what” is estimated with our automatic speech recognition (ASR) system after suppressing reverberation, background noise, and interfering speakers' voices. In this paper, we focus on the speaker diarization (“who speaks when” estimation) method, and we show that the speaker diarization information helps the ASR system reduce insertion errors.
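The abstract does not spell out how DOA estimates are turned into “who speaks when” labels. As a rough illustration only, and not the authors' method, the sketch below assumes that one DOA estimate per frame is already available, that speaker directions are known in advance, and that at most one speaker is active per frame. The function name `diarize_from_doas` and the parameters `frame_shift` and `tolerance` are hypothetical.

```python
from typing import List, Optional, Tuple


def diarize_from_doas(
    frame_doas: List[Optional[float]],
    speaker_doas: List[float],
    frame_shift: float = 0.1,
    tolerance: float = 20.0,
) -> List[Tuple[int, float, float]]:
    """Map per-frame DOA estimates (degrees) to (speaker, start_s, end_s) segments.

    Illustrative assumptions: None marks a frame without detected speech,
    speaker directions are known, and only one speaker is active per frame.
    """

    def angular_distance(a: float, b: float) -> float:
        # Smallest absolute difference between two angles on a circle.
        d = abs(a - b) % 360.0
        return min(d, 360.0 - d)

    # Step 1: label each frame with the nearest speaker, or None if no
    # speaker direction lies within the tolerance.
    labels: List[Optional[int]] = []
    for doa in frame_doas:
        label = None
        if doa is not None:
            best = min(
                range(len(speaker_doas)),
                key=lambda k: angular_distance(doa, speaker_doas[k]),
            )
            if angular_distance(doa, speaker_doas[best]) <= tolerance:
                label = best
        labels.append(label)

    # Step 2: merge runs of identical labels into time segments.
    segments: List[Tuple[int, float, float]] = []
    start = 0
    for i in range(1, len(labels) + 1):
        if i == len(labels) or labels[i] != labels[start]:
            if labels[start] is not None:
                segments.append((labels[start], start * frame_shift, i * frame_shift))
            start = i
    return segments


if __name__ == "__main__":
    doas = [30.0, 32.0, None, 150.0, 148.0, 90.0]
    print(diarize_from_doas(doas, speaker_doas=[30.0, 150.0], frame_shift=0.5))
```

Segments of this kind could then be used to restrict recognition to regions where a given speaker is active, which is one plausible way diarization information might reduce insertion errors. A real system would also need to handle overlapping speech and unknown speaker positions, which this sketch ignores.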
Keywords :
direction-of-arrival estimation; signal denoising; speaker recognition; ASR system; DOA estimation; automatic speech recognition system; background noise; directions of arrival estimation; insertion error reduction; interference speaker voice suppression; multichannel speaker diarization; online meeting recognizer; real-time conversation analyzer; reverberation suppression; Adaptation model; Microphones; Noise; Speech; Speech enhancement; Speech recognition
Conference_Title :
Signals, Systems and Computers (ASILOMAR), 2010 Conference Record of the Forty Fourth Asilomar Conference on
Conference_Location :
Pacific Grove, CA
Print_ISBN :
978-1-4244-9722-5
DOI :
10.1109/ACSSC.2010.5757829