Title :
Making Machines Understand Us in Reverberant Rooms: Robustness Against Reverberation for Automatic Speech Recognition
Author :
Yoshioka, Takuya ; Sehr, Armin ; Delcroix, Marc ; Kinoshita, Keisuke ; Maas, Roland ; Nakatani, Tomohiro ; Kellermann, Walter
Author_Institution :
NTT Commun. Sci. Labs., Kyoto, Japan
Abstract :
Speech recognition technology has left the research laboratory and is increasingly coming into practical use, enabling a wide spectrum of innovative and exciting voice-driven applications that are radically changing our way of accessing digital services and information. Most of today's applications still require a microphone located near the talker. However, almost all of these applications would benefit from distant-talking speech capturing, where talkers are able to speak at some distance from the microphones without the encumbrance of handheld or body-worn equipment [1]. For example, applications such as meeting speech recognition, automatic annotation of consumer-generated videos, speech-to-speech translation in teleconferencing, and hands-free interfaces for controlling consumer products, such as interactive TV, will greatly benefit from distant-talking operation. Furthermore, for a number of unexplored but important applications, distant microphones are a prerequisite. This means that distant-talking speech recognition technology is essential for extending the availability of speech recognizers as well as enhancing the convenience of existing speech recognition applications.
Keywords :
microphones; reverberation; speech recognition; teleconferencing; television; automatic annotation; automatic speech recognition; body-worn equipment; consumer-generated videos; consumer-products; distant microphones; distant-talking operation; distant-talking speech capturing; handheld equipment; hands-free interfaces; interactive TV; meeting speech recognition; reverberant rooms; speech recognition technology; speech recognizers; speech-to-speech translation; Hidden Markov models; Robustness
Journal_Title :
Signal Processing Magazine, IEEE
DOI :
10.1109/MSP.2012.2205029