Title :
Enhanced voice activity detection using acoustic event detection and classification
Author :
Cho, Namgook ; Kim, Eun-Kyoung
Author_Institution :
Digital Media & Commun. R&D Center, Samsung Electron., Suwon, South Korea
fDate :
2/1/2011 12:00:00 AM
Abstract :
We examine a user-friendly voice interface that requires hands-free speech acquisition in a continuously listening environment. Traditional voice activity detection (VAD) algorithms cannot reliably distinguish acoustic event sounds from speech, which causes the speech recognition system to be activated frequently or incorrectly. In this paper, we propose a novel voice activity detection technique that consists of two major modules: 1) a classification module and 2) a detection module. In the classification module, we label successive audio segments based on trained models. Then, in the detection module, we remove the acoustic event sounds and determine the explicit utterance boundaries in the input audio stream. As a result, the proposed technique enables efficient operation of speech recognition in a continuously listening environment without any touch or key input. Experiments in a real-world environment and a performance comparison with state-of-the-art techniques demonstrate the effectiveness of the proposed technique.
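The two-module flow described in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the label names ("speech", "event", "silence"), the fixed segment length, and the minimum-run rule are all assumptions introduced here, and the classification module is replaced by a hand-written list of labels rather than trained models.

```python
from typing import List, Tuple


def utterance_boundaries(
    labels: List[str],
    segment_sec: float = 0.5,   # assumed segment length, not from the paper
    min_segments: int = 2,      # assumed minimum run length, not from the paper
) -> List[Tuple[float, float]]:
    """Merge runs of consecutive 'speech' labels into (start, end) times in seconds.

    Segments labelled anything other than 'speech' (for example acoustic
    events or silence) are discarded, and speech runs shorter than
    `min_segments` are dropped as likely false activations.
    """
    boundaries = []
    run_start = None
    # A sentinel label closes a speech run that reaches the end of the stream.
    for i, label in enumerate(labels + ["<end>"]):
        if label == "speech":
            if run_start is None:
                run_start = i
        elif run_start is not None:
            if i - run_start >= min_segments:
                boundaries.append((run_start * segment_sec, i * segment_sec))
            run_start = None
    return boundaries


# Hypothetical output of a classification module, one label per segment.
labels = [
    "silence", "speech", "speech", "speech", "event",
    "event", "speech", "silence", "speech", "speech",
]
print(utterance_boundaries(labels))  # [(0.5, 2.0), (4.0, 5.0)]
```

In this toy input, the isolated single-segment "speech" label between the event and silence segments is dropped, so only two utterances would be passed on to a recognizer.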
Keywords :
speech recognition; acoustic event detection; audio segmentation; audio stream; enhanced voice activity detection; hand free speech acquisition; user friendly voice interface; Acoustics; Classification algorithms; Indexes; Speech; Speech recognition; Support vector machines; Training; Acoustic event detection and classification; continuously listening environment; voice activity detection; voice interface
Journal_Title :
Consumer Electronics, IEEE Transactions on
DOI :
10.1109/TCE.2011.5735502