Title :
Histogram-based subband power warping and spectral averaging for robust speech recognition under matched and multistyle training
Author :
Harvilla, Mark J. ; Stern, Richard M.
Author_Institution :
Dept. of Electr. & Comput. Eng., Carnegie Mellon Univ., Pittsburgh, PA, USA
Abstract :
This paper describes a new algorithm that increases the robustness of speech recognition systems by matching the power histograms of the input in each frequency band to those obtained over clean training data, and then mixing together the processed and unprocessed spectra. Before the prototype histograms are calculated over the training data, the power signal in each channel is normalized by the local maximum and minimum of that channel. In contrast, histograms calculated over the testing data are normalized by the global maximum and minimum of the power spectrum. This mode of normalization leads to a significant reduction in noise. Following the histogram-based processing, it is shown that taking a weighted average of the processed and unprocessed power spectra contributes to further gains in recognition accuracy. Results are obtained for multiple speech recognition systems, noise types, and training conditions, illustrating the broad utility of this approach.
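The processing chain outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the histogram resolution, the domain in which power is measured, or the averaging weight, so the sketch assumes quantile-based matching as a stand-in for histogram matching, and the function names and the `weight` parameter are hypothetical.

```python
import numpy as np

def minmax_normalize(power, axis=None):
    """Scale power values to [0, 1] using the min and max taken along `axis`."""
    lo = power.min(axis=axis, keepdims=True)
    hi = power.max(axis=axis, keepdims=True)
    return (power - lo) / np.maximum(hi - lo, 1e-12)

def match_channel(test_chan, train_chan):
    """Map each test value to the training value at the same empirical CDF rank."""
    ranks = np.argsort(np.argsort(test_chan))
    quantiles = (ranks + 0.5) / test_chan.size
    return np.quantile(train_chan, quantiles)

def warp_and_average(test_power, train_power, weight=0.5):
    """Both inputs are arrays of shape (channels, frames).

    `weight` is an assumed mixing parameter; the abstract does not state a value.
    """
    # Training prototypes: each channel normalized by its own (local) extrema.
    train_norm = minmax_normalize(train_power, axis=1)
    # Test input: normalized by the (global) extrema of the whole power spectrum.
    test_norm = minmax_normalize(test_power, axis=None)
    # Warp each test channel toward the distribution of the matching training channel.
    warped = np.stack(
        [match_channel(t, r) for t, r in zip(test_norm, train_norm)]
    )
    # Weighted average of the processed and unprocessed (normalized) spectra.
    return weight * warped + (1.0 - weight) * test_norm
```

Setting `weight=1.0` returns only the warped spectra and `weight=0.0` returns only the normalized input, so intermediate values trade off between the two.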
Keywords :
signal denoising; spectral analysis; speech recognition; global maximum; global minimum; histogram-based processing; histogram-based subband power warping; local maximum; local minimum; matched training; multiple speech recognition system; multistyle training; noise reduction; normalization mode; power histogram matching; power signals; processed spectra; recognition accuracy; robust speech recognition; spectral averaging; training conditions; unprocessed spectra; Histograms; Robustness; Signal to noise ratio; Speech; Speech recognition; Training; Robust speech recognition; histogram matching; matched training; multistyle training; spectral averaging;
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference on
Conference_Location :
Kyoto
Print_ISBN :
978-1-4673-0045-2
Electronic_ISBN :
1520-6149
DOI :
10.1109/ICASSP.2012.6288967