Title :
Noise Condition-Dependent Training Based on Noise Classification and SNR Estimation
Author :
Xu, Haitian ; Dalsgaard, Paul ; Tan, Zheng-Hua ; Lindberg, Børge
Author_Institution :
Toshiba Research Eur., Ltd., Cambridge
Abstract :
A condition-dependent training strategy divides a training database into a number of clusters, each corresponding to a noise condition, and subsequently trains a hidden Markov model (HMM) set for each cluster. This paper investigates and compares a number of condition-dependent training strategies in order to achieve a better understanding of the effects on automatic speech recognition (ASR) performance caused by splitting the training databases. Also, the relationship between mismatches in signal-to-noise ratio (SNR) is analyzed. The results show that splitting the training material in terms of both noise type and SNR value is advantageous compared to previously used methods, and that training only a limited number of HMM sets for each noise type is sufficient to handle SNR mismatches robustly. This leads to the introduction of an SNR and noise classification-based training strategy (SNT-SNC). Better ASR performance is obtained on test material containing data from known noise types than with either multicondition training or noise-type dependent training strategies. The computational complexity of the SNT-SNC framework is kept low by choosing only one HMM set for recognition. The HMM set is chosen on the basis of the results of noise classification and SNR value estimation. However, compared to other strategies, the SNT-SNC framework shows lower performance for unknown noise types. This problem is partly overcome by introducing a number of model and feature domain techniques. Experiments using both artificially corrupted and real-world noisy speech databases are conducted and demonstrate the effectiveness of these methods.
Keywords :
computational complexity; hidden Markov models; signal classification; speech recognition; automatic speech recognition; computational complexity; feature domain technique; hidden Markov model; noise classification; noise condition-dependent training; Automatic speech recognition; Databases; Hidden Markov models; Noise robustness; Signal analysis; Signal to noise ratio; Speech enhancement; Speech processing; Speech recognition; Working environment noise; Condition-dependent training; noise classification; robust speech recognition; robustness to unknown noise; signal-to-noise ratio (SNR) estimation;
Journal_Title :
Audio, Speech, and Language Processing, IEEE Transactions on
DOI :
10.1109/TASL.2007.906188