Title :
Robust Large Vocabulary Continuous Speech Recognition using Polynomial Segment Model with Unsupervised Adaptation
Author :
Siu, Man-Hung ; Yeung, Siu-Kei Au
Author_Institution :
Dept. of Electr. & Electron. Eng., Hong Kong Univ. of Sci. & Technol., Kowloon
Abstract :
Robustness has been an important issue in applying speech technologies to real applications. While polynomial segment models (PSMs) have been shown to outperform HMMs in clean environments, the segmental likelihood evaluation may make the PSM distributions sharper and may adversely affect their performance in mismatched conditions. In this paper, we explore the robustness properties of the PSM under noisy and channel mismatch conditions. In addition, unsupervised adaptation techniques have been shown to work well for environmental adaptation even with a small amount of adaptation data. Thus, it is interesting to compare the performance of the PSMs and the HMMs after applying two types of unsupervised adaptation: maximum likelihood linear regression (MLLR) and reference speaker weighting (RSW). Experiments were performed on the Aurora 4 corpus under both clean and multi-conditional training. Our results show that even under noisy and mismatch conditions, the PSMs performed well compared to the HMMs both before and after environmental adaptation. Using the best lattice, the RSW-adapted PSM gave word error rates of 26.5% and 21.3% for clean and multi-conditional training, respectively, which were approximately 24% better than those of the unadapted HMM.
Keywords :
error statistics; hidden Markov models; maximum likelihood estimation; regression analysis; speech recognition; Aurora 4 corpus; HMM; channel mismatch condition; maximum likelihood linear regression; multi-conditional training; noisy mismatch condition; polynomial segment model; reference speaker weighting; robust large vocabulary continuous speech recognition; robustness properties; segmental likelihood evaluation; speech technologies; unsupervised adaptation techniques; word error rates; Adaptation model; Error analysis; Hidden Markov models; Lattices; Maximum likelihood linear regression; Polynomials; Robustness; Speech recognition; Vocabulary; Working environment noise
Conference_Title :
Acoustics, Speech and Signal Processing, 2006. ICASSP 2006 Proceedings. 2006 IEEE International Conference on
Conference_Location :
Toulouse
Print_ISBN :
1-4244-0469-X
DOI :
10.1109/ICASSP.2006.1660054