DocumentCode
50805
Title
Automatic Complexity Control of Generalized Variable Parameter HMMs for Noise Robust Speech Recognition
Author
Rongfeng Su ; Xunying Liu ; Lan Wang
Author_Institution
Shenzhen Inst. of Adv. Technol., Chinese Univ. of Hong Kong, Shenzhen, China
Volume
23
Issue
1
fYear
2015
fDate
Jan. 2015
Firstpage
102
Lastpage
114
Abstract
An important part of the acoustic modelling problem for automatic speech recognition (ASR) systems is to handle the mismatch against a target environment created by time-varying external factors such as ambient noise. One possible solution to this problem is to introduce controllability to the underlying acoustic model to allow an instantaneous adaptation to the underlying noise condition. Along this line, the continuous trajectory of optimal, well-matched model parameters against the varying noise can be explicitly modelled using, for example, generalized variable parameter HMMs (GVP-HMMs). In order to improve the generalization and computational efficiency of conventional GVP-HMMs, this paper investigates a novel model complexity control method for GVP-HMMs. The optimal polynomial degrees of Gaussian mean, variance and model space linear transform trajectories are automatically determined at a local level. Significant relative error rate reductions of 20% and 28% were obtained over the multi-style training baseline systems on Aurora 2 and a medium vocabulary Mandarin Chinese speech recognition task, respectively. Consistent performance improvements and a relative model size compression of 60% were also obtained over the baseline GVP-HMM systems using a uniformly assigned polynomial degree.
Keywords
Gaussian processes; acoustic noise; acoustic signal processing; computational complexity; controllability; error statistics; hidden Markov models; polynomials; speech recognition; ASR; Aurora 2; GVP-HMM system; Gaussian mean; Mandarin Chinese speech recognition; acoustic modelling problem; automatic model complexity control; automatic speech recognition; continuous trajectory; controllability; error rate reduction; generalized variable parameter; matched model parameters; model space linear transform trajectory; multistyle training baseline system; noise robust speech recognition; optimal polynomial degrees; time-varying external factors; variance; Complexity theory; Hidden Markov models; Mathematical model; Noise; Polynomials; Trajectory; Complexity control; generalized variable parameter HMMs; robust speech recognition; variable noise;
fLanguage
English
Journal_Title
Audio, Speech, and Language Processing, IEEE/ACM Transactions on
Publisher
IEEE
ISSN
2329-9290
Type
jour
DOI
10.1109/TASLP.2014.2372901
Filename
6963456