DocumentCode
2017674
Title
An environment structuring framework to facilitating suitable prior density estimation for MAPLR on robust speech recognition
Author
Tsao, Yu ; Isotani, Ryosuke ; Kawai, Hisashi ; Nakamura, Satoshi
Author_Institution
Spoken Language Commun. Group, Nat. Inst. of Inf. & Commun. Technol., Kyoto, Japan
fYear
2010
fDate
Nov. 29 2010-Dec. 3 2010
Firstpage
29
Lastpage
32
Abstract
In this paper, we propose an environment structuring framework that facilitates suitable prior density estimation for maximum a posteriori linear regression (MAPLR) under adverse testing conditions. The framework is a two-stage hierarchical tree built by applying two algorithms, environment clustering and environment partitioning. It can characterize detailed regional information about various speaker and speaking environments. We incorporate this information into the prior density calculation for MAPLR and design three types of prior density: clustered, hierarchical, and integrated. We conduct experiments on the Aurora-2 task. The results first show that, with any one of the three prior densities, MAPLR improves over both the baseline and maximum likelihood linear regression (MLLR). Moreover, MAPLR performs best with the integrated prior density, which combines the advantages of the other two. With the best integrated prior density, MAPLR achieves a 10.72% word error rate reduction over the baseline result.
Keywords
maximum likelihood estimation; pattern clustering; regression analysis; speaker recognition; Aurora-2 task; MAPLR; clustering algorithm; environment partitioning; environment structuring framework; maximum a posteriori linear regression; prior density estimation; robust speech recognition; speaker information; two stage hierarchical tree structure; Estimation; Hidden Markov models; IP networks; Speech; Speech recognition; Testing; Training; ASR; MAPLR; SMAPLR; environment clustering; environment partitioning; robust automatic speech recognition;
fLanguage
English
Publisher
IEEE
Conference_Titel
Chinese Spoken Language Processing (ISCSLP), 2010 7th International Symposium on
Conference_Location
Tainan
Print_ISBN
978-1-4244-6244-5
Type
conf
DOI
10.1109/ISCSLP.2010.5684880
Filename
5684880