• DocumentCode
    2017674
  • Title
    An environment structuring framework to facilitating suitable prior density estimation for MAPLR on robust speech recognition
  • Author
    Tsao, Yu ; Isotani, Ryosuke ; Kawai, Hisashi ; Nakamura, Satoshi
  • Author_Institution
    Spoken Language Commun. Group, Nat. Inst. of Inf. & Commun. Technol., Kyoto, Japan
  • fYear
    2010
  • fDate
    Nov. 29 2010-Dec. 3 2010
  • Firstpage
    29
  • Lastpage
    32
  • Abstract
    In this paper, we propose an environment structuring framework that facilitates suitable prior density estimation for maximum a posteriori linear regression (MAPLR) under adverse testing conditions. The framework is constructed as a two-stage hierarchical tree structure by applying two algorithms, environment clustering and environment partitioning. The constructed framework can characterize detailed regional information of various speaker and speaking environments. We incorporate this information into the prior density calculation for MAPLR and design three types of prior density: clustered prior, hierarchical prior, and integrated prior densities. We conduct experiments on the Aurora-2 task. The test results first show that MAPLR with any one of the three prior densities improves over both the baseline and maximum likelihood linear regression (MLLR). Moreover, the integrated prior density, which combines the advantages of the other two, gives MAPLR its best performance. With the best integrated prior density, MAPLR achieves a 10.72% word error rate reduction over the baseline result.
  • Keywords
    maximum likelihood estimation; pattern clustering; regression analysis; speaker recognition; Aurora-2 task; MAPLR; clustering algorithm; environment partitioning; environment structuring framework; maximum a posteriori linear regression; prior density estimation; robust speech recognition; speaker information; two stage hierarchical tree structure; Estimation; Hidden Markov models; IP networks; Speech; Speech recognition; Testing; Training; ASR; MAPLR; SMAPLR; environment clustering; environment partitioning; robust automatic speech recognition;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Chinese Spoken Language Processing (ISCSLP), 2010 7th International Symposium on
  • Conference_Location
    Tainan
  • Print_ISBN
    978-1-4244-6244-5
  • Type
    conf
  • DOI
    10.1109/ISCSLP.2010.5684880
  • Filename
    5684880