• DocumentCode
    1783881
  • Title
    Boosted Hybrid DNN/HMM System Based on Correlation-Generated Targets
  • Author
    Mengzhe Chen ; Qingqing Zhang ; Jielin Pan ; Yonghong Yan
  • Author_Institution
    Key Lab. of Speech Acoust. & Content Understanding, Inst. of Acoust., Beijing, China
  • fYear
    2014
  • fDate
    27-29 Aug. 2014
  • Firstpage
    590
  • Lastpage
    593
  • Abstract
    In current hybrid DNN/HMM systems, the DNN models are trained on 1-of-V targets obtained by Viterbi-based forced alignment, so the states are treated as unrelated and isolated. In fact, some phonemes are acoustically similar. This is especially true for Chinese: because it is a tonal language, the number of similar pairs is quadrupled. To bring the similarity information between states into model training, correlation-generated targets are investigated for DNN modeling. For each frame, in addition to the target state from the forced alignment, other states that are similar to this state are assigned nonzero values. The degree of similarity between states is measured by calculating their correlation. The paper investigates different methods of generating the correlation matrix and describes the details of implementing training with it. On a Mandarin conversational speech recognition task in the customer-service domain, experiments showed that the hybrid DNN/HMM system based on correlation-generated targets achieved consistent improvements across different amounts of training data.
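The idea in the abstract, replacing a 1-of-V target with one that also gives weight to similar states, can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration and is not taken from the paper: the use of Pearson correlation over hypothetical per-state vectors, the `threshold` parameter, and the normalization to a probability distribution. The paper itself investigates several ways of generating the correlation matrix, which this record does not detail.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length vectors (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0


def correlation_matrix(state_vectors):
    """V x V matrix of pairwise correlations between per-state vectors.

    `state_vectors` is a hypothetical input: one representative vector per
    HMM state (for example, a mean feature vector).
    """
    return [[pearson(u, v) for v in state_vectors] for u in state_vectors]


def soft_target(corr, state, threshold=0.5):
    """Build a soft target for a frame whose forced-alignment label is `state`.

    Assumed scheme: the aligned state keeps weight 1.0, states whose
    correlation with it reaches `threshold` keep their correlation value,
    all others get 0.0, and the result is normalized to sum to 1.
    """
    weights = []
    for i, c in enumerate(corr[state]):
        if i == state:
            weights.append(1.0)
        elif c >= threshold:
            weights.append(max(c, 0.0))  # never keep a negative weight
        else:
            weights.append(0.0)
    total = sum(weights)  # at least 1.0, so the division is safe
    return [w / total for w in weights]


# Three toy states: 0 and 1 are similar, 2 is anti-correlated with 0.
vectors = [[1.0, 2.0, 3.0, 4.0], [1.1, 2.1, 2.9, 4.2], [4.0, 3.0, 2.0, 1.0]]
corr = correlation_matrix(vectors)
print(soft_target(corr, 0))
```

With a threshold above 1.0, no other state qualifies and the function falls back to the ordinary 1-of-V target, which makes the sketch easy to compare against the baseline described in the abstract.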
  • Keywords
    hidden Markov models; speech recognition; Chinese; DNN models; Mandarin conversational speech recognition; Viterbi-based forced-alignment; boosted hybrid DNN-HMM system; correlation-generated targets; customer-service domain; phonemes; tonal language; Acoustics; Correlation; Speech; Training; Training data; Mandarin speech recognition; hybrid DNN/HMM system
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Intelligent Information Hiding and Multimedia Signal Processing (IIH-MSP), 2014 Tenth International Conference on
  • Conference_Location
    Kitakyushu
  • Print_ISBN
    978-1-4799-5389-9
  • Type
    conf
  • DOI
    10.1109/IIH-MSP.2014.153
  • Filename
    6998398