  • DocumentCode
    3410767
  • Title
    Discriminative incorporation of explicitly trained tone models into lattice based rescoring for Mandarin speech recognition
  • Author
    Huang, Hao ; Zhu, Jie
  • Author_Institution
    Dept. of Electron. Eng., Shanghai Jiao Tong Univ., Shanghai
  • fYear
    2008
  • fDate
    March 31, 2008 - April 4, 2008
  • Firstpage
    1541
  • Lastpage
    1544
  • Abstract
    Explicit tone modeling has been widely discussed in recent Mandarin speech recognition research. This paper proposes a discriminative method of incorporating explicitly trained tone models into lattice-based rescoring. The method uses discriminatively trained model weights to scale the acoustic model and tone model distributions. The weights are trained under the minimum phone error (MPE) criterion using the extended Baum-Welch algorithm. To take different phonetic contexts into account, various model weighting schemes are evaluated. A smoothing technique is introduced to make model weight training more robust to overfitting. The proposed method is evaluated on tonal syllable output speech recognition tasks on a Mandarin LVCSR database. Results show that the proposed method achieves a significant error reduction over the traditional global weight approach. A comparison with traditional embedded tone modeling is also made, which shows the importance of the proposed method when an explicit tone modeling approach is applied.
  • Keywords
    natural language processing; speech processing; speech recognition; Mandarin speech recognition; explicitly trained tone models; lattice based rescoring; phonetic contexts; tonal syllable output speech recognition tasks; Cepstral analysis; Context modeling; Databases; Lattices; Mel frequency cepstral coefficient; Predictive models; Robustness; Smoothing methods; Vocabulary; discriminative training; explicit tone model incorporation; minimum phone error
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing, 2008. ICASSP 2008. IEEE International Conference on
  • Conference_Location
    Las Vegas, NV
  • ISSN
    1520-6149
  • Print_ISBN
    978-1-4244-1483-3
  • Electronic_ISBN
    1520-6149
  • Type
    conf
  • DOI
    10.1109/ICASSP.2008.4517916
  • Filename
    4517916