  • DocumentCode
    417281
  • Title
    Decision tree based tone modeling for Chinese speech recognition
  • Author
    Wong, Pui-Fung; Siu, Man-Hung
  • Author_Institution
    Dept. of Electr. & Electron. Eng., Hong Kong Univ. of Sci. & Technol., China
  • Volume
    1
  • fYear
    2004
  • fDate
    17-21 May 2004
  • Abstract
    Because of the tonal nature of Chinese languages, correct recognition of lexical tones is necessary for Chinese speech recognition. To incorporate tone information into Chinese speech recognition, three issues need to be addressed: (i) the representation of the syllable pitch contour as well as the tone contour; (ii) the estimation of lexical tone probabilities; and (iii) the integration of tone probabilities into the Viterbi recognition process. In this paper we propose a robust polynomial segmental representation of the pitch contour coupled with a decision-tree-based tone classifier. We also propose a novel approach to integrating the decision tree tone classifier directly into a single-pass recognition process. The proposed approaches were evaluated on tone classification and tonal-syllable recognition tasks. For tone classification, the robust decision tree achieved an accuracy of 89% on isolated syllables and 71.2% on continuous speech. Moreover, incorporating the decision tree tone classifier into the Viterbi search reduced the tonal-syllable recognition error rate in continuous speech by 13.54%.
  • Keywords
    decision trees; maximum likelihood estimation; pattern classification; probability; signal representation; speech processing; speech recognition; tree searching; Chinese speech recognition; Viterbi search; continuous speech; decision tree; lexical tone probability estimation; lexical tone recognition; robust polynomial segmental representation; single pass recognition process; syllable pitch contour representation; tonal languages; tonal-syllable recognition; tone classifier; tone contour; tone modeling; Classification tree analysis; Decision trees; Estimation error; Hidden Markov models; Natural languages; Polynomials; Robustness; Shape; Speech recognition; Viterbi algorithm;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Acoustics, Speech, and Signal Processing, 2004. Proceedings. (ICASSP '04). IEEE International Conference on
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-8484-9
  • Type
    conf
  • DOI
    10.1109/ICASSP.2004.1326133
  • Filename
    1326133