• DocumentCode
    465980
  • Title

    A Maximum Entropy Approach to Chinese Pin Yin-To-Character Conversion

  • Author

    Wang, Xuan ; Li, Lu ; Yao, Lin ; Anwar, Waqas

  • Author_Institution
    Harbin Inst. of Technol., Shenzhen
  • Volume
    4
  • fYear
    2006
  • fDate
    8-11 Oct. 2006
  • Firstpage
    2956
  • Lastpage
    2959
  • Abstract
    This paper introduces a new approach, based on the maximum entropy (ME) framework, to the Pinyin-to-character (PTC) conversion problem. In most cases, more than one Chinese character shares the same Pinyin, and the task of a PTC algorithm is to resolve this ambiguity. PTC can be regarded as classifying a Pinyin into a specific character according to its context, which is represented as features in the ME model. By taking advantage of ME, both local and non-local information is included, so conversion performance is improved. Experiments show that an 87% hit rate (without tone) is achieved.
  • Keywords
    maximum entropy methods; speech processing; Chinese Pinyin-to-character conversion; maximum entropy approach; Computational linguistics; Cybernetics; Entropy; Machine learning; Natural language processing; Natural languages; Speech recognition; Statistics; Stochastic processes; Training data
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Systems, Man and Cybernetics, 2006. SMC '06. IEEE International Conference on
  • Conference_Location
    Taipei
  • Print_ISBN
    1-4244-0099-6
  • Electronic_ISBN
    1-4244-0100-3
  • Type

    conf

  • DOI
    10.1109/ICSMC.2006.384567
  • Filename
    4274331