DocumentCode
465980
Title
A Maximum Entropy Approach to Chinese Pin Yin-To-Character Conversion
Author
Wang, Xuan ; Li, Lu ; Yao, Lin ; Anwar, Waqas
Author_Institution
Harbin Inst. of Technol., Shenzhen
Volume
4
fYear
2006
fDate
8-11 Oct. 2006
Firstpage
2956
Lastpage
2959
Abstract
This paper introduces a new approach based on the maximum entropy (ME) framework to solve the Pinyin-to-character (PTC) conversion problem. In most cases, more than one Chinese character shares the same Pinyin. The task of a PTC algorithm is to resolve this ambiguity. PTC can be regarded as classifying a Pinyin into a specific character according to its context, which is represented as features in ME. By taking advantage of ME, both local and non-local information are included, so the conversion performance is improved. Experiments show that an 87% hit rate (without tone) is achieved.
Keywords
maximum entropy methods; speech processing; Chinese Pinyin-to-character conversion; maximum entropy approach; Computational linguistics; Cybernetics; Entropy; Machine learning; Natural language processing; Natural languages; Speech recognition; Statistics; Stochastic processes; Training data;
fLanguage
English
Publisher
IEEE
Conference_Titel
Systems, Man and Cybernetics, 2006. SMC '06. IEEE International Conference on
Conference_Location
Taipei
Print_ISBN
1-4244-0099-6
Electronic_ISBN
1-4244-0100-3
Type
conf
DOI
10.1109/ICSMC.2006.384567
Filename
4274331