• DocumentCode
    387538
  • Title
    An approach to machine learning of Chinese Pinyin-to-character conversion for small-memory application
  • Author
    Liu, Bing-quan ; Wang, Xiao-long
  • Author_Institution
    Sch. of Comput. Sci. & Technol., Harbin Inst. of Technol., China
  • Volume
    3
  • fYear
    2002
  • fDate
    2002
  • Firstpage
    1287
  • Abstract
    Chinese Pinyin-to-character conversion is used in keyboard-based Chinese character input and in Chinese speech recognition. The key to such systems is machine learning that adapts the system to a specific user. This paper presents an effective machine learning approach to Chinese Pinyin-to-character conversion for small-memory applications. The approach is based on iterative new-word identification and word-frequency incrementing, which gradually yield more accurate segmentation of Chinese characters and eventually meet the user's needs. Applying the proposed learning method to a keyboard-based Chinese character input system improves the accuracy of Pinyin-to-character conversion from 90% to 98%. The system can run in very little memory (limited to 120 K) and thus meets the needs of small-memory platforms. With the rapid development of digital appliances such as PDAs, mobile telephones, and intelligent refrigerators, and of embedded operating systems, the Pinyin-to-character conversion presented in this paper has found new applications.
  • Keywords
    character recognition; learning (artificial intelligence); speech recognition; Chinese character; Chinese character input through keyboard; Chinese speech recognition; iterative new word identification; machine learning; pinyin-to-character conversion; small-memory application; Frequency; Home appliances; Iterative methods; Keyboards; Learning systems; Machine learning; Operating systems; Refrigeration; Speech recognition; Telephony;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Machine Learning and Cybernetics, 2002. Proceedings. 2002 International Conference on
  • Print_ISBN
    0-7803-7508-4
  • Type
    conf
  • DOI
    10.1109/ICMLC.2002.1167411
  • Filename
    1167411