• DocumentCode
    1904161
  • Title
    Automatic Classification of Tibetan Web Pages
  • Author
    Xu, Guixian ; Xiang, Chuncheng ; Gao, Xu ; Zhao, Xiaobing ; Yang, Guosheng
  • Author_Institution
    Coll. of Inf. Eng., Minzu Univ. of China, Beijing, China
  • Volume
    3
  • fYear
    2012
  • fDate
    23-25 March 2012
  • Firstpage
    423
  • Lastpage
    426
  • Abstract
    This paper introduces a classification approach for Tibetan web pages. It combines a class feature dictionary with the Rocchio classification algorithm to classify Tibetan web pages into predefined classes rapidly and accurately. Experimental results show that the approach achieves better classification accuracy for Tibetan web pages. It is useful for constructing statistical and rule-based classification of Tibetan texts, as well as for building a high-quality Tibetan corpus.
  • Keywords
    Web sites; natural language processing; pattern classification; statistical analysis; text analysis; Rocchio classification algorithm; Tibetan texts; automatic Tibetan Web page classification; class feature dictionary; high-quality Tibetan corpus; rule-based classification; statistical classification; Classification algorithms; Dictionaries; Information processing; Kernel; Machine learning; Text categorization; Web pages; Classification of Web Pages; Text classification; Tibetan Information Processing;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer Science and Electronics Engineering (ICCSEE), 2012 International Conference on
  • Conference_Location
    Hangzhou
  • Print_ISBN
    978-1-4673-0689-8
  • Type
    conf
  • DOI
    10.1109/ICCSEE.2012.177
  • Filename
    6188269