DocumentCode :
1904161
Title :
Automatic Classification of Tibetan Web Pages
Author :
Xu, Guixian ; Xiang, Chuncheng ; Gao, Xu ; Zhao, Xiaobing ; Yang, Guosheng
Author_Institution :
Coll. of Inf. Eng., Minzu Univ. of China, Beijing, China
Volume :
3
fYear :
2012
fDate :
23-25 March 2012
Firstpage :
423
Lastpage :
426
Abstract :
This paper introduces a classification approach for Tibetan web pages. It combines a class feature dictionary with the Rocchio classification algorithm to assign Tibetan web pages to predefined classes rapidly and accurately. Experimental results show that the approach improves classification accuracy for Tibetan web pages. It is useful for building statistical and rule-based classification of Tibetan texts, as well as for constructing a high-quality Tibetan corpus.
Keywords :
Web sites; natural language processing; pattern classification; statistical analysis; text analysis; Rocchio classification algorithm; Tibetan texts; automatic Tibetan Web page classification; class feature dictionary; high-quality Tibetan corpus; rule-based classification; statistical classification; Classification algorithms; Dictionaries; Information processing; Kernel; Machine learning; Text categorization; Web pages; Classification of Web Pages; Text classification; Tibetan Information Processing;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Computer Science and Electronics Engineering (ICCSEE), 2012 International Conference on
Conference_Location :
Hangzhou
Print_ISBN :
978-1-4673-0689-8
Type :
conf
DOI :
10.1109/ICCSEE.2012.177
Filename :
6188269