DocumentCode
1904161
Title
Automatic Classification of Tibetan Web Pages
Author
Xu, Guixian ; Xiang, Chuncheng ; Gao, Xu ; Zhao, Xiaobing ; Yang, Guosheng
Author_Institution
Coll. of Inf. Eng., Minzu Univ. of China, Beijing, China
Volume
3
fYear
2012
fDate
23-25 March 2012
Firstpage
423
Lastpage
426
Abstract
This paper introduces a classification approach for Tibetan web pages. It uses a class feature dictionary and the Rocchio classification algorithm to classify Tibetan web pages into predefined classes rapidly and accurately. Experimental results show that the approach improves classification accuracy for Tibetan web pages. The approach is useful for building statistical and rule-based classification of Tibetan texts and for constructing a high-quality Tibetan corpus.
Keywords
Web sites; natural language processing; pattern classification; statistical analysis; text analysis; Rocchio classification algorithm; Tibetan texts; automatic Tibetan Web page classification; class feature dictionary; high-quality Tibetan corpus; rule-based classification; statistical classification; Classification algorithms; Dictionaries; Information processing; Kernel; Machine learning; Text categorization; Web pages; Classification of Web Pages; Text classification; Tibetan Information Processing;
fLanguage
English
Publisher
IEEE
Conference_Titel
Computer Science and Electronics Engineering (ICCSEE), 2012 International Conference on
Conference_Location
Hangzhou
Print_ISBN
978-1-4673-0689-8
Type
conf
DOI
10.1109/ICCSEE.2012.177
Filename
6188269