DocumentCode :
568071
Title :
Improving SVM on web content classification by document formulation
Author :
Xia, Tian ; Chai, Yanmei ; Wang, Tong
Author_Institution :
Dept. of Comput. & Inf. Sci., Shanghai Second Polytech. Univ., Shanghai, China
fYear :
2012
fDate :
14-17 July 2012
Firstpage :
110
Lastpage :
113
Abstract :
Web content is growing at an overwhelming rate. Online documents, webpages, e-books, and similar resources are very useful, but finding the relevant ones is time-consuming. Text categorization is one solution to this problem. Among text categorization methods, the Support Vector Machine (SVM) is one of the most widely accepted. However, to perform more efficiently on webpages, it needs improvement. For webpages, the document title is meaningful because it is usually carefully written by editors and generally reflects the main content of the page. In this paper, an improvement to the SVM is proposed: a document representation for SVM that emphasizes the features in a document's title, which is commonly present in webpages and contains essential contextual information about the document.
Keywords :
Internet; pattern classification; support vector machines; text analysis; SVM; Web content classification; Webpages; contextual information; document formulation; document representation; document title; e-books; online documents; support vector machines; text categorization method; Educational institutions; Kernel; Support vector machine classification; Text categorization; Training; Vectors; Natural language Processing; SVM; Support Vector Machines; Text Classification; Title Vector;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Computer Science & Education (ICCSE), 2012 7th International Conference on
Conference_Location :
Melbourne, VIC
Print_ISBN :
978-1-4673-0241-8
Type :
conf
DOI :
10.1109/ICCSE.2012.6295037
Filename :
6295037