DocumentCode
568071
Title
Improving SVM on web content classification by document formulation
Author
Xia, Tian ; Chai, Yanmei ; Wang, Tong
Author_Institution
Dept. of Comput. & Inf. Sci., Shanghai Second Polytech. Univ., Shanghai, China
fYear
2012
fDate
14-17 July 2012
Firstpage
110
Lastpage
113
Abstract
The volume of Web content today is overwhelming. Online documents, webpages, e-books, and similar resources are very useful, but finding the relevant ones is time-consuming. Text categorization is one solution to this problem. Among text categorization methods, the Support Vector Machine (SVM) is one of the most widely accepted. However, to perform more efficiently on webpages, it needs to be improved. A webpage's title is meaningful because it is usually carefully written by editors and generally reflects the main content of the page. This paper proposes an improvement to the SVM: a document representation that emphasizes the features found in the document title, which is common in webpages and carries essential contextual information about the document.
Keywords
Internet; pattern classification; support vector machines; text analysis; SVM; Web content classification; Webpages; contextual information; document formulation; document representation; document title; e-books; online documents; text categorization method; Educational institutions; Kernel; Support vector machine classification; Text categorization; Training; Vectors; Natural language processing; Text Classification; Title Vector;
fLanguage
English
Publisher
IEEE
Conference_Titel
Computer Science & Education (ICCSE), 2012 7th International Conference on
Conference_Location
Melbourne, VIC
Print_ISBN
978-1-4673-0241-8
Type
conf
DOI
10.1109/ICCSE.2012.6295037
Filename
6295037