DocumentCode :
3222391
Title :
Random forest classifier for multi-category classification of web pages
Author :
Win Thanda Aung ; Khin Hay Mar Saw Hla
Author_Institution :
Univ. of Comput. Studies, Yangon, Myanmar
fYear :
2009
fDate :
7-11 Dec. 2009
Firstpage :
372
Lastpage :
376
Abstract :
Web page classification is the automated assignment of predefined subject categories to documents. Automatic Web page classification is one of the most essential techniques for Web mining, given that the Web is a huge repository of varied information, including images and videos, and Web pages must be categorized to satisfy user needs. Classifying Web pages into categories by hand relies exclusively on manpower, which costs much time and effort. To alleviate this manual classification problem, more researchers are focusing on Web page classification technology. In this paper, we propose a Random Forest (RF) classifier, based on the random forest method, for multi-category Web page classification. The proposed RF classifier can classify Web pages efficiently according to their corresponding class without using additional feature selection methods. We compare the accuracy of the proposed approach with that of a decision tree classifier on the same Yahoo Web pages. The experiments show that the proposed approach is suitable for multi-category Web page classification.
Keywords :
Internet; Web sites; classification; decision trees; information retrieval; random processes; Web mining; Yahoo Web pages; automatic Web page classification; decision tree classifier; document predefined subject category; feature selection methods; multicategory Web page classification; multicategory classification; random forest classifier; Classification tree analysis; Decision trees; Machine learning algorithms; Nearest neighbor searches; Neural networks; Radio frequency; Support vector machine classification; Support vector machines; Training data; Web pages; multi-category web page classification; random forest classifier
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Services Computing Conference, 2009. APSCC 2009. IEEE Asia-Pacific
Conference_Location :
Singapore
Print_ISBN :
978-1-4244-5338-2
Electronic_ISBN :
978-1-4244-5336-8
Type :
conf
DOI :
10.1109/APSCC.2009.5394100
Filename :
5394100