DocumentCode :
3285164
Title :
Structure and Content Based Blog Pages Identification
Author :
Yu, Feng ; Zheng, Dequan ; Zhao, Tiejun ; Cheng, Xiao
Author_Institution :
Sch. of Comput. & Inf. Eng., Harbin Univ. of Commerce, Harbin
Volume :
2
fYear :
2008
fDate :
18-20 Oct. 2008
Firstpage :
213
Lastpage :
217
Abstract :
Blogs are becoming increasingly popular with the rapid development of the Internet. An automatic way to distinguish blog pages from ordinary Web pages is needed for blog content extraction and blog community discovery. This paper describes some basic concepts and ideas in the area of blogs and proposes a method for blog page identification based on blog page structure and blog content. Experiments show that the method achieves high precision.
Keywords :
Internet; Web sites; Web pages; blog pages identification; Business; Fuzzy systems; Information services; Knowledge engineering; Navigation; Support vector machine classification; Support vector machines; Blog; Blog Structure and Content; Broad Blog; Narrow Blog
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Fuzzy Systems and Knowledge Discovery, 2008. FSKD '08. Fifth International Conference on
Conference_Location :
Shandong
Print_ISBN :
978-0-7695-3305-6
Type :
conf
DOI :
10.1109/FSKD.2008.371
Filename :
4666110