DocumentCode
1872230
Title
Naive Bayes Web Page Classification with HTML Mark-Up Enrichment
Author
Fernández, Víctor Fresno ; Herranz, Soto Montalvo ; Unanue, Raquel Martínez ; Rubio, Arantza Casillas
Author_Institution
ESCET, Univ. Rey Juan Carlos
fYear
2006
fDate
Aug. 2006
Firstpage
48
Lastpage
48
Abstract
In text and Web page classification, Bayesian prior probabilities are usually based on term frequencies, that is, term counts within a page and across all pages. However, new approaches to Web page representation use HTML mark-up information to determine term relevance within a Web page. This paper presents a naive Bayes Web page classification system for these approaches.
Keywords
Bayes methods; Internet; classification; hypermedia markup languages; Bayesian prior probability; HTML mark-up information; HyperText Markup Language; Web page representation; Web page term count; Web page term frequency; Web page term relevance; naive Bayes Web page classification; text classification; Bayesian methods; Frequency; HTML; Information resources; Search engines; Supervised learning; Telecommunication standards; Text categorization; Web pages
fLanguage
English
Publisher
IEEE
Conference_Titel
Computing in the Global Information Technology, 2006. ICCGI '06. International Multi-Conference on
Conference_Location
Bucharest
Print_ISBN
0-7695-2690-X
Electronic_ISBN
0-7695-2690-X
Type
conf
DOI
10.1109/ICCGI.2006.52
Filename
4124067