Title : 
Naive Bayes Web Page Classification with HTML Mark-Up Enrichment
         
        
            Author : 
Fernández, Víctor Fresno ; Herranz, Soto Montalvo ; Unanue, Raquel Martínez ; Rubio, Arantza Casillas
         
        
            Author_Institution : 
ESCET, Univ. Rey Juan Carlos
         
        
            Abstract : 
In text and Web page classification, Bayesian prior probabilities are usually based on term frequencies, that is, term counts within a page and across all pages. However, newer approaches to Web page representation use HTML mark-up information to estimate the relevance of a term in a Web page. This paper presents a naive Bayes Web page classification system for these approaches.
         
        
            Keywords : 
Bayes methods; Internet; classification; hypermedia markup languages; Bayesian prior probability; HTML mark-up information; HyperText Markup Language; Web page representation; Web page term count; Web page term frequency; Web page term relevance; naive Bayes Web page classification; text classification; Bayesian methods; Frequency; HTML; Information resources; Search engines; Supervised learning; Telecommunication standards; Text categorization; Web pages
         
        
            Conference_Title : 
Computing in the Global Information Technology, 2006. ICCGI '06. International Multi-Conference on
         
        
            Conference_Location : 
Bucharest
         
        
            Print_ISBN : 
0-7695-2690-X
         
        
            Electronic_ISBN : 
0-7695-2690-X
         
        
        
            DOI : 
10.1109/ICCGI.2006.52