• DocumentCode
    1872230
  • Title

    Naive Bayes Web Page Classification with HTML Mark-Up Enrichment

  • Author

    Fernández, Víctor Fresno ; Herranz, Soto Montalvo ; Unanue, Raquel Martínez ; Rubio, Arantza Casillas

  • Author_Institution
    ESCET, Univ. Rey Juan Carlos
  • fYear
    2006
  • fDate
    Aug. 2006
  • Firstpage
    48
  • Lastpage
    48
  • Abstract
    In text and Web page classification, Bayesian prior probabilities are usually based on term frequencies, term counts within a page and among all the pages. However, new approaches in Web page representation use HTML mark-up information to find the term relevance in a Web page. This paper presents a naive Bayes Web page classification system for these approaches.
  • Keywords
    Bayes methods; Internet; classification; hypermedia markup languages; Bayesian prior probability; HTML mark-up information; HyperText Markup Language; Web page representation; Web page term count; Web page term frequency; Web page term relevance; naive Bayes Web page classification; text classification; Bayesian methods; Frequency; HTML; Information resources; Search engines; Supervised learning; Telecommunication standards; Text categorization; Web pages
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Computing in the Global Information Technology, 2006. ICCGI '06. International Multi-Conference on
  • Conference_Location
    Bucharest
  • Print_ISBN
    0-7695-2690-X
  • Electronic_ISBN
    0-7695-2690-X
  • Type

    conf

  • DOI
    10.1109/ICCGI.2006.52
  • Filename
    4124067