• DocumentCode
    2870027
  • Title
    An Examination of Genre Attributes for Web Page Classification
  • Author
    Dong, Lei ; Watters, Carolyn ; Duffy, Jack ; Shepherd, Michael
  • Author_Institution
    Dalhousie Univ., Halifax
  • fYear
    2008
  • fDate
    7-10 Jan. 2008
  • Firstpage
    133
  • Lastpage
    133
  • Abstract
    In this paper, we describe a set of experiments to examine the effect of various attributes of web genre on the automatic identification of the genre of web pages. Four different genres are used in the data set, namely, FAQ, News, E-Shopping and Personal Home Pages. The effects of the number of features used to represent the web pages (5, 20, or 100), as well as the types of attributes, <content, form, functionality>, singly and in various combinations, are examined. The results indicate that fewer features produce better precision but more features produce better recall, and that attributes in combination always perform better than single attributes.
  • Keywords
    Internet; classification; FAQ data set; Web page classification; e-shopping; genre attributes; news data set; personal home pages; Computer science; Graphics; Lifting equipment; Machine learning; Navigation; Search engines; Web pages; Web search; Web sites
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Hawaii International Conference on System Sciences, Proceedings of the 41st Annual
  • Conference_Location
    Waikoloa, HI
  • ISSN
    1530-1605
  • Type
    conf
  • DOI
    10.1109/HICSS.2008.53
  • Filename
    4438836