• DocumentCode
    3262657
  • Title
    Analyzing web layout structures using graph mining
  • Author
    Lam, Winnie W M ; Chan, Keith C C
  • Author_Institution
    Dept. of Comput., Hong Kong Polytech. Univ., Hong Kong
  • fYear
    2008
  • fDate
    26-28 Aug. 2008
  • Firstpage
    361
  • Lastpage
    366
  • Abstract
    The layout of a Web page commonly offers a limited variety of elements arranged in a number of ways, for example, in navigation panels, or as advertisements, text content, and images. Presumably, the layout of a Web page will influence the way it is used, and this may or may not match the intentions of its designers. In this paper, we propose a novel graph mining algorithm and apply it to the commercially important problem of which specific layout patterns and features affect advertising click rates, and how. Our proposed algorithm, MIGDAC (mining graph data for classification), applies graph theory and an interestingness measure to discover interesting subgraphs that allow one class to be both characterized and easily distinguished from other classes. We first extract information from the Web pages as blocks and transform that information into sets of graphs. MIGDAC then uses an interestingness measure and threshold to extract a set of class-specific patterns from the frequent subgraphs of each class. We then calculate the weight of evidence to estimate whether the layout of an unseen Web page will positively or negatively influence its advertisement click rate. The experiment is performed on a set of real Web pages from a local Web site. MIGDAC performs well, greatly improving on the accuracy of a traditional frequent graph mining algorithm.
  • Keywords
    Web design; data mining; graph theory; pattern classification; MIGDAC; Web layout structures; Web page; advertising click rates; graph mining algorithm; local Web site; mining graph data for classification; Advertising; Classification algorithms; Data mining; Databases; Graph theory; Navigation; Pattern analysis; Pattern matching; Web mining; Web pages
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Granular Computing, 2008. GrC 2008. IEEE International Conference on
  • Conference_Location
    Hangzhou
  • Print_ISBN
    978-1-4244-2512-9
  • Electronic_ISBN
    978-1-4244-2513-6
  • Type
    conf
  • DOI
    10.1109/GRC.2008.4664741
  • Filename
    4664741