• DocumentCode
    2199789
  • Title

    Dynamic web page segmentation based on detecting reappearance and layout of tag patterns for small screen devices

  • Author

    Rajkumar, K. ; Kalaivani, V.

  • Author_Institution
    Dept. of CSE (PG), Nat. Eng. Coll., Kovilpatti, India
  • fYear
    2012
  • fDate
    19-21 April 2012
  • Firstpage
    508
  • Lastpage
    513
  • Abstract
    Web sites are normally designed for large-screen devices, so their pages are not easy to browse on devices with small screens and limited user interfaces, such as palmtop and mobile devices. Web page segmentation is an important technology for both search engines and web browsers on mobile devices. It breaks the structure of a web page down into logical blocks, which is an important step in identifying informative blocks for efficient information extraction and for convenient display on small-screen devices. The previous repetition-based segmentation method is not suitable for segmenting blocks when the web page contains no reappearing tags. To improve segmentation accuracy, a new segmentation method, Dynamic Web page Segmentation (DWS), is introduced. DWS segments a web page using one of two schemes. The reappearance-based scheme recognizes reappearing tag patterns in the DOM tree structure of the web page and, based on the detected tag patterns, generates implicit nodes so that nested blocks are segmented correctly. The layout-based scheme segments the page using web layout information such as <TABLE>, <DIV> and <FRAME> tags. The choice depends on the key pattern in the web page: if the tag pattern contains a reappearing tag, the page is segmented by the reappearance-based scheme; otherwise it is segmented using web layout information. Hyperlinks from the segmented blocks are displayed on the mobile device first, and the user then selects hyperlinks according to their area of interest. Only the information of interest is displayed to the user.
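    The scheme selection described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: it assumes that "reappearance" means a tag name occurring more than once among the children of one DOM node, and it omits implicit-node generation and key-pattern extraction entirely. The names `Node`, `TreeBuilder`, `find_repeating_parent`, `layout_blocks` and `segment` are hypothetical.

```python
from collections import Counter
from html.parser import HTMLParser

# Layout tags named in the abstract; the void-element list is an assumption.
LAYOUT_TAGS = {"table", "div", "frame"}
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link", "frame"}


class Node:
    """A bare-bones DOM element: tag name, parent and child elements."""

    def __init__(self, tag, parent=None):
        self.tag = tag
        self.parent = parent
        self.children = []


class TreeBuilder(HTMLParser):
    """Builds a simplified element tree (text nodes are ignored)."""

    def __init__(self):
        super().__init__()
        self.root = Node("root")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, self.current)
        self.current.children.append(node)
        if tag not in VOID_TAGS:  # void elements never get children
            self.current = node

    def handle_endtag(self, tag):
        # Close the nearest open element with this tag, if there is one.
        node = self.current
        while node is not self.root and node.tag != tag:
            node = node.parent
        if node is not self.root:
            self.current = node.parent


def find_repeating_parent(node):
    """Depth-first search for the first node whose children repeat a tag."""
    counts = Counter(child.tag for child in node.children)
    if any(n > 1 for n in counts.values()):
        return node
    for child in node.children:
        found = find_repeating_parent(child)
        if found is not None:
            return found
    return None


def layout_blocks(node):
    """Collect the outermost layout elements below the given node."""
    blocks = []
    for child in node.children:
        if child.tag in LAYOUT_TAGS:
            blocks.append(child)
        else:
            blocks.extend(layout_blocks(child))
    return blocks


def segment(html):
    """Return (scheme, block tags): reappearance-based if possible, else layout-based."""
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    parent = find_repeating_parent(builder.root)
    if parent is not None:
        counts = Counter(child.tag for child in parent.children)
        blocks = [c for c in parent.children if counts[c.tag] > 1]
        return "reappearance", [b.tag for b in blocks]
    return "layout", [b.tag for b in layout_blocks(builder.root)]
```

    Under these assumptions, a page fragment with repeated list items is routed to the reappearance-based scheme, while a fragment containing only a single DIV and a single TABLE falls back to the layout-based scheme.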
  • Keywords
    Web sites; document handling; mobile computing; online front-ends; search engines; DOM tree structure; DWS; Web browser; block hyperlink segmentation; document object model; dynamic Web page segmentation; information extraction; mobile device; nested block segmentation; reappearance based scheme; reappearance detection; reappearance tag pattern recognition; search engine; small screen devices; tag pattern layout; Bandwidth; Feature extraction; HTML; Heuristic algorithms; Layout; Mobile handsets; Web pages; DOM (Document Object Model); DWS (Dynamic web page segmentation); Reappearance key Patterns; Web page Segmentation;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Recent Trends In Information Technology (ICRTIT), 2012 International Conference on
  • Conference_Location
    Chennai, Tamil Nadu
  • Print_ISBN
    978-1-4673-1599-9
  • Type

    conf

  • DOI
    10.1109/ICRTIT.2012.6206790
  • Filename
    6206790