• DocumentCode
    3253044
  • Title
    Algorithm for detecting dynamic webpage and its importance
  • Author
    Sultania, A.K.
  • Author_Institution
    Freescale Semicond. Pvt Ltd., Noida, India
  • fYear
    2012
  • fDate
    21-22 Dec. 2012
  • Firstpage
    257
  • Lastpage
    259
  • Abstract
    During web search (crawling, indexing, and relevance), many duplicate web-pages are found under different URLs; these URLs are normalized when used by a crawler. Many web-pages are also found to be dynamic: the same URL returns different content in different search instances. In this paper, we discuss why these dynamic web-pages need to be detected and propose an algorithm to identify this dynamism. URLs can be normalized using the methods explained in [1], [2], and [7], or using the DUST algorithm [3], but a dynamic web-page must be identified before normalization. Implementing the proposed algorithm together with the DUST rules is expected to improve the detection rate of dynamic web-pages, reducing the time spent on crawling, indexing, etc.
  • Keywords
    Web sites; indexing; information retrieval; DUST algorithm; URL; Web search; crawling; duplicate Web-pages; dynamic Webpage detection; Conferences; Heuristic algorithms; Radar tracking; Search engines; World Wide Web; Search engine; URL normalization; Webpage de-duplication; duplicate detection; dynamic webpage
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Radar, Communication and Computing (ICRCC), 2012 International Conference on
  • Conference_Location
    Tiruvannamalai
  • Print_ISBN
    978-1-4673-2756-5
  • Type
    conf
  • DOI
    10.1109/ICRCC.2012.6450590
  • Filename
    6450590