• DocumentCode
    3121453
  • Title
    AJAX Crawl: Making AJAX Applications Searchable
  • Author
    Duda, Cristian ; Frey, Gianni ; Kossmann, Donald ; Matter, Reto ; Zhou, Chong
  • Author_Institution
    ETH Zurich, Zurich
  • fYear
    2009
  • fDate
    March 29 2009-April 2 2009
  • Firstpage
    78
  • Lastpage
    89
  • Abstract
    Current search engines such as Google and Yahoo! are prevalent for searching the Web. Search on dynamic client-side Web pages is, however, either nonexistent or far from perfect, and is not addressed by existing work, for example on the Deep Web. This is a real impediment since AJAX and Rich Internet Applications are already very common on the Web. AJAX applications are composed of states which can be seen by the user, but not by the search engine, and which the user changes through client-side events. Current search engines either ignore AJAX applications or produce false negatives. The reason is that crawling client-side code is a difficult problem that cannot be solved naively by invoking user events. The challenges are: lack of caching, duplicate state detection, very granular events, reducing the number of AJAX calls, and infinite event invocation. This paper sets the stage for this new search challenge and proposes a solution: it shows how an AJAX Web application can be crawled at the granularity of the application states. A model of AJAX Web sites is presented. An AJAX Crawler and optimizations for caching and duplicate elimination are defined, and finally, the gain in search result quality and the corresponding performance price are evaluated on YouTube, a real AJAX application.
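    The idea of crawling at the granularity of application states can be illustrated with a minimal sketch. This is not the authors' crawler: the names `crawl_states`, `events_of`, and `fire`, the hash-based duplicate check, the `max_states` bound, and the toy three-page comment application are all assumptions made for illustration only.

```python
import hashlib
from collections import deque


def state_hash(content: str) -> str:
    """Identify a state by a hash of its content (assumed duplicate test)."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def crawl_states(initial, events_of, fire, max_states=100):
    """Breadth-first crawl over application states.

    initial    -- content of the initial state (a string)
    events_of  -- hypothetical callback: content -> iterable of event names
    fire       -- hypothetical callback: (content, event) -> new content
    max_states -- bound that guards against infinite event invocation

    Returns (states, transitions): states maps hash -> content,
    transitions is a list of (source_hash, event, target_hash).
    """
    start = state_hash(initial)
    states = {start: initial}
    transitions = []
    queue = deque([start])
    while queue:
        src = queue.popleft()
        for event in events_of(states[src]):
            content = fire(states[src], event)
            dst = state_hash(content)
            if dst not in states:
                if len(states) >= max_states:
                    continue  # bound reached: do not expand further
                states[dst] = content  # new state: store and explore later
                queue.append(dst)
            # Duplicate states are not re-explored; only the edge is kept.
            transitions.append((src, event, dst))
    return states, transitions


# Toy application (assumption): three comment pages linked by next/prev.
def toy_events(content):
    page = int(content.split()[-1])
    events = []
    if page > 1:
        events.append("prev")
    if page < 3:
        events.append("next")
    return events


def toy_fire(content, event):
    page = int(content.split()[-1])
    step = 1 if event == "next" else -1
    return f"comments page {page + step}"


states, transitions = crawl_states("comments page 1", toy_events, toy_fire)
```

    In this toy run, going back with `prev` reaches a state whose hash is already known, so it is recorded as a transition but not crawled again. That is the role duplicate elimination plays in keeping the crawl finite.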
  • Keywords
    Java; search engines; AJAX; Google; Web pages; Yahoo!; YouTube; caching; client-side events; deep Web; duplicate elimination; rich Internet applications; Crawlers; Data engineering; Event detection; Impedance; Internet; Uniform resource locators
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Engineering, 2009. ICDE '09. IEEE 25th International Conference on
  • Conference_Location
    Shanghai
  • ISSN
    1084-4627
  • Print_ISBN
    978-1-4244-3422-0
  • Electronic_ISBN
    1084-4627
  • Type
    conf
  • DOI
    10.1109/ICDE.2009.90
  • Filename
    4812393