DocumentCode :
119407
Title :
A Reference Framework for the Automated Exploration of Web Applications
Author :
Le Breton, Gabriel ; Bergeron, Nicolas ; Halle, Sylvain
Author_Institution :
Dept. d'Inf. et de Math., Univ. du Québec à Chicoutimi, Chicoutimi, QC, Canada
fYear :
2014
fDate :
4-7 Aug. 2014
Firstpage :
81
Lastpage :
90
Abstract :
Web crawling is the process of exhaustively exploring the contents of a web site or application through automated means. While the results of such a crawl can be put to numerous uses, ranging from a simple backup to comprehensive testing and analysis, features of modern-day applications prevent crawlers from properly exploring them. We provide an in-depth analysis of 15 such features and report on their presence in a study of 16 real-world web sites. Based on that study, we develop a configurable web application in which each such feature can be turned on or off, intended as a test bench where existing crawlers can be compared in a uniform way. Our results, which are the first exhaustive comparison of available crawlers, indicate areas where future work should be aimed.
Keywords :
Internet; information retrieval; Web application exploration; Web crawling; Web site; Browsers; Crawlers; HTML; Navigation; Servers; Testing; Web sites; benchmark; crawlers; web applications;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Engineering of Complex Computer Systems (ICECCS), 2014 19th International Conference on
Conference_Location :
Tianjin
Print_ISBN :
978-1-4799-5481-0
Type :
conf
DOI :
10.1109/ICECCS.2014.20
Filename :
6923122