DocumentCode :
3750107
Title :
Intelligent web crawler for file safety inspection
Author :
Ling Cong Xiang;Ooi Shih Yin;Pang Ying Han
Author_Institution :
Faculty of Science and Information Technology, Multimedia University, Malacca, Malaysia
fYear :
2015
Firstpage :
309
Lastpage :
314
Abstract :
The Internet is constantly growing with content and information added by many types of users. Without proper storage and indexing, this content can easily be lost in the sea of information the Internet houses. Hence, an automated program known as a web crawler is used to index content added to the Internet. With proper configuration and settings, a web crawler can serve purposes beyond web indexing, including downloading files from the web. Millions, if not billions, of files are uploaded to the Internet, and most sites hosting these files give no direct indication of whether a file is safe and free of malicious code. This paper therefore presents the construction of a web crawler that crawls all the pages in a given website domain and downloads every downloadable file linked from those pages, for the purpose of file safety inspection.
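A minimal sketch of the crawling step the abstract describes: collecting same-domain page links and candidate downloadable-file links from a fetched page. The file-extension list, function names, and URLs below are illustrative assumptions, not the authors' implementation.

```python
# Sketch (assumption, not the paper's code): split a page's links into
# same-domain pages to crawl next and candidate files to download for
# safety inspection.
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

# Assumed extension set for "downloadable" files; the paper's actual list may differ.
FILE_EXTENSIONS = {".exe", ".zip", ".pdf", ".doc", ".apk"}

class LinkCollector(HTMLParser):
    """Collects href targets from <a> tags."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def classify_links(html, base_url):
    """Return (pages, files): same-domain page URLs to crawl next,
    and same-domain URLs that look like downloadable files."""
    parser = LinkCollector()
    parser.feed(html)
    domain = urlparse(base_url).netloc
    pages, files = [], []
    for href in parser.links:
        url = urljoin(base_url, href)
        parsed = urlparse(url)
        if parsed.netloc != domain:
            continue  # stay inside the given website domain
        path = parsed.path.lower()
        if any(path.endswith(ext) for ext in FILE_EXTENSIONS):
            files.append(url)
        else:
            pages.append(url)
    return pages, files

# Hypothetical page content for demonstration.
sample = ('<a href="/about.html">About</a> '
          '<a href="/dl/tool.zip">Tool</a> '
          '<a href="http://other.com/x">Ext</a>')
pages, files = classify_links(sample, "http://example.com/index.html")
print(pages)   # ['http://example.com/about.html']
print(files)   # ['http://example.com/dl/tool.zip']
```

A full crawler would wrap `classify_links` in a breadth-first loop over `pages`, fetching each URL and passing every entry of `files` to the safety-inspection stage.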
Keywords :
"Crawlers","Internet","Web pages","Servers","Indexes","Google","HTML"
Publisher :
ieee
Conference_Title :
Signal and Image Processing Applications (ICSIPA), 2015 IEEE International Conference on
Type :
conf
DOI :
10.1109/ICSIPA.2015.7412210
Filename :
7412210