Title :
Crawling web pages with application in online advertises monitoring system
Author :
Xie Zhengao ; Su Shoubao ; Xu Huali
Author_Institution :
Sch. of Software, Univ. of Sci. & Technol. of China, Hefei, China
Abstract :
Motivated by the forms and features of online advertising, an effective web page crawling method, called 'Spider', is designed and implemented by analyzing the information carriers and script code of web pages. Building on search engine techniques, a duplicate-elimination method is proposed that employs preemptive multi-threading. It addresses the excessive consumption of system resources and network bandwidth that arises when the Spider repeatedly crawls and downloads duplicate information while searching the Internet.
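The abstract does not give implementation details, so the following is only a minimal sketch of the general idea it describes: a multi-threaded crawler that avoids downloading the same page twice. Everything here is an assumption made for illustration and not the authors' Spider: the names `crawl`, `normalize`, and `LinkParser`, the injected `fetch(url)` callable, the `max_pages` and `workers` parameters, and the use of a Python thread pool as a stand-in for the preemptive multi-threading mentioned in the paper.

```python
from concurrent.futures import ThreadPoolExecutor, wait, FIRST_COMPLETED
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag


class LinkParser(HTMLParser):
    """Collects href values from <a> tags (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def normalize(base, href):
    """Resolve href against base and drop the #fragment so equal pages compare equal."""
    url, _fragment = urldefrag(urljoin(base, href))
    return url


def crawl(seeds, fetch, max_pages=100, workers=4):
    """Crawl from seeds using worker threads; fetch(url) must return HTML text.

    Returns a dict mapping each visited URL to its HTML. Assumed design:
    downloads run in worker threads, while a single coordinating thread
    owns the `seen` set, so no lock is needed for duplicate checks.
    """
    seen = set()
    pages = {}

    def claim(url):
        # Refuse URLs already scheduled, and stop once the page budget is used.
        if url in seen or len(seen) >= max_pages:
            return False
        seen.add(url)
        return True

    def visit(url):
        # Runs in a worker thread: download the page and extract its links.
        html = fetch(url)
        parser = LinkParser()
        parser.feed(html)
        return url, html, [normalize(url, h) for h in parser.links]

    with ThreadPoolExecutor(max_workers=workers) as pool:
        pending = {pool.submit(visit, u) for u in seeds if claim(u)}
        while pending:
            done, pending = wait(pending, return_when=FIRST_COMPLETED)
            for future in done:
                url, html, links = future.result()
                pages[url] = html
                for link in links:
                    if claim(link):  # duplicate URLs are never re-downloaded
                        pending.add(pool.submit(visit, link))
    return pages
```

Passing `fetch` in as a parameter keeps the sketch independent of any particular HTTP library; a caller could supply a function built on `urllib.request.urlopen`, or an in-memory dictionary lookup for testing.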
Keywords :
Internet; advertising data processing; information retrieval; multi-threading; search engines; Spider; Web crawling page method; multithreading technique; online advertise monitoring system; search engine techniques; HTML; World Wide Web; crawling web pages; online advertising
Conference_Title :
Circuits, Communications and System (PACCS), 2010 Second Pacific-Asia Conference on
Conference_Location :
Beijing
Print_ISBN :
978-1-4244-7969-6
DOI :
10.1109/PACCS.2010.5627009