DocumentCode
483197
Title
Design and Implementation of University Focused Crawler Based on BP Network Classifier
Author
Jiang, Hua ; Han, Bing ; Ying Lin ; Dan Zuo ; Yong Xing Ge
Author_Institution
Comput. Sch., Northeast Normal Univ., Changchun
fYear
2009
fDate
23-25 Jan. 2009
Firstpage
44
Lastpage
47
Abstract
The rapid growth of the World-Wide Web poses unprecedented scaling challenges for general-purpose crawlers and search engines. Crawling the Web quickly and entirely is an expensive, unrealistic goal because of the required hardware and network resources. A focused crawler is an agent that targets a particular topic, visiting and gathering only a relevant, narrow Web segment while trying not to waste resources on irrelevant material. It can be used to build domain-specific Web search portals and online personalized search tools. In this paper, we describe the design and implementation of a university focused crawler that uses a BP (backpropagation) network classifier to predict the links leading to relevant pages. We present the flow of the system, discuss its performance, and report the experimental results. Our experiments show that the BP classifier performs very well in obtaining accurate, relevant university Web resources.
Keywords
Internet; backpropagation; online front-ends; pattern classification; search engines; BP network classifier; World-Wide Web; domain-specific Web search portal; online personalized search tool; university focused crawler; Computer networks; Crawlers; Data mining; Educational institutions; Hardware; Portals; Uniform resource locators; Waste materials; Web pages; BP network; Crawler; Web resources; domain specific
fLanguage
English
Publisher
IEEE
Conference_Titel
Knowledge Discovery and Data Mining, 2009. WKDD 2009. Second International Workshop on
Conference_Location
Moscow
Print_ISBN
978-0-7695-3543-2
Type
conf
DOI
10.1109/WKDD.2009.77
Filename
4771874