DocumentCode :
2992968
Title :
Design and Implementation of Basic Educational Web Resources Gathering System
Author :
Chaojun, Xu
Author_Institution :
Data Min. Lab., Nanjing Normal Univ., Nanjing, China
fYear :
2011
fDate :
24-28 Sept. 2011
Firstpage :
51
Lastpage :
55
Abstract :
This paper introduces a topic-specific web crawling system that gathers basic educational resources from the web and indexes them for basic education users. Compared with other theme-based crawling systems, the crawler integrates a fuzzy rule-based algorithm with VSM text analysis to predict each URL's relevance to basic education while parsing the HTML code of the currently downloaded page. As a result, the system does not need to save and retrieve low-relevance URLs, which greatly improves its overall efficiency.
Keywords :
Internet; educational computing; fuzzy reasoning; hypermedia markup languages; indexing; relevance feedback; search engines; text analysis; HTML code; VSM text analysis technology; Web crawling system; basic educational Web resources gathering system design; fuzzy rule based algorithm; indexing; low relevant URL retrieval; Accuracy; Cognition; Crawlers; Educational institutions; Internet; Mathematical model; Basic educational resources; Fuzzy rule reasoning; Topic specific crawling;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Complexity and Data Mining (IWCDM), 2011 First International Workshop on
Conference_Location :
Nanjing, Jiangsu
Print_ISBN :
978-1-4577-2007-9
Type :
conf
DOI :
10.1109/IWCDM.2011.20
Filename :
6128416