Title :
An Algorithm OFC for the Focused Web Crawler
Author_Institution :
Zhejiang Inst. of Commun. & Media, Hangzhou
Abstract :
Based on reinforcement learning and fuzzy clustering theory, this paper proposes OFC, an algorithm for the focused web crawler. We combine naive Bayes classifiers with the fuzzy center-averaged clustering method to calculate the fuzzy memberships that are used to solve the value function mapping hyperlinks to future discounted rewards. Online estimation of the immediate action reward and classification of newly crawled web pages incrementally improve crawling performance. We report several experiments that test the performance of OFC.
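The abstract does not spell out the OFC procedure, so the following is only a rough sketch of the two ideas it names: a discounted future reward, and fuzzy memberships used to weight per-cluster value estimates for a hyperlink. It substitutes standard fuzzy c-means style memberships for the paper's center-averaged method, and every name here (`discounted_return`, `fuzzy_memberships`, `link_value`), the fuzzifier `m = 2`, and the discount factor `gamma` are assumptions for illustration, not taken from the paper.

```python
import math

def discounted_return(rewards, gamma=0.5):
    """Sum of rewards discounted by gamma per step: r0 + gamma*r1 + ..."""
    return sum(r * gamma ** t for t, r in enumerate(rewards))

def fuzzy_memberships(x, centers, m=2.0):
    """Fuzzy c-means style memberships of point x to each center (they sum to 1).

    Assumption: this is the textbook FCM membership formula, used here as a
    stand-in for the paper's center-averaged clustering.
    """
    dists = [math.dist(x, c) for c in centers]
    # A point lying exactly on a center belongs fully to that center.
    for i, d in enumerate(dists):
        if d == 0.0:
            return [1.0 if j == i else 0.0 for j in range(len(centers))]
    exponent = 2.0 / (m - 1.0)
    inverse = [d ** -exponent for d in dists]
    total = sum(inverse)
    return [v / total for v in inverse]

def link_value(x, centers, cluster_values, m=2.0):
    """Membership-weighted average of hypothetical per-cluster value estimates."""
    memberships = fuzzy_memberships(x, centers, m)
    return sum(u * v for u, v in zip(memberships, cluster_values))
```

In this sketch a hyperlink's feature vector `x` that lies close to a cluster with a high estimated value receives a high score, which a crawler could use to order its frontier; how OFC actually estimates the per-cluster values is not stated in the abstract.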
Keywords :
Bayes methods; Internet; fuzzy set theory; learning (artificial intelligence); pattern clustering; focused Web crawler; fuzzy center-averaged clustering method; fuzzy clustering theory; fuzzy memberships; naive Bayes classifiers; reinforcement learning; Clustering algorithms; Clustering methods; Crawlers; Cybernetics; Fuzzy sets; Machine learning; Machine learning algorithms; Testing; Uniform resource locators; Web pages; Clustering; Focused crawler; Fuzzy set; OFC;
Conference_Title :
Machine Learning and Cybernetics, 2007 International Conference on
Conference_Location :
Hong Kong
Print_ISBN :
978-1-4244-0973-0
Electronic_ISBN :
978-1-4244-0973-0
DOI :
10.1109/ICMLC.2007.4370856