DocumentCode :
1960063
Title :
Distributed Ontology-Driven Focused Crawling
Author :
Campos, Rui ; Rojas, O. ; Marin, Mario ; Mendoza, M.
Author_Institution :
Comput. Sci. Dept., Univ. de Santiago, Santiago, Chile
fYear :
2013
fDate :
Feb. 27 2013-March 1 2013
Firstpage :
108
Lastpage :
115
Abstract :
Focused crawlers are programs designed to download web pages that are relevant to specific topics. Using information gathered at run time, focused crawlers explore the web by following promising hyperlinks and fetching only pages that appear to be relevant. These crawlers are receiving increasing attention because they support the construction of vertical search engines, which let users focus on specific topics, provide higher accuracy, and reduce the computational cost of query processing. In this article, we introduce an efficient focused crawling strategy in which a number of distributed focused crawlers retrieve pages relevant to a given knowledge domain. We propose an ontology-based knowledge representation approach to drive the crawlers to specific segments of the web. Experimental results with actual samples of the web show the feasibility and efficiency of our strategy.
Keywords :
Internet; ontologies (artificial intelligence); query processing; search engines; Web pages; distributed ontology-driven focused crawling; efficient focused crawling strategy; fetching; ontology-based knowledge representation approach; promissory hyperlinks; query processing; vertical search engines; Crawlers; HTML; Ontologies; Search engines; Uniform resource locators; Vectors; Web pages; Focused crawling; ontologies; vertical search;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Parallel, Distributed and Network-Based Processing (PDP), 2013 21st Euromicro International Conference on
Conference_Location :
Belfast
ISSN :
1066-6192
Print_ISBN :
978-1-4673-5321-2
Electronic_ISBN :
1066-6192
Type :
conf
DOI :
10.1109/PDP.2013.23
Filename :
6498540