DocumentCode :
3725303
Title :
SAFSB: A self-adaptive focused crawler
Author :
Dilip Kumar Sharma;Mohd Aamir Khan
Author_Institution :
Department of Computer Engineering & Applications, G.L.A. University, Mathura, India
fYear :
2015
Firstpage :
719
Lastpage :
724
Abstract :
About 3 billion indexed websites are present on the WWW. Not all websites that belong to a particular topic are indexed by a search engine such as google.com; there are online platforms where users help a person who asks for a URL (Uniform Resource Locator) containing topical information. To verify the authenticity and validity of such a URL, this paper presents an empirical methodology and a ranking to measure its relevancy. To semantically expand the search, a topic ontology is used in the pre-processing stage of the focused crawler to make the search more effective. The performance of our web crawler is further increased by ontology-based learning, which is constantly updated by dictionary-based learning and by related words of the named entities. The harvest ratio, which represents the ratio of relevant pages to crawled pages, shows a significant improvement over previous methods.
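As a minimal sketch of the harvest ratio mentioned in the abstract, the metric can be computed as relevant pages divided by crawled pages. The function name, the input validation, and the convention of returning 0.0 for an empty crawl are assumptions made for illustration; they are not taken from the paper.

```python
def harvest_ratio(relevant_pages: int, crawled_pages: int) -> float:
    """Return the fraction of crawled pages that were judged relevant.

    Hypothetical helper: the abstract only defines the ratio itself,
    not how edge cases are handled.
    """
    if relevant_pages < 0 or crawled_pages < 0:
        raise ValueError("page counts must be non-negative")
    if relevant_pages > crawled_pages:
        raise ValueError("relevant pages cannot exceed crawled pages")
    if crawled_pages == 0:
        # Assumed convention: an empty crawl has a harvest ratio of 0.
        return 0.0
    return relevant_pages / crawled_pages


if __name__ == "__main__":
    # Illustrative numbers only, not results from the paper.
    print(harvest_ratio(30, 120))
```

A higher value means the crawler spent a larger share of its downloads on on-topic pages.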
Keywords :
"Crawlers","Ontologies","Semantics","Computers","Uniform resource locators","Pragmatics","Next generation networking"
Publisher :
IEEE
Conference_Title :
Next Generation Computing Technologies (NGCT), 2015 1st International Conference on
Type :
conf
DOI :
10.1109/NGCT.2015.7375215
Filename :
7375215