DocumentCode :
1837770
Title :
Challenges and design issues of an Arabic web crawler
Author :
Abdeen, Mohammad ; Tolba, Mohamed F.
Author_Institution :
Dept. of Comput. Sci., Ain-Shams Univ., Cairo, Egypt
fYear :
2010
fDate :
Nov. 30 2010-Dec. 2 2010
Firstpage :
203
Lastpage :
206
Abstract :
The World Wide Web has become the favorite destination of information seekers across the globe. With billions of web pages, information on just about any topic is a click away. The Arabic web represents an important portion of the web. With Arabic the 5th most spoken language in the world and the number of Arabic internet users growing at exponential rates, it is becoming important to make Arabic web content available. Many search engines exist, such as Google, MSN, and Yahoo. However, these search engines were not designed with proper consideration of the Arabic language and its special nature. This paper presents some of the work done while developing an Arabic search engine, with the web crawler as its focus. We present important challenges and design issues in building a distributed web crawler for the Arabic search engine we are developing. We also uncover some facts about the Arabic web.
Keywords :
Internet; natural language processing; search engines; Arabic Web crawler; Arabic search engine; Google; Msn; Yahoo; distributed Web crawler; spoken language; Crawlers; Encoding; Engines; Internet; Memory management; Search engines; Web pages; Arabic Web; Distributed Applications; Information retrieval; Search Engines; Web crawlers;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computer Engineering and Systems (ICCES), 2010 International Conference on
Conference_Location :
Cairo
Print_ISBN :
978-1-4244-7040-2
Type :
conf
DOI :
10.1109/ICCES.2010.5674854
Filename :
5674854