DocumentCode :
2313314
Title :
Improving web search result categorization using knowledge from web taxonomy
Author :
Jinarat, Supakpong ; Haruechaiyasak, Choochart ; Rungsawang, Amon
Author_Institution :
Dept. of Comput. Eng., Kasetsart Univ., Bangkok
fYear :
2009
fDate :
6-9 May 2009
Firstpage :
726
Lastpage :
730
Abstract :
Finding relevant information in a long list of search results returned by a general-purpose search engine can be difficult. Categorizing the results is one way to address this problem. One possible approach uses an external resource such as the Open Directory Project (ODP) to map the URLs of search results to ODP categories. However, the ODP covers only a portion of the URLs returned by a search engine. In this paper, we present a method for Web search result categorization that is based on classification and draws on external information from the ODP. First, we categorize the search results using information from the ODP, and the results categorized this way serve as the training data set. We then build categorizers from the training data with a centroid-based classification algorithm and use them to categorize the remaining uncategorized search results. In experiments, the proposed method achieved high categorization performance compared with an effective ODP classifier from previous work.
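The two-step idea in the abstract can be illustrated with a minimal sketch of a centroid-based classifier. The abstract does not state the paper's features, term weighting, or similarity measure, so this sketch assumes unit-length term-frequency vectors and cosine similarity. The function names, the category labels, and the toy snippets are all hypothetical and stand in for search results that the ODP has already categorized.

```python
import math
from collections import Counter, defaultdict


def vectorize(text):
    """Return a unit-length term-frequency vector as a dict of term -> weight."""
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {t: c / norm for t, c in counts.items()} if norm else {}


def build_centroids(labeled):
    """Average the vectors of each category; `labeled` holds (text, category) pairs."""
    sums = defaultdict(Counter)
    sizes = Counter()
    for text, category in labeled:
        for term, weight in vectorize(text).items():
            sums[category][term] += weight
        sizes[category] += 1
    return {c: {t: w / sizes[c] for t, w in vec.items()} for c, vec in sums.items()}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def classify(text, centroids):
    """Assign the category whose centroid is most similar to the text."""
    vec = vectorize(text)
    return max(centroids, key=lambda c: cosine(vec, centroids[c]))


# Hypothetical training data: snippets whose URLs were already mapped to a category.
training = [
    ("python programming language tutorial", "Computers"),
    ("java programming compiler", "Computers"),
    ("football league match score", "Sports"),
    ("tennis match tournament", "Sports"),
]
centroids = build_centroids(training)

# Results with no ODP mapping are assigned to the nearest centroid.
print(classify("programming tutorial for beginners", centroids))
print(classify("tennis score", centroids))
```

Each category is reduced to a single averaged vector, so classifying an uncategorized result costs one similarity computation per category.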
Keywords :
Internet; data analysis; learning (artificial intelligence); pattern classification; search engines; Web search result categorization; Web taxonomy; centroid-based classification; data set training; open directory project; pattern classification; search engine; Clustering algorithms; Knowledge engineering; Laboratories; Organizing; Search engines; Taxonomy; Training data; Uniform resource locators; Web pages; Web search;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Electrical Engineering/Electronics, Computer, Telecommunications and Information Technology, 2009. ECTI-CON 2009. 6th International Conference on
Conference_Location :
Pattaya, Chonburi
Print_ISBN :
978-1-4244-3387-2
Electronic_ISBN :
978-1-4244-3388-9
Type :
conf
DOI :
10.1109/ECTICON.2009.5137150
Filename :
5137150