Title :
Web Information Organization Using Keyword Distillation Based Clustering
Author :
Shibata, Tomohide ; Bamba, Yasuo ; Shinzato, Keiji ; Kurohashi, Sadao
Abstract :
This paper describes a system that conducts search result clustering for several thousand Web pages, and elaborates cluster labels through keyword distillation. Keyword distillation is a method that properly handles spelling variations, transliterations, synonyms, inclusion relations and word ambiguity, using linguistic resources and the context of a user's query. The system provides a clustering result from 1,000 pages in less than one minute by taking advantage of a search engine infrastructure and a grid computing environment. Experimental results show that the system correctly merges synonymous keywords and is useful for finding topics hidden in the lower-ranked pages of a search result.
Keywords :
Clustering methods; Conferences; Data mining; Educational products; Grid computing; Intelligent agent; Navigation; Search engines; Web pages; clustering; keyword unification; open search engine
Conference_Title :
Web Intelligence and Intelligent Agent Technologies, 2009. WI-IAT '09. IEEE/WIC/ACM International Joint Conferences on
Conference_Location :
Milan, Italy
Print_ISBN :
978-0-7695-3801-3
Electronic_ISBN :
978-1-4244-5331-3
DOI :
10.1109/WI-IAT.2009.57