DocumentCode :
3016269
Title :
Rough set based clustering in dense web domain
Author :
Mishra, Ravishankar ; Kumar, Pranaw ; Bhasker, B.
Author_Institution :
Syst. Dept., Indian Inst. of Manage. IT, Lucknow, India
fYear :
2012
fDate :
27-29 Nov. 2012
Firstpage :
521
Lastpage :
526
Abstract :
Clustering is a widely used technique in data mining applications. It groups objects on the basis of the similarity among them. The Web has grown enormously in the past few years, resulting in a sharp increase in the number of web users and web pages. Web personalization has become a challenging task for e-Commerce companies because of information overload on the Web and the growing number of web users. To make personalization effective, web users are matched with the available information. Web usage data coming from a single domain tends to be dense, since many web users fetch pages from the same domain or application area. This scenario is prevalent on e-Commerce websites. Rough set theory is a soft computing technique that is efficient in dealing with ambiguities present in data. In this paper we use rough set based clustering with similarity upper approximation to derive the clusters. The clusters evolve in steps and finally converge into a well-defined clustering scheme. Developers are trying to customize web sites to the needs of specific users with the help of knowledge acquired from users' navigational behaviour. Since user page visits are intrinsically sequential, efficient clustering algorithms with a suitable distance/similarity measure for sequential data are needed. In the current paper, we demonstrate the clustering task for sequence data (web page visits) in three ways, namely capturing content information, sequence information, and a combination of both. Experimental results suggest that the measure which captures both content and sequence forms compact clusters, thus putting web users with similar interests in one group.
Keywords :
Web sites; approximation theory; data mining; electronic commerce; pattern clustering; rough set theory; Web pages; Web personalization; Web usage data; Web users; content information; data mining applications; dense Web domain; distance measure; e-commerce Web sites; e-commerce-based companies; information overload; knowledge acquisition; page fetching; rough set-based clustering; sequence information; sequential data; similarity measure; similarity upper approximation; soft computing technique; user navigational behaviour; user page visits; Decision support systems; Helium; Intelligent systems; Manganese; Clustering; Rough Sets; Web data;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Intelligent Systems Design and Applications (ISDA), 2012 12th International Conference on
Conference_Location :
Kochi
ISSN :
2164-7143
Print_ISBN :
978-1-4673-5117-1
Type :
conf
DOI :
10.1109/ISDA.2012.6416592
Filename :
6416592