Title :
Web Objects Clustering Using Transaction Log
Author :
Rongfei, Jia ; Maozhong, Jin ; Xiaobo, Wang
Author_Institution :
Sch. of Comput. Sci. & Technol., Beijing Univ. of Aeronaut. & Astronaut., Beijing, China
Abstract :
In this paper, we present a novel method for clustering web objects. Most existing methods are insufficient for discovering similar objects because the underlying data (object attributes, click-through data, and link data) are often sparse, scarce, or difficult to obtain. In contrast, we exploit the transaction log, which is more common and denser, but also noisier. To reduce the influence of noise, we calculate similarity in two steps. First, we use a basic similarity to discover each object's neighbors; each object is then represented by a vector consisting of its neighbors. Second, the cosine similarity of the object vectors is calculated for clustering. Experiments on synthetic data show that our method is robust against noise: on noisy data, we increase precision by 10%. Finally, we show real clustering results on a movie dataset, achieving a coverage of 76% and a precision of 60%.
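The two-step similarity described in the abstract can be sketched roughly as follows. This is an illustrative reading, not the authors' implementation: the abstract does not specify the basic similarity, so Jaccard overlap of transaction sets is assumed here, and the neighborhood size `k`, the self-weight of 1.0, the function names, and the toy `transactions` list are all hypothetical.

```python
from collections import defaultdict
from math import sqrt


def basic_similarity(transactions):
    """Step 1 (assumed form): Jaccard overlap of the sets of
    transactions in which two objects appear."""
    occurs = defaultdict(set)
    for tid, items in enumerate(transactions):
        for obj in items:
            occurs[obj].add(tid)
    objs = sorted(occurs)
    sim = defaultdict(dict)
    for i, a in enumerate(objs):
        for b in objs[i + 1:]:
            shared = len(occurs[a] & occurs[b])
            if shared:
                score = shared / len(occurs[a] | occurs[b])
                sim[a][b] = score
                sim[b][a] = score
    return sim, objs


def neighbor_vectors(sim, objs, k=2):
    """Represent each object by its top-k neighbors under the basic
    similarity. Keeping only k neighbors is an assumption; it is what
    drops weak, noise-induced links in this sketch."""
    vectors = {}
    for obj in objs:
        ranked = sorted(sim[obj].items(), key=lambda kv: (-kv[1], kv[0]))
        vec = dict(ranked[:k])
        vec[obj] = 1.0  # assumption: an object is its own neighbor
        vectors[obj] = vec
    return vectors


def cosine(u, v):
    """Step 2: cosine similarity of two sparse neighbor vectors."""
    dot = sum(w * v[key] for key, w in u.items() if key in v)
    norm_u = sqrt(sum(w * w for w in u.values()))
    norm_v = sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Hypothetical toy log: two natural groups plus one noisy transaction.
transactions = [
    {"A", "B"}, {"A", "B", "C"}, {"A", "B"},
    {"X", "Y"}, {"X", "Y", "Z"},
    {"A", "X"},  # noise linking the two groups
]
sim, objs = basic_similarity(transactions)
vectors = neighbor_vectors(sim, objs, k=2)
print(sim["A"]["X"], cosine(vectors["A"], vectors["X"]))
print(sim["A"]["B"], cosine(vectors["A"], vectors["B"]))
```

In this toy log the noisy transaction gives A and X a nonzero basic similarity, but their neighbor vectors share no entries, so the second-step cosine similarity separates them. The resulting pairwise cosine values could then be passed to any standard clustering procedure; the abstract does not say which one is used.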
Keywords :
Web services; data analysis; data mining; interference suppression; pattern clustering; click-through data; cosine similarity; link data; movie dataset; noise reduction; transaction log; web objects clustering; Clustering methods; Computer science; Data mining; Feature extraction; Information filtering; Information filters; Motion pictures; Noise reduction; Noise robustness; Space technology; clustering; log mining; query similarity;
Conference_Title :
Knowledge Discovery and Data Mining, 2010. WKDD '10. Third International Conference on
Print_ISBN :
978-1-4244-5397-9
Electronic_ISBN :
978-1-4244-5398-6
DOI :
10.1109/WKDD.2010.69