Title :
Discovery and Analysis of Usage Data Based on Hadoop for Personalized Information Access
Author :
Dawen Xia ; Zhuobo Rong ; Yanhui Zhou ; Binfeng Wang ; Yantao Li ; Zili Zhang
Author_Institution :
Sch. of Comput. & Inf. Sci., Southwest Univ., Chongqing, China
Abstract :
The discovery and analysis of valuable information hidden in usage data is becoming increasingly important for offering personalized information access as the number of Web users grows exponentially. Because traditional methods cannot effectively mine semi-structured and/or unstructured data on a single platform, we propose three MapReduce-based methods, run on a Hadoop cluster, for mining user browsing preference, visiting frequency, and participating characteristics, respectively. We apply these methods to Web server logs and developer mailing lists, and we visualize the mining results to gain a deeper understanding of user access patterns and interactive behaviors. The experimental results show that our methods yield useful insights from usage data for decision making, with good speedup and scalability.
Keywords :
Internet; data loggers; data mining; data visualisation; information retrieval; Developer mailing lists; Hadoop cluster; MapReduce; Web server logs; Web user exponential growth; decision making; interactive behaviors; participating characteristics; personalized information access; usage data analysis; usage data discovery; user access patterns; user browsing preference mining; valuable information analysis; valuable information discovery; visiting frequency; visualization; Data mining; Electronic mail; NASA; Scalability; Visualization; Web servers; Web sites; big data analytics; data visualization; hadoop mapreduce; web usage mining;
Conference_Title :
Computational Science and Engineering (CSE), 2013 IEEE 16th International Conference on
Conference_Location :
Sydney, NSW
DOI :
10.1109/CSE.2013.137