DocumentCode
3249847
Title
Mass log data processing and mining based on Hadoop and cloud computing
Author
Yu, Hongyong ; Wang, Deshuai
Author_Institution
State Key Lab. of Software Archit., Neusoft Corp., Shenyang, China
fYear
2012
fDate
14-17 July 2012
Firstpage
197
Lastpage
202
Abstract
With the rapid development of the Internet, SaaS applications delivered as services over the Internet have become an important alternative to traditional software. While using these services, users need real-time usage information, and they also need to extract useful knowledge from it. Data processing and data mining techniques are designed to address these needs, and log data is an effective way to record SaaS usage information in a standard format. However, as the volume of data grows, traditional distributed log data processing systems cannot process the massive log data produced by SaaS applications with millions of users. This paper proposes mass log data processing and data mining methods based on Hadoop to achieve scalability and performance. The model, process, architecture, and implementation of the data processing and mining methods are presented, and the experimental results are shown and analyzed to demonstrate the effectiveness of the methods.
Keywords
cloud computing; data mining; distributed processing; Hadoop computing; Internet; SaaS applications; distributed log data processing systems; mass log data mining; mass log data processing; Algorithm design and analysis; Data processing; Distributed databases; Real time systems; Servers; Hadoop; business intelligence; mass data processing; real time statistics
fLanguage
English
Publisher
IEEE
Conference_Titel
Computer Science & Education (ICCSE), 2012 7th International Conference on
Conference_Location
Melbourne, VIC
Print_ISBN
978-1-4673-0241-8
Type
conf
DOI
10.1109/ICCSE.2012.6295056
Filename
6295056