Title :
SAKU: A distributed system for data analysis in large-scale dataset based on cloud computing
Author :
Lei Qin ; Bin Wu ; Qing Ke ; Yuxiao Dong
Author_Institution :
Sch. of Comput. Sci., Beijing Univ. of Posts & Telecommun., Beijing, China
Abstract :
Data analysis is widely used in enterprises for its efficiency and accuracy, especially in the telecommunication industry, for tasks such as user behavior analysis and customer churn prediction. However, with the exponential growth of data, traditional data analysis tools cannot handle such large-scale datasets. Furthermore, as business becomes more complicated, there is a growing need to integrate different data analysis tools. In addition, traditional analysis tools lack visualization, which makes their results hard to understand. We propose a distributed system named SAKU to address these problems. In this paper, we implement several algorithms on the MapReduce framework in order to process large-scale data, and we discuss each part of the system. We also present a new report framework based on cloud computing for visualizing large-scale data. Most importantly, we apply the system to a scenario that meets real-world requirements, using a large volume of data obtained from telecom operators, which demonstrates the high efficiency and scalability of the system.
Keywords :
cloud computing; data analysis; data visualisation; distributed databases; very large databases; SAKU; business; data analysis tools; data visualization; distributed system; large-scale dataset; mapreduce; telecom operators; telecommunication industry; Algorithm design and analysis; Clustering algorithms; Data mining; Telecommunications; report
Conference_Title :
Fuzzy Systems and Knowledge Discovery (FSKD), 2011 Eighth International Conference on
Conference_Location :
Shanghai
Print_ISBN :
978-1-61284-180-9
DOI :
10.1109/FSKD.2011.6019711