DocumentCode :
610423
Title :
Very fast estimation for result and accuracy of big data analytics: The EARL system
Author :
Laptev, N. ; Kai Zeng ; Zaniolo, Carlo
Author_Institution :
Univ. of California, Los Angeles, Los Angeles, CA, USA
fYear :
2013
fDate :
8-12 April 2013
Firstpage :
1296
Lastpage :
1299
Abstract :
Approximate results based on samples often provide the only way in which advanced analytical applications on very massive data sets (a.k.a. "big data") can satisfy their time and resource constraints. Unfortunately, methods and tools for the computation of accurate early results are currently not supported in big data systems (e.g., Hadoop). Therefore, we propose a nonparametric accuracy estimation method and system to speed up big data analytics. Our framework is called EARL (Early Accurate Result Library) and it works by predicting the learning curve and choosing the appropriate sample size for achieving the desired error bound specified by the user. The error estimates are based on a technique called bootstrapping that has been widely used and validated by statisticians, and can be applied to arbitrary functions and data distributions. Therefore, this demo will elucidate (a) the functionality of EARL and its intuitive GUI interface whereby first-time users can appreciate the accuracy obtainable from increasing sample sizes by simply viewing the learning curve displayed by EARL, (b) the usability of EARL, whereby conference participants can interact with the system to quickly estimate the sample sizes needed to obtain the desired accuracies or response times, and then compare them against the accuracies and response times obtained in the actual computations.
Keywords :
data analysis; graphical user interfaces; statistical analysis; EARL system; advanced analytical applications; arbitrary functions; big data analytics; bootstrapping; data distributions; early accurate result library; intuitive GUI interface; massive data sets; statisticians; Accuracy; Big data; Computational modeling; Data mining; Error analysis; Estimation; Time factors;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Data Engineering (ICDE), 2013 IEEE 29th International Conference on
Conference_Location :
Brisbane, QLD
ISSN :
1063-6382
Print_ISBN :
978-1-4673-4909-3
Electronic_ISBN :
1063-6382
Type :
conf
DOI :
10.1109/ICDE.2013.6544928
Filename :
6544928
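Illustrative sketch :
The abstract describes estimating error by bootstrapping and choosing a sample size that meets a user-specified error bound. The Python sketch below illustrates that general idea only; it is not the EARL implementation. The function names (bootstrap_relative_error, early_result), the parameters, and the simple size-doubling loop are assumptions made for this example. EARL itself is described as predicting the learning curve, which this sketch does not attempt.

```python
import random
import statistics


def bootstrap_relative_error(sample, stat, n_resamples=200, rng=None):
    """Estimate stat(sample) and its relative standard error by bootstrapping.

    Hypothetical helper: resamples with replacement n_resamples times and
    uses the spread of the recomputed statistic as the error estimate.
    """
    rng = rng or random.Random(0)
    n = len(sample)
    estimates = [stat(rng.choices(sample, k=n)) for _ in range(n_resamples)]
    center = stat(sample)
    spread = statistics.stdev(estimates)
    rel_err = spread / abs(center) if center != 0 else float("inf")
    return center, rel_err


def early_result(data, stat, error_bound, start=100, growth=2, rng=None):
    """Grow a random sample until the bootstrap error meets error_bound.

    Assumption: the sample size is simply multiplied by `growth` each round.
    If the bound is never met, the whole data set is used and its estimate
    is returned together with the error that was reached.
    """
    rng = rng or random.Random(0)
    size = min(start, len(data))
    while True:
        sample = rng.sample(data, size)
        estimate, rel_err = bootstrap_relative_error(sample, stat, rng=rng)
        if rel_err <= error_bound or size == len(data):
            return estimate, rel_err, size
        size = min(size * growth, len(data))


if __name__ == "__main__":
    gen = random.Random(1)
    data = [gen.gauss(50, 10) for _ in range(100_000)]
    estimate, rel_err, size = early_result(data, statistics.fmean, 0.01)
    print(f"estimate={estimate:.3f} rel_err={rel_err:.4f} sample_size={size}")
```

Because the bootstrap only needs to recompute the statistic on resamples, the same loop can be tried with other functions (for example statistics.median) without changing the error-estimation code.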