DocumentCode
3717309
Title
Cost and data exploration considerations for big data prediction on the cloud
Author
Chris Tseng;Tien Nguyen;Chetan Sharma
Author_Institution
Computer Science Dept., San Jose State University
fYear
2015
Firstpage
1622
Lastpage
1628
Abstract
Cloud services allow one to perform intensive big data calculations without personally owning a sufficiently powerful machine. Different cloud-based virtual machines, however, offer different processor speeds at different costs, and the most cost-effective machine size is not always obvious. We investigated different virtual machine sizes on the Microsoft Azure cloud service, along with different data exploration methodologies, to solve a big data prediction project using neural networks. We found that higher-end, more expensive virtual machine settings do not always deliver proportionally better performance. Applying a neural network directly to a prediction problem typically runs into a performance bottleneck. We found that learning and prediction can be improved by taking the properties of the data and the nature of the problem into consideration. Some of our data preparation schemes should be useful for general big data prediction problems involving noise or non-uniformly distributed data.
Keywords
"Neural networks","Training","Big data","Virtual machining","Cloud computing","Hardware","Performance analysis"
Publisher
IEEE
Conference_Titel
Big Data (Big Data), 2015 IEEE International Conference on
Type
conf
DOI
10.1109/BigData.2015.7363930
Filename
7363930