• DocumentCode
    3717309
  • Title

    Cost and data exploration considerations for big data prediction on the cloud

  • Author

    Chris Tseng;Tien Nguyen;Chetan Sharma

  • Author_Institution
    Computer Science Dept., San Jose State University
  • fYear
    2015
  • Firstpage
    1622
  • Lastpage
    1628
  • Abstract
    Cloud services let one perform intensive big data computations without personally owning a sufficiently powerful machine. Different cloud-based virtual machines, however, offer different processor speeds at different costs, and the most cost-effective machine size is not always obvious. We investigated different virtual machine sizes on the Microsoft Azure cloud service, along with different data exploration methodologies, to solve a big data prediction project using neural networks. We found that higher-end, more expensive virtual machine settings do not always deliver proportionally better performance. Applying a neural network directly to a prediction problem typically runs into a performance bottleneck. Learning and prediction can be improved when data properties and the nature of the problem are taken into consideration. Some of our data preparation schemes will be useful for general big data prediction problems with noisy or non-uniformly distributed data.
  • Keywords
    "Neural networks","Training","Big data","Virtual machining","Cloud computing","Hardware","Performance analysis"
  • Publisher
    IEEE
  • Conference_Titel
    Big Data (Big Data), 2015 IEEE International Conference on
  • Type

    conf

  • DOI
    10.1109/BigData.2015.7363930
  • Filename
    7363930