• DocumentCode
    3474270
  • Title
    An adaptive machine learning on Map-Reduce framework for improving performance of large-scale data analysis on EC2
  • Author
    Romsaiyud, Walisa; Premchaiswadi, Wichian
  • Author_Institution
    Grad. Sch. of Inf. Technol., Siam Univ., Bangkok, Thailand
  • fYear
    2013
  • fDate
    20-22 Nov. 2013
  • Firstpage
    1
  • Lastpage
    7
  • Abstract
    Map-Reduce is a programming model for writing applications that rapidly process vast amounts of data in parallel on large clusters of computer nodes, and it can be deployed on cloud computing platforms. However, running a Map-Reduce job requires setting many configuration parameters to tune and improve performance, such as the number of running mappers and the maximum number of reduce slots in the cluster, in order to minimize the data transferred between map and reduce tasks. Put simply, the main emphasis is on reducing job execution time, together with shuffle tweaks that tune the parameters for memory management. In this paper, we introduce a machine learning model on top of Map-Reduce for automated setting of tuning parameters for Map-Reduce programs. Our model consists of three main steps: 1) describe the baseline plan marked for verification, 2) propose an ML algorithm for learning and predicting the model, and 3) develop an automated method to run the program automatically at a specific time. In our experiments, we run Hadoop on a 20-node cluster on EC2.
  • Keywords
    cloud computing; data analysis; learning (artificial intelligence); parallel programming; EC2; Hadoop; ML algorithm; MapReduce framework; adaptive machine learning; computer nodes; configuration parameters; job execution time reduction; large-scale data analysis; memory management; reduce slots; running mappers; tuning parameters; Computational modeling; File systems; Logistics; Machine learning algorithms; Optimization; Runtime; Tuning; Amazon Web Services (AWS); Job Scheduling; Machine Learning; Map-Reduce; Performance Tuning
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    ICT and Knowledge Engineering (ICT&KE), 2013 11th International Conference on
  • Conference_Location
    Bangkok
  • ISSN
    2157-0981
  • Print_ISBN
    978-1-4799-2294-9
  • Type
    conf
  • DOI
    10.1109/ICTKE.2013.6756290
  • Filename
    6756290