• DocumentCode
    243146
  • Title

    Optimizing performance and power consumption for an ARM-based big data cluster

  • Author

    Kaewkasi, Chanwit ; Srisuruk, Wichai

  • Author_Institution
    Sch. of Comput. Eng., Suranaree Univ. of Technol., Nakhon Ratchasima, Thailand
  • fYear
    2014
  • fDate
    22-25 Oct. 2014
  • Firstpage
    1
  • Lastpage
    6
  • Abstract
    Hadoop is a de facto platform for processing both semi-structured and unstructured data. To save cost, corporations usually run Hadoop instances on the public cloud. Unfortunately, security exploits today spread easily and widely enough to put the public cloud in danger. Running corporate-owned Hadoop clusters is more feasible; however, operating private data centers is costly. An alternative is to build Hadoop clusters from low-cost system-on-chip boards. Hadoop on a cluster made of ARM system-on-chip boards has not been widely studied. Several previous works reported that they were not able to run Hadoop properly on these limited devices. Recently, one work successfully processed a non-trivial amount of data, 34 GB, with Hadoop on an ARM cluster in acceptable time. This work further explores opportunities to tune the performance and study the power consumption of a 22-node ARM-based cluster. The whole architecture of the software stack, including the runtime, data integrity verification, and data compression, is studied and improved. The work reported in this paper achieves a processing rate of almost 0.9 GB/min and processes the same benchmarks as the previous work in roughly 38 minutes.
  • Keywords
    Big Data; multiprocessing systems; parallel processing; power consumption; system-on-chip; 22-node ARM-based cluster; ARM cluster; ARM system-on-chip boards; ARM-based Big Data cluster; Hadoop clusters; data compression; data integrity verification; low-cost system-on-chip boards; performance optimization; semistructured data processing; software stack architecture; unstructured data processing; Hadoop; cluster
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    TENCON 2014 - 2014 IEEE Region 10 Conference
  • Conference_Location
    Bangkok
  • ISSN
    2159-3442
  • Print_ISBN
    978-1-4799-4076-9
  • Type

    conf

  • DOI
    10.1109/TENCON.2014.7022399
  • Filename
    7022399