• DocumentCode
    3259554
  • Title
    Tiny GPU Cluster for Big Spatial Data: A Preliminary Performance Evaluation
  • Author
    Jianting Zhang; Simin You; Le Gruenwald
  • Author_Institution
    Dept. of Comput. Sci., City Coll. of New York, New York, NY, USA
  • fYear
    2015
  • fDate
    June 29 2015-July 2 2015
  • Firstpage
    142
  • Lastpage
    147
  • Abstract
    GPU-equipped computing nodes have a much higher ratio of floating-point computing power (on the order of TFlops and growing fast) to network bandwidth (on the order of Gbps and remaining stable) than the regular computing nodes that Hadoop-based systems target. This gap makes efficient and scalable processing of large-scale data challenging, especially for geo-referenced spatial (or geospatial) data, whose processing is both data intensive and compute intensive. We aim to develop a tiny GPU cluster using Nvidia Tegra K1 (TK1) System on Chip (SoC) boards as a downscaled, low-cost GPU cluster for Big (Spatial) Data research. The tiny GPU cluster is equipped with a standard Gigabit Ethernet network but has much less computing power and a much smaller energy footprint than a regular GPU cluster, so it represents a new platform with a more balanced compute-to-communication ratio. We have ported to the tiny GPU cluster both our single-node techniques for spatial joins based on point-in-polygon tests and the lightweight distributed execution engine originally developed for regular clusters. We evaluate its performance on two real-world geospatial applications under various settings, and the experimental results demonstrate good scalability. A preliminary analysis of the scaling effect between the tiny cluster and a regular Amazon EC2 cluster, using a simplified model, suggests that the ARM-based CPU of the TK1 board is likely to achieve better energy efficiency, while the Nvidia GPU of the TK1 board might be less efficient than desktop/server-grade GPUs, in both standalone and 4-node cluster settings.
  • Keywords
    Big Data; graphics processing units; local area networks; parallel processing; pattern clustering; system-on-chip; visual databases; 4-node cluster settings; ARM-based CPU; Hadoop-based systems; Nvidia GPU; Nvidia Tegra K1; SoC boards; TFlops; TK1 board; big spatial data; communication ratio; computing nodes; desktop grade; energy efficiency; energy footprint; floating point computing power; geo-referenced spatial data; geospatial data; large-scale data challenging; lightweight distributed execution engine; network bandwidth; point-in-polygon test based spatial joins; power footprint; preliminary performance evaluation; real world geospatial applications; regular Amazon EC2 cluster; server grade; standalone cluster settings; standard gigabyte Ethernet network; system on chip boards; tiny GPU cluster; Big data; Distributed databases; Engines; Geospatial analysis; Graphics processing units; Spatial databases; System-on-chip; Distributed Computing; GPU; Lightweight;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Distributed Computing Systems Workshops (ICDCSW), 2015 IEEE 35th International Conference on
  • Conference_Location
    Columbus, OH
  • Type
    conf
  • DOI
    10.1109/ICDCSW.2015.33
  • Filename
    7165097