• DocumentCode
    3198367
  • Title
    High-Performance Graph Analytics on Manycore Processors
  • Author
    Slota, George M. ; Rajamanickam, Sivasankaran ; Madduri, Kamesh
  • Author_Institution
    Comput. Sci. & Eng., Pennsylvania State Univ., University Park, PA, USA
  • fYear
    2015
  • fDate
    25-29 May 2015
  • Firstpage
    17
  • Lastpage
    27
  • Abstract
    The divergence in the computer architecture landscape has resulted in different architectures being considered mainstream at the same time. For application and algorithm developers, a dilemma arises when one must focus on using underlying architectural features to extract the best performance on each of these architectures, while writing portable code at the same time. We focus on this problem with graph analytics as our target application domain. In this paper, we present an abstraction-based methodology for performance-portable graph algorithm design on manycore architectures. We demonstrate our approach by systematically optimizing algorithms for the problems of breadth-first search, color propagation, and strongly connected components. We use Kokkos, a manycore library and programming model, for prototyping our algorithms. Our portable implementation of the strongly connected components algorithm on the NVIDIA Tesla K40M is up to 3.25× faster than a state-of-the-art parallel CPU implementation on a dual-socket Sandy Bridge compute node.
  • Keywords
    feature extraction; graph theory; multiprocessing systems; parallel architectures; tree searching; Kokkos manycore library; NVIDIA Tesla K40M; abstraction-based methodology; architectural feature extraction; breadth-first search problems; color propagation; computer architecture landscape; dual-socket Sandy Bridge compute node; high-performance graph analytics; manycore architectures; manycore processors; optimizing algorithms; parallel CPU; performance-portable graph algorithm design; portable code writing; programming model; strongly connected components algorithm; Arrays; Color; Instruction sets; Optimization; Parallel processing; Silicon; Synchronization; BFS; GPU; color propagation; graph computations; parallel performance; portability;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Parallel and Distributed Processing Symposium (IPDPS), 2015 IEEE International
  • Conference_Location
    Hyderabad
  • ISSN
    1530-2075
  • Type
    conf
  • DOI
    10.1109/IPDPS.2015.54
  • Filename
    7161272