• DocumentCode
    598575
  • Title

    Design and analysis of data management in scalable parallel scripting

  • Author

    Zhang, Zhao; Katz, Daniel S.; Wozniak, Justin M.; Espinosa, Antonio; Foster, Ian

  • Author_Institution
    Dept. of Comput. Sci., Univ. of Chicago, Chicago, IL, USA
  • fYear
    2012
  • fDate
    10-16 Nov. 2012
  • Firstpage
    1
  • Lastpage
    11
  • Abstract
    We seek to enable efficient large-scale parallel execution of applications in which a shared filesystem abstraction is used to couple many tasks. Such parallel scripting (many-task computing, MTC) applications suffer poor performance and utilization on large parallel computers because of the volume of filesystem I/O and a lack of appropriate optimizations in the shared filesystem. Thus, we design and implement a scalable MTC data management system that uses aggregated compute node local storage for more efficient data movement strategies. We co-design the data management system with the data-aware scheduler to enable dataflow pattern identification and automatic optimization. The framework reduces the time to solution of parallel stages of an astronomy data analysis application, Montage, by 83.2% on 512 cores; decreases the time to solution of a seismology application, CyberShake, by 7.9% on 2,048 cores; and delivers BLAST performance better than mpiBLAST at various scales up to 32,768 cores, while preserving the flexibility of the original BLAST application.
  • Keywords
    astronomy computing; data analysis; file organisation; optimisation; parallel processing; astronomy data analysis application; automatic optimization; data aware scheduler; data management analysis; data movement strategies; dataflow pattern identification; filesystem I/O; filesystem abstraction; parallel computers; scalable parallel scripting; Computer architecture; Computers; Databases; Engines; Optimization; Runtime; Servers;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    High Performance Computing, Networking, Storage and Analysis (SC), 2012 International Conference for
  • Conference_Location
    Salt Lake City, UT
  • ISSN
    2167-4329
  • Print_ISBN
    978-1-4673-0805-2
  • Type

    conf

  • DOI
    10.1109/SC.2012.44
  • Filename
    6468455