• DocumentCode
    3223343
  • Title

    Diagnosis, Tuning, and Redesign for Multicore Performance: A Case Study of the Fast Multipole Method

  • Author

    Chandramowlishwaran, Aparna ; Madduri, Kamesh ; Vuduc, Richard

  • Author_Institution
    Sch. of Comput. Sci. & Eng., Georgia Inst. of Technol., Atlanta, GA, USA
  • fYear
    2010
  • fDate
    13-19 Nov. 2010
  • Firstpage
    1
  • Lastpage
    12
  • Abstract
    Given a program and a multisocket, multicore system, what is the process by which one understands and improves its performance and scalability? We describe an approach in the context of improving within-node scalability of the fast multipole method (FMM). Our process consists of a systematic sequence of modeling, analysis, and tuning steps, beginning with simple models, and gradually increasing their complexity in the quest for deeper performance understanding and better scalability. For the FMM, we significantly improve within-node scalability; for example, on a quad-socket Intel Nehalem-EX system, we show speedups of 1.7× over the previous best multithreaded implementation, 19.3× over a sequential but highly tuned (e.g., SIMD-vectorized) code, and match or outperform a state-of-the-art GPGPU implementation. Our study sheds new light on the form of a more general performance analysis and tuning process that other multicore/manycore tuning practitioners (end-user programmers) and automated performance analysis and tuning tools could themselves apply.
  • Keywords
    boundary-elements methods; multiprocessing systems; performance evaluation; FMM; fast multipole method; multicore performance; multicore system; Instruction sets; Multicore processing; Optimization; Scalability; Sockets; Tuning;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    High Performance Computing, Networking, Storage and Analysis (SC), 2010 International Conference for
  • Conference_Location
    New Orleans, LA
  • Print_ISBN
    978-1-4244-7557-5
  • Electronic_ISBN
    978-1-4244-7558-2
  • Type

    conf

  • DOI
    10.1109/SC.2010.19
  • Filename
    5644891