Regional Information Center for Science and Technology - Run-time optimization of sends, receives and file I/O

DocumentCode :

2721341

Title :

Run-time optimization of sends, receives and file I/O

Author :

Natvig, Thorvald ; Elster, Anne C.

Author_Institution :

Norwegian Univ. of Sci. & Technol. (NTNU), Trondheim, Norway

fYear :

2010

fDate :

20-24 Sept. 2010

Firstpage :

Lastpage :

Abstract :

On today's commodity clusters, achieving near-optimal speedup is very hard. We have previously shown that parallel applications using synchronous sequential MPI calls can be optimized by runtime replacement with corresponding asynchronous MPI operations. This may be achieved by protecting memory used by the MPI call from application writes until the operation is complete. However, changing the protection bits of memory pages, which alters the page table and flushes the CPU caches, may introduce a significant overhead. In the case of overlapping requests, which are common for applications with domain decompositions using border exchanges of ghost cells, this would have to be done multiple times for each communications phase. In this paper, we overcome this problem by mapping the same memory twice into the virtual address space, but with different protection bits set for each view. This allows us to optimize overlapping MPI requests without changing the page protection bits until all the requests are finished. The same method is also extended to cover basic file I/O, overlapping file operations with computation without rewriting the original application. We also add distributed locking to I/O operations, which allows aggressive read-ahead and write-merging for parallel applications, reducing the wall-clock time of the file I/O phases that commonly surround the computation of the solution.
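
The double-mapping idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it only shows, under stated assumptions, how one region of memory can appear at two virtual addresses with different protections. The helper name `map_twice`, the use of a temporary file as backing store, and the assignment of a read-only view to the application and a writable view to the communication layer are all assumptions made for illustration.

```python
import mmap
import tempfile


def map_twice(size):
    """Map the same backing pages twice: one writable view, one read-only view.

    Hypothetical helper for illustration; the backing store here is an
    anonymous temporary file, which is an assumption of this sketch.
    """
    backing = tempfile.TemporaryFile()
    backing.truncate(size)  # give the file a length so it can be mapped
    # Writable shared view (assumed here to be the communication layer's view).
    rw_view = mmap.mmap(backing.fileno(), size, access=mmap.ACCESS_WRITE)
    # Read-only view of the same pages (assumed to be the application's view
    # while a request is still in flight).
    ro_view = mmap.mmap(backing.fileno(), size, access=mmap.ACCESS_READ)
    return rw_view, ro_view, backing


def demo():
    rw_view, ro_view, backing = map_twice(mmap.PAGESIZE)
    # Data written through the writable view is visible through the other view,
    # because both views refer to the same underlying pages.
    rw_view[0:5] = b"ghost"
    seen = bytes(ro_view[0:5])
    # Writing through the read-only view is rejected.
    try:
        ro_view[0:1] = b"x"
        rejected = False
    except TypeError:
        rejected = True
    rw_view.close()
    ro_view.close()
    backing.close()
    return seen, rejected
```

Because each view keeps its own protection, neither has to be re-protected between overlapping requests, which is the cost the abstract aims to avoid. A native implementation would presumably call `mmap` directly on a shared memory object and handle faults on the protected view; that part is outside this sketch.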

Keywords :

application program interfaces; message passing; parallel processing; CPU cache; asynchronous MPI operation; file I/O; message passing interface; overlapping MPI request; overlapping file operation; parallel application; read-ahead; run-time optimization; runtime replacement; synchronous sequential MPI call; virtual address space; write-merging; Asynchronous communication; Instruction sets; Kernel; Layout; Optimization; Resource management;

fLanguage :

English

Publisher :

IEEE

Conference_Titel :

Cluster Computing Workshops and Posters (CLUSTER WORKSHOPS), 2010 IEEE International Conference on

Conference_Location :

Heraklion, Crete

Print_ISBN :

978-1-4244-8395-2

Electronic_ISBN :

978-1-4244-8397-6

Type :

conf

DOI :

10.1109/CLUSTERWKSP.2010.5613104

Filename :

5613104

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2721341