Title :
Efficient communication using total-exchange
Author :
Rao, Satish ; Suel, Torsten ; Tsantilas, Thanasis ; Goudreau, Mark
Author_Institution :
NEC Res. Inst., Princeton, NJ, USA
Abstract :
A central question in parallel computing is to determine the extent to which one can write parallel programs using a high-level, general-purpose, and architecture-independent programming language and have them executed on a variety of parallel and distributed architectures without sacrificing efficiency. A large body of research suggests that, at least in theory, general-purpose parallel computing is indeed possible provided certain conditions are met: an excess of logical parallelism in the program, and the ability of the target architecture to efficiently realize balanced communication patterns. The canonical example of a balanced communication pattern is an h-relation, in which each processor is the origin and destination of at most h messages. A plethora of protocols has been designed for routing h-relations in a variety of networks. The goal has been to minimize the value of h while guaranteeing delivery of the messages within a time constant factor from optimal. In this paper we describe protocols that meet the most stringent efficiency requirement, namely delivery of messages within time that is a lower order additive term from the best achievable. Such protocols are called 1-optimal. While these protocols achieve 1-optimality only for heavily loaded networks, that is, for large values of h, they are remarkable for their simplicity in that they only use the total-exchange communication primitive. The total-exchange can be realized in many networks using very simple, contention-free, and extremely efficient schemes. The technical contribution of this paper is a protocol to route random h-relations in an N-processor network using h/N(1+o(1))+O(log log N) total-exchange rounds with high probability. Using message duplication, we can improve the bound to h/N(1+o(1))+O(log*N). This improves upon the h/N(1+o(1))+O(log N) bound of Gerbessiotis and Valiant. While our theoretical improvements are modest, our experimental results show an improvement over the protocol of A. Gerebessiotis and L.G. Valiant
Keywords :
parallel processing; parallel programming; protocols; N-processor network; architecture-independent programming language; balanced communication patterns; h-relation; logical parallelism; parallel computing; parallel programs; protocols; Computer architecture; Computer languages; Computer science; Concurrent computing; Context; National electric code; Optical computing; Parallel processing; Phase change random access memory; Routing protocols;
Conference_Titel :
Parallel Processing Symposium, 1995. Proceedings., 9th International
Conference_Location :
Santa Barbara, CA
Print_ISBN :
0-8186-7074-6
DOI :
10.1109/IPPS.1995.395984