DocumentCode :
2634248
Title :
Hierarchical tiling for improved superscalar performance
Author :
Carter, Larry ; Ferrante, Jeanne ; Hummel, Susan Flynn
Author_Institution :
Dept. of Comput. Sci. & Eng., California Univ., San Diego, La Jolla, CA, USA
fYear :
1995
fDate :
25-28 Apr 1995
Firstpage :
239
Lastpage :
245
Abstract :
It takes more than a good algorithm to achieve high performance: inner-loop performance and data locality are also important. Tiling is a well-known method for parallelization and for improving data locality. However, tiling has the potential of being even more beneficial. At the finest granularity, it can be used to guide register allocation and instruction scheduling; at the coarsest level, it can help manage magnetic storage media. It also can be useful in overlapping data movement with computation, for instance by prefetching data from archival storage, disks and main memory into cache and registers, or by choreographing data movement between processors. Hierarchical tiling is a framework for applying both known tiling methods and new techniques to an expanded set of uses. It eases the burden on several compiler phases that are traditionally treated separately, such as scalar replacement, register allocation, generation of message passing calls, and storage mapping. By explicitly naming and copying data, it takes control of the mapping of data to memory and of the movement of data between processing elements and up and down the memory hierarchy. This paper focuses on using hierarchical tiling to exploit superscalar pipelined processors. On a simple example, it improves performance by a factor of 3, achieving perfect use of the superscalar processor's pipeline. Hierarchical tiling is presented here as a method of hand-tuning performance; while outside the scope of this paper, the ideas can be incorporated into an automatic preprocessor or optimizing compiler.
Keywords :
message passing; parallel processing; performance evaluation; archival storage; automatic preprocessor; compiler phases; data locality; hierarchical tiling; inner-loop performance; instruction scheduling; message passing; optimizing compiler; parallelization; register allocation; scalar replacement; storage mapping; superscalar performance; superscalar pipelined processors; Cache storage; Computer science; Drives; Message passing; Parallel processing; Pipelines; Prefetching; Registers; Tiles; Zinc;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Parallel Processing Symposium, 1995. Proceedings., 9th International
Conference_Location :
Santa Barbara, CA
Print_ISBN :
0-8186-7074-6
Type :
conf
DOI :
10.1109/IPPS.1995.395939
Filename :
395939