Regional Information Center for Science and Technology (RICEST) - Hierarchical tiling for improved superscalar performance

DocumentCode :

2634248

Title :

Hierarchical tiling for improved superscalar performance

Author :

Carter, Larry ; Ferrante, Jeanne ; Hummel, Susan Flynn

Author_Institution :

Dept. of Comput. Sci. & Eng., California Univ., San Diego, La Jolla, CA, USA

fYear :

1995

fDate :

25-28 Apr 1995

Firstpage :

239

Lastpage :

245

Abstract :

It takes more than a good algorithm to achieve high performance: inner-loop performance and data locality are also important. Tiling is a well-known method for parallelization and for improving data locality. However, tiling has the potential of being even more beneficial. At the finest granularity, it can be used to guide register allocation and instruction scheduling; at the coarsest level, it can help manage magnetic storage media. It also can be useful in overlapping data movement with computation, for instance by prefetching data from archival storage, disks and main memory into cache and registers, or by choreographing data movement between processors. Hierarchical tiling is a framework for applying both known tiling methods and new techniques to an expanded set of uses. It eases the burden on several compiler phases that are traditionally treated separately, such as scalar replacement, register allocation, generation of message passing calls, and storage mapping. By explicitly naming and copying data, it takes control of the mapping of data to memory and of the movement of data between processing elements and up and down the memory hierarchy. This paper focuses on using hierarchical tiling to exploit superscalar pipelined processors. On a simple example, it improves performance by a factor of 3, achieving perfect use of the superscalar processor's pipeline. Hierarchical tiling is presented here as a method of hand-tuning performance; while outside the scope of this paper, the ideas can be incorporated into an automatic preprocessor or optimizing compiler.

Keywords :

message passing; parallel processing; performance evaluation; archival storage; automatic preprocessor; compiler phases; data locality; hierarchical tiling; inner-loop performance; instruction scheduling; message passing; optimizing compiler; parallelization; register allocation; scalar replacement; storage mapping; superscalar performance; superscalar pipelined processors; Cache storage; Computer science; Drives; Message passing; Parallel processing; Pipelines; Prefetching; Registers; Tiles; Zinc;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Parallel Processing Symposium, 1995. Proceedings., 9th International

Conference_Location :

Santa Barbara, CA

Print_ISBN :

0-8186-7074-6

Type :

conf

DOI :

10.1109/IPPS.1995.395939

Filename :

395939

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2634248