Title : 
Model-Guided Empirical Optimization for Multimedia Extension Architectures: A Case Study
         
        
            Author : 
Chen, Chun ; Shin, Jaewook ; Kintali, Shiva ; Chame, Jacqueline ; Hall, Mary
         
        
            Author_Institution : 
Information Sciences Institute, University of Southern California, Marina del Rey, CA
         
        
        
        
        
        
            Abstract : 
Compiler technology for multimedia extensions must effectively utilize not only the SIMD compute engines but also the various levels of the memory hierarchy: superword registers, multi-level caches, and the TLB. In this paper, we describe a compiler that combines optimization across all levels of the memory hierarchy with automatic generation of SIMD code for multimedia extensions. At the high level, model-guided empirical optimization is used to transform code to optimize for all levels of the memory hierarchy. This high-level compiler interacts with a backend compiler that exploits superword-level parallelism, taking sequential code as input and producing SIMD code. This paper discusses how we have combined these technologies into a single framework. Through a case study with matrix multiply, we observe performance that exceeds that of the hand-tuned Intel MKL library, is within 4% of the ATLAS self-tuning library with architectural defaults, and is more than 4X faster than code produced by the native Intel compiler.
         
        
            Keywords : 
cache storage; multimedia systems; optimising compilers; parallel architectures; ATLAS self-tuning library; Intel compiler; SIMD compute engine; automatic SIMD code generation; cache memory hierarchy; code transformation; hand-tuned Intel MKL library; matrix multiplication; model-guided empirical optimization; multilevel cache; multimedia extension architecture; optimising compiler; superword register; Aggregates; Concurrent computing; Engines; Libraries; Marine technology; Multimedia computing; Optimizing compilers; Parallel processing; Registers; Space technology;
         
        
        
        
            Conference_Title : 
Parallel and Distributed Processing Symposium, 2007. IPDPS 2007. IEEE International
         
        
            Conference_Location : 
Long Beach, CA
         
        
            Print_ISBN : 
1-4244-0910-1
         
        
            Electronic_ISBN : 
1-4244-0910-1
         
        
        
            DOI : 
10.1109/IPDPS.2007.370641