Title : 
Exploiting Parallelism for Energy Efficient Source Code High Performance Computing
         
        
            Author : 
Azeemi, Naeem Zafar
         
        
            Author_Institution : 
Christian Doppler Laboratory for Design Methodology of Signal Processing Algorithms, Institute of Communications and Radio Frequency Engineering, University of Technology Vienna, Gusshausstrasse 25/389, A-1040 Vienna, Austria. Email: nzafar@nt.tuwien.ac.at
         
        
        
        
        
        
            Abstract : 
Architectures are becoming increasingly difficult to utilize fully. The growing trend toward complex single-chip embedded systems with multiple peripherals has heightened the importance of energy consumption in compute- and data-intensive applications. Our experiments show that although high-performance applications tend to be more cycle efficient, their energy efficiency is reduced by many factors, such as suboptimal architecture utilization and poor compilation optimization. Our methodology exploits the parallelism inherent in multimedia DSP applications as well as in multimedia DSP processors. Our proposed techniques include a profile-based compilation approach that makes source-to-source transformation more energy efficient. The profile monitor identifies application expression slacks with respect to the underlying hardware architecture, so that different transformation schemes can be applied selectively according to the observed static and runtime profile, and unnecessary optimization iterations can be filtered out. We also propose a stochastic filtering technique to further reduce the optimization search space, and hence the offline compilation overhead caused by the very large number of compiler optimization options. Our experiments show that the proposed techniques increase parallelism by close to 51% for the Viterbi decoder, 79% for MPEG-2, 32% for H-263, and 84% for MPEG-4 without losing performance benefits.
         
        
            Keywords : 
digital signal processing chips; embedded systems; parallel architectures; source coding; stochastic processes; H-263; MPEG-2; MPEG-4; Viterbi decoder; compiler optimization; energy efficient source code; high performance computing; multimedia DSP applications; multimedia DSP processors; multiple peripherals; single chip complex embedded system; source-to-source transformation; stochastic filtering technique; Computer architecture; Computer peripherals; Digital signal processing; Embedded computing; Embedded system; Energy efficiency; High performance computing; Monitoring; Optimizing compilers; Parallel processing; Embedded Applications; Genetic Algorithm; HPC (High Performance Computing); Low Energy; Source-to-Source Transformation (StS);
         
        
        
        
            Conference_Title : 
Industrial Technology, 2006. ICIT 2006. IEEE International Conference on
         
        
            Conference_Location : 
Mumbai
         
        
            Print_ISBN : 
1-4244-0726-5
         
        
            Electronic_ISBN : 
1-4244-0726-5
         
        
        
            DOI : 
10.1109/ICIT.2006.372685