Title : 
Exploiting program branch probabilities in hardware compilation
         
        
            Author : 
Styles, Henry ; Luk, Wayne
         
        
            Author_Institution : 
Dept. of Comput., Imperial Coll., London, UK
         
        
        
        
        
        
        
            Abstract : 
This paper explores using information about program branch probabilities to optimize the results of hardware compilation. The basic premise is to promote utilization by dedicating more resources to branches which execute more frequently. We present a new hardware compilation and flow control scheme that enables the computation rate of different branches to be matched to the observed branch probabilities. We propose an analytical queuing network performance model to determine the optimal settings for basic block computation rates given a set of observed branch probabilities. An experimental hardware compilation system has been developed to evaluate this approach. The branch optimization design space is characterized in an experimental study for Xilinx Virtex FPGAs of two complex applications: video feature extraction and progressive refinement radiosity. For designs of equal performance, branch-optimized designs require 24 percent and 27.5 percent less area. For designs of equal area, branch-optimized designs run up to three times faster. Our analytical performance model is shown to be highly accurate, with relative error between 0.12 and 1.1 × 10^-4.
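The premise of matching computation rates to branch probabilities can be illustrated with a minimal sketch. This is not the paper's queuing network model: it is a simple bottleneck calculation, and the function names, the linear area cost per unit of rate, and the example probabilities are all assumptions made for illustration only.

```python
# Illustrative bottleneck model (an assumption, not the paper's queuing model).
# Each input item takes branch i with probability probs[i]. Branch i can
# process rates[i] items per cycle, and costs[i] units of area per unit rate.

def throughput(rates, probs):
    """Largest sustainable input rate: every branch i must satisfy
    input_rate * probs[i] <= rates[i]."""
    return min(r / p for r, p in zip(rates, probs) if p > 0)

def matched_rates(probs, costs, area):
    """Spend the area budget so each branch's rate is proportional
    to how often that branch is taken."""
    input_rate = area / sum(c * p for c, p in zip(costs, probs))
    return [input_rate * p for p in probs]

def uniform_rates(probs, costs, area):
    """Baseline: give every branch the same rate regardless of probability."""
    r = area / sum(costs)
    return [r for _ in probs]

# Hypothetical example: one branch taken 90% of the time, equal unit costs.
probs = [0.9, 0.1]
costs = [1.0, 1.0]
area = 2.0

matched = matched_rates(probs, costs, area)
uniform = uniform_rates(probs, costs, area)

print("matched rates:", matched, "throughput:", throughput(matched, probs))
print("uniform rates:", uniform, "throughput:", throughput(uniform, probs))
```

Under these assumed numbers, the probability-matched allocation sustains a higher input rate than the uniform one for the same area, because the rarely taken branch no longer holds resources it cannot use. A real design would also have to account for queuing delays and discrete resource choices, which this sketch ignores.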
         
        
            Keywords : 
data flow computing; feature extraction; field programmable gate arrays; optimisation; program compilers; program control structures; queueing theory; reconfigurable architectures; resource allocation; Xilinx Virtex FPGA; automatic synthesis; branch-optimized designs; data flow architectures; flow control scheme; hardware compilation; optimization design space; program branch probability; progressive refinement radiosity; video feature extraction; Computer architecture; Design optimization; Feature extraction; Field programmable gate arrays; Hardware; Performance analysis; Queueing analysis; Reconfigurable architectures; Resource management; Runtime
         
        
        
            Journal_Title : 
Computers, IEEE Transactions on