Title : 
L1-bandwidth aware thread allocation in multicore SMT processors
         
        
            Author : 
Sasaki, Hiroshi ; Imamura, Satoshi ; Inoue, Koji
         
        
            Author_Institution : 
Kyushu Univ., Fukuoka, Japan
         
        
        
        
        
        
            Abstract : 
Optimizing performance in multiprogrammed environments, especially for workloads composed of multi-threaded programs, is a desired feature of runtime management systems in future manycore processors. At the same time, power capping capability is required in order to improve the reliability of microprocessor chips while reducing the costs of power supply and thermal budgeting. This paper presents a sophisticated runtime coordinated power-performance management system called C-3PO, which optimizes the performance of manycore processors under a power constraint by controlling two software knobs: thread packing, and dynamic voltage and frequency scaling (DVFS). The proposed solution distributes the power budget to each program by controlling the workload threads so that they execute with an appropriate number of cores and operating frequency. The power budget is distributed carefully in different forms (number of allocated cores or operating frequency) depending on the power-performance characteristics of the workload so that each program can effectively convert the power into performance. The proposed system is based on a heuristic algorithm which relies on runtime prediction of power and performance via hardware performance monitoring units. Empirical results on a 64-core platform show that C-3PO clearly outperforms traditional counterparts across various PARSEC workload mixes.
         
        
            Keywords : 
microprocessor chips; multi-threading; multiprocessing systems; power aware computing; 64-core platform; C-3PO; DVFS; PARSEC workload mixes; coordinated power-performance optimization; dynamic voltage and frequency scaling; hardware performance monitoring units; heuristic algorithm; manycore processors; microprocessor chips reliability; multiprogrammed environments; multithreaded programs; power budget; power capping capability; power constraint; power supply cost reduction; power-performance characteristics; runtime coordinated power-performance management system; runtime management system; runtime prediction; software knobs; thermal budgeting; thread packing; workload threads; Linux; Mathematical model; Optimization; Power demand; Program processors; Radio spectrum management; Runtime; SMT; bandwidth-aware scheduling; thread allocation;
         
        
        
        
            Conference_Titel : 
Parallel Architectures and Compilation Techniques (PACT), 2013 22nd International Conference on
         
        
            Conference_Location : 
Edinburgh
         
        
        
            Print_ISBN : 
978-1-4799-1018-2
         
        
        
            DOI : 
10.1109/PACT.2013.6618803