Title :
A Multi-Agent Framework for Thermal Aware Task Migration in Many-Core Systems
Author :
Ge, Yang ; Qiu, Qinru ; Wu, Qing
Author_Institution :
Dept. of Electr. Eng. & Comput. Sci., Syracuse Univ., Syracuse, NY, USA
Abstract :
In the deep submicrometer era, thermal hot spots and large temperature gradients significantly impact system reliability, performance, cost, and leakage power. As system complexity increases, it becomes increasingly difficult to perform thermal management in a centralized manner because of state explosion and the overhead of monitoring the entire chip. In this paper, we propose a framework for distributed thermal management in many-core systems in which a balanced thermal profile is achieved by proactive task migration among neighboring cores. The framework places a low-cost agent in each core that observes the local workload and temperature and communicates with its nearest neighbors for task migration and exchange. By choosing only those migration requests that will result in a balanced workload without causing a thermal emergency, the proposed framework maintains workload balance across the system and avoids unnecessary migration. Experimental results show that our distributed management policy achieves almost the same performance as a global management policy when the tasks are initially randomly distributed. Compared with an existing proactive task migration technique, our approach produces fewer hot spots and less migration overhead, with negligible performance overhead.
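The acceptance rule described in the abstract (take a migration request only if it improves workload balance and does not cause a thermal emergency) can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration and is not taken from the paper: the `CoreAgent` class, the `try_migrate` helper, the threshold constants, and above all the linear temperature model, which only stands in for whatever predictor the framework actually uses.

```python
from dataclasses import dataclass, field

# All three constants are hypothetical values chosen for illustration.
T_EMERGENCY = 85.0   # assumed thermal-emergency threshold, deg C
T_AMBIENT = 45.0     # assumed temperature of an idle core, deg C
DEG_PER_LOAD = 50.0  # assumed steady-state rise per unit of utilization


@dataclass
class CoreAgent:
    """Hypothetical per-core agent that sees only its own task list."""
    core_id: int
    tasks: list = field(default_factory=list)  # utilization of each task, 0..1

    def load(self) -> float:
        return sum(self.tasks)

    def predicted_temp(self, extra_load: float = 0.0) -> float:
        # Placeholder linear model, not the paper's predictor.
        return T_AMBIENT + DEG_PER_LOAD * (self.load() + extra_load)

    def accepts(self, sender_load: float, task: float) -> bool:
        """Accept only if the move narrows the load gap between the two
        cores and keeps this core below the emergency threshold."""
        gap_before = abs(sender_load - self.load())
        gap_after = abs((sender_load - task) - (self.load() + task))
        return gap_after < gap_before and self.predicted_temp(task) < T_EMERGENCY


def try_migrate(src: CoreAgent, dst: CoreAgent) -> bool:
    """Offer src's heaviest task to neighbor dst; move it if dst accepts."""
    if not src.tasks:
        return False
    task = max(src.tasks)
    if dst.accepts(src.load(), task):
        src.tasks.remove(task)
        dst.tasks.append(task)
        return True
    return False
```

In this sketch each decision uses only the state of the two cores involved, which mirrors the neighbor-to-neighbor communication the abstract describes. A request that would overheat the receiver or widen the load gap is simply declined, so no task moves unless both conditions hold.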
Keywords :
microprocessor chips; multi-agent systems; multiprocessing systems; performance evaluation; power aware computing; reliability; thermal management (packaging); balanced thermal profile; chip monitoring; deep submicrometer era; distributed management policy; distributed thermal management; global management policy; local workload; many-core systems; multiagent framework; proactive task migration technique; state explosion; system cost; system leakage power; system performance; system reliability; task exchange; temperature gradients; thermal aware task migration; thermal emergency; thermal hot spots; Distributed control; Multiagent systems; Predictive models; Thermal management; dynamic thermal management; multi-agent; prediction; task migration
Journal_Title :
Very Large Scale Integration (VLSI) Systems, IEEE Transactions on
DOI :
10.1109/TVLSI.2011.2162348