DocumentCode
2993130
Title
Accelerating multi-core processor design space evaluation using automatic multi-threaded workload synthesis
Author
Hughes, Clay ; Li, Tao
Author_Institution
Dept. of Electr. & Comput. Eng., Univ. of Florida, Gainesville, FL
fYear
2008
fDate
14-16 Sept. 2008
Firstpage
163
Lastpage
172
Abstract
The design and evaluation of microprocessor architectures is a difficult and time-consuming task. Although small, hand-coded microbenchmarks can be used to accelerate performance evaluation, these programs lack the complexity to stress increasingly complex architecture designs. Larger and more complex real-world workloads should be employed to measure the performance of a given design or to evaluate the efficiency of various design alternatives. These applications can take days or weeks if run to completion on a detailed architecture simulator. In the past, researchers have applied machine learning and statistical sampling methods to reduce the average number of instructions required for detailed simulation. Others have proposed statistical simulation and workload synthesis techniques, which can produce programs that emulate the execution characteristics of the application from which they are derived but have a much shorter execution period than the original. However, these existing methods are difficult to apply to multi-threaded programs and can result in simplifications that miss the complex interactions between multiple, concurrently running threads. This study focuses on developing new techniques for accurate and effective multi-threaded workload synthesis, which can significantly accelerate architecture design evaluation of multi-core processors. We propose to construct synchronized statistical flow graphs that incorporate inter-thread synchronization and sharing behavior to capture the complex characteristics and interactions of multiple threads. Moreover, we develop thread-aware data reference models and wavelet-based branching models to generate accurate memory access and dynamic branch statistics. Experimental results show that a framework integrated with the aforementioned models can automatically generate synthetic programs that maintain characteristics of original workloads but have significantly reduced runtime.
Keywords
computer architecture; learning (artificial intelligence); logic design; microprocessor chips; multi-threading; performance evaluation; sampling methods; wavelet transforms; automatic multi-threaded workload synthesis; machine learning; microprocessor architectures; multi-core processor design space evaluation; statistical sampling methods; thread-aware data reference models; wavelet-based branching models; Acceleration; Flow graphs; Machine learning; Microprocessors; Multicore processing; Process design; Sampling methods; Statistics; Stress; Yarn
fLanguage
English
Publisher
IEEE
Conference_Titel
Workload Characterization, 2008. IISWC 2008. IEEE International Symposium on
Conference_Location
Seattle, WA
Print_ISBN
978-1-4244-2777-2
Electronic_ISBN
978-1-4244-2778-9
Type
conf
DOI
10.1109/IISWC.2008.4636101
Filename
4636101