• DocumentCode
    2627022
  • Title

    Selecting benchmark combinations for the evaluation of multicore throughput

  • Author

    Velasquez, Ricardo A. ; Michaud, Pierre ; Seznec, Andre

  • Author_Institution
    INRIA/IRISA, Rennes, France
  • fYear
    2013
  • fDate
    21-23 April 2013
  • Firstpage
    173
  • Lastpage
    182
  • Abstract
    Most high-performance processors today are able to execute multiple threads of execution simultaneously. Threads share processor resources, like the last-level cache, which may decrease throughput in a non-obvious way, depending on threads' characteristics. Computer architects usually study multiprogrammed workloads by considering a set of benchmarks and some combinations of these benchmarks. Because detailed microarchitecture simulators are slow, we want a subset of combinations that is as small as possible, yet representative. However, there is no standard method for selecting such a sample, and different authors have used different methods. It is not clear how the choice of a particular sample impacts the conclusions of a study. We propose and compare different sampling methods for defining multiprogrammed workloads for computer architecture studies. We evaluate their effectiveness with a case study, the comparison of several multicore last-level cache replacement policies. We show that random sampling, the simplest method, is a possible way to define a representative workload sample, provided the sample is large enough. We propose a method for estimating the required sample size based on fast approximate simulation. We propose a new method, workload stratification, which is very effective at reducing the sample size in situations where random sampling would require large samples.
  • Keywords
    cache storage; multiprocessing systems; multiprogramming; parallel architectures; performance evaluation; random processes; sampling methods; benchmark combination selection; computer architecture; fast approximate simulation; high-performance processors; microarchitecture simulator; multicore last-level cache replacement policies; multicore throughput evaluation; multiple thread execution; multiprogrammed workloads; processor resource sharing; random sampling; sample size; workload stratification; Benchmark testing; Electronics packaging; Measurement; Microarchitecture; Multicore processing; Sociology; Throughput;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Performance Analysis of Systems and Software (ISPASS), 2013 IEEE International Symposium on
  • Conference_Location
    Austin, TX
  • Print_ISBN
    978-1-4673-5776-0
  • Electronic_ISBN
    978-1-4673-5778-4
  • Type

    conf

  • DOI
    10.1109/ISPASS.2013.6557168
  • Filename
    6557168