  • DocumentCode
    85776
  • Title
    Toward a Scalable Working Set Size Estimation Method and Its Application for Chip Multiprocessors
  • Author
    Dani, Aparna Mandke ; Amrutur, Bharadwaj ; Srikant, Y.N.
  • Author_Institution
    Indian Inst. of Sci., Bangalore, India
  • Volume
    63
  • Issue
    6
  • fYear
    2014
  • fDate
    Jun-14
  • Firstpage
    1567
  • Lastpage
    1579
  • Abstract
    It is essential to accurately estimate the working set size (WSS) of an application for various optimizations, such as partitioning the cache among virtual machines or reducing the leakage power dissipated in an over-allocated cache by switching it off. However, state-of-the-art heuristics such as average memory access latency (AMAL) or cache miss ratio (CMR) are poorly correlated to the WSS of an application due to 1) over-sized caches and 2) their dispersed nature. Past studies focus on estimating the WSS of an application executing on a uniprocessor platform. Estimating the same for a chip multiprocessor (CMP) with a large dispersed cache is challenging due to the presence of concurrently executing threads/processes. Hence, we propose a scalable, highly accurate method to estimate the WSS of an application. We call this the “tagged WSS (TWSS)” estimation method. We demonstrate the use of TWSS to switch off the over-allocated cache ways in Static and Dynamic NonUniform Cache Architectures (SNUCA, DNUCA) on a tiled CMP. In our implementation of adaptable-way SNUCA and DNUCA caches, the decision to alter associativity is made by each L2 controller. Hence, this approach scales better with the number of cores present on a CMP. Overall (geometric mean), it gives 26% and 19% higher energy-delay product savings than the AMAL and CMR heuristics, respectively, on SNUCA.
  • Keywords
    cache storage; memory architecture; multiprocessing systems; virtual machines; AMAL; CMP; CMR; DNUCA caches; L2 controller; SNUCA caches; TWSS estimation method; average memory access latency; cache miss ratio; chip multiprocessors; concurrently executing processes; concurrently executing threads; dynamic nonuniform cache architectures; over-sized caches; scalable working set size estimation method; static nonuniform cache architectures; tagged WSS estimation method; uniprocessor platform; Clocks; Estimation; Monitoring; Radiation detectors; Random access memory; Switches; Chip multiprocessors (CMPs); variable cache associativity; working set size (WSS) estimation
  • fLanguage
    English
  • Journal_Title
    Computers, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    0018-9340
  • Type
    jour
  • DOI
    10.1109/TC.2012.291
  • Filename
    6375705