• DocumentCode
    703920
  • Title
    Quality configurable reduce-and-rank for energy efficient approximate computing
  • Author
    Raha, Arnab; Venkataramani, Swagath; Raghunathan, Vijay; Raghunathan, Anand
  • fYear
    2015
  • fDate
    9-13 March 2015
  • Firstpage
    665
  • Lastpage
    670
  • Abstract
    Approximate computing is an emerging design paradigm that exploits the intrinsic ability of applications to produce acceptable outputs even when their computations are executed approximately. In this work, we explore approximate computing for a key computation pattern, Reduce-and-Rank (RnR), which is prevalent in a wide range of workloads including video processing, recognition, search and data mining. An RnR kernel performs a reduction operation (e.g., distance computation, dot product, L1-norm) between an input vector and each of a set of reference vectors, and ranks the reduction outputs to select the top reference vectors for the current input. We propose two complementary approximation strategies for the RnR computation pattern. The first is interleaved reduction-and-ranking, wherein the vector reductions are decomposed into multiple partial reductions and interleaved with the rank computation. Leveraging this transformation, we propose the use of intermediate reduction results and ranks to identify future computations that are likely to have low impact on the output, and can hence be approximated. The second strategy, input similarity based approximation, exploits the spatial or temporal correlation of inputs (e.g., pixels of an image or frames of a video) to identify computations that are amenable to approximation. These strategies address a key challenge in approximate computing, namely identifying which computations to approximate, and may be used to drive any approximation mechanism such as computation skipping and precision scaling to realize performance or energy improvements. A second key challenge in approximate computing is that the extent to which computations can be approximated varies significantly from application to application, and across inputs for even a single application. Hence, quality configurability, or the ability to automatically modulate the degree of approximation at runtime, is essential. To enable quality configurability in RnR kernels, we propose a kernel-level quality metric that correlates well to application-level quality, and identify key parameters that can be used to tune the proposed approximation strategies dynamically. We develop a runtime framework that modulates the identified parameters during execution of RnR kernels to minimize their energy while meeting a given target quality. To evaluate the proposed concepts, we designed quality-configurable hardware implementations of 6 RnR-based applications from the recognition, mining, search and video processing application domains in 45nm technology. Our experiments demonstrate 1.06X-2.18X reduction in energy consumption with virtually no loss in output quality (<0.5%) at the application level. The energy benefits further improve up to 2.38X and 2.5X when the quality constraints are relaxed to 2.5% and 5% respectively.
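    For illustration only, the sketch below shows a baseline Reduce-and-Rank kernel (L1-distance reduction followed by top-k ranking) and an interleaved variant in the spirit of the paper's first approximation strategy, where reductions are split into partial chunks and lagging references are skipped. The chunk count, the pruning margin, and the specific pruning rule are assumptions made for exposition; they are not the authors' exact mechanism or quality-control framework.

    # Illustrative sketch (not the authors' exact algorithm): an RnR kernel that
    # computes L1 distances between an input vector and a set of reference vectors,
    # then ranks references to select the top-k, plus an interleaved variant that
    # splits each reduction into partial chunks and skips references whose partial
    # distance already trails the current leaders by a tunable margin.

    import numpy as np

    def rnr_exact(x, refs, k):
        """Baseline Reduce-and-Rank: full L1 reductions, then rank."""
        dists = np.abs(refs - x).sum(axis=1)          # reduce: L1 distance per reference
        return np.argsort(dists)[:k]                  # rank: indices of the k closest references

    def rnr_interleaved(x, refs, k, n_chunks=4, margin=1.5):
        """Interleaved reduce-and-rank with computation skipping (hypothetical margin knob).

        After each partial reduction, references whose accumulated distance exceeds
        `margin` times the current k-th best partial distance are dropped: L1 partial
        sums only grow, so such references are unlikely to reach the top-k.
        """
        n_refs, dim = refs.shape
        active = np.arange(n_refs)                    # references still being reduced
        partial = np.zeros(n_refs)                    # accumulated partial distances
        bounds = np.linspace(0, dim, n_chunks + 1, dtype=int)

        for lo, hi in zip(bounds[:-1], bounds[1:]):
            # partial reduction over one chunk of the vector dimensions
            partial[active] += np.abs(refs[active, lo:hi] - x[lo:hi]).sum(axis=1)
            # intermediate rank: keep only references within a margin of the k-th best
            kth = np.sort(partial[active])[min(k, len(active)) - 1]
            active = active[partial[active] <= margin * kth]

        return active[np.argsort(partial[active])[:k]]

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        x = rng.random(64)
        refs = rng.random((256, 64))
        print(rnr_exact(x, refs, k=5))
        print(rnr_interleaved(x, refs, k=5))

    In this sketch the `margin` parameter plays the role of a runtime-tunable knob: tightening it skips more work (saving energy) at the risk of ranking errors, loosely mirroring how the paper's framework modulates approximation parameters to meet a target quality.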
  • Keywords
    approximation theory; data mining; pattern recognition; search problems; RnR computation pattern; application-level quality; computation skipping; energy efficient approximate computing; interleaved reduction-and-ranking; kernel-level quality metric; precision scaling; quality configurable reduce-and-rank; search and data mining; video processing; video recognition; Approximation algorithms; Approximation methods; Calibration; Correlation; Kernel; Training
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Design, Automation & Test in Europe Conference & Exhibition (DATE), 2015
  • Conference_Location
    Grenoble
  • Print_ISBN
    978-3-9815-3704-8
  • Type
    conf
  • Filename
    7092472