• DocumentCode
    1362454
  • Title

    Frequent Item Computation on a Chip

  • Author

    Teubner, Jens ; Muller, Rudolf ; Alonso, Gustavo

  • Author_Institution
    Dept. of Comput. Sci., ETH Zurich, Zurich, Switzerland
  • Volume
    23
  • Issue
    8
  • fYear
    2011
  • Firstpage
    1169
  • Lastpage
    1181
  • Abstract
    Computing frequent items is an important problem by itself and as a subroutine in several data mining algorithms. In this paper, we explore how to accelerate the computation of frequent items using field-programmable gate arrays (FPGAs) with a threefold goal: increase performance over existing solutions, reduce energy consumption over CPU-based systems, and explore the design space in detail as the constraints on FPGAs are very different from those of traditional software-based systems. We discuss three design alternatives, each one of them exploiting different FPGA features and each one providing different performance/scalability trade-offs. An important result of the paper is to demonstrate how the inherent massive parallelism of FPGAs can improve performance of existing algorithms but only after a fundamental redesign of the algorithms. Our experimental results show that, e.g., the pipelined solution we introduce can reach more than 100 million tuples per second of sustained throughput (four times the best available results to date) by making use of techniques that are not available to CPU-based solutions. Moreover, and unlike in software approaches, the high throughput is independent of the skew of the Zipf distribution of the input and at a far lower energy cost.
  • Keywords
    data mining; energy consumption; field programmable gate arrays; CPU based systems; Zipf distribution; data mining algorithms; energy consumption reductin; field programmable gate arrays; frequent item computation; performance-scalability trade-offs; pipelined solution; Algorithm design and analysis; Field programmable gate arrays; Monitoring; Random access memory; Software; Table lookup; Throughput; Data mining; parallelism and concurrency.; reconfigurable hardware;
  • fLanguage
    English
  • Journal_Title
    Knowledge and Data Engineering, IEEE Transactions on
  • Publisher
    ieee
  • ISSN
    1041-4347
  • Type

    jour

  • DOI
    10.1109/TKDE.2010.216
  • Filename
    5611528