• DocumentCode
    1663134
  • Title
    An architecture for fast processing of large unstructured data sets
  • Author
    Franklin, Mark; Chamberlain, Roger; Henrichs, Michael; Shands, Berkley; White, Jason
  • Author_Institution
    Dept. of Comput. Sci. & Eng., Washington Univ., St. Louis, MO, USA
  • fYear
    2004
  • Firstpage
    280
  • Lastpage
    287
  • Abstract
    This paper presents a general system architecture tailored to perform searching, filtering, compression, encryption, and other operations on unstructured data streaming from a disk system. The system achieves high performance on such applications by providing for parallelism, hardware-application specialization and reconfiguration, and hardware placement near the disk systems. A limited prototype of a single compute node has been implemented and is described. The prototype is tailored to applications involving complex searching and its performance is compared to a pure software implementation having the same search capabilities. Performance is considered in terms of data set size, query string hit rate and query complexity. Performance results as a function of these parameters are presented and the results indicate that, for data set sizes above 1.4 MB, the prototype compute node is between one and two orders of magnitude faster than a pure software implementation. At high data set sizes, on an individual node, speedups of about 200 and a sustained throughput of 300 MB/sec have been achieved.
  • Keywords
    magnetic disc storage; parallel architectures; reconfigurable architectures; data set size; disk systems; large unstructured data sets; parallel architecture; query complexity; query string hit rate; system architecture; Application software; Computer architecture; Cryptography; Filtering; Hardware; Parallel processing; Prototypes; Software performance; Software prototyping; Throughput
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Proceedings of the IEEE International Conference on Computer Design: VLSI in Computers and Processors (ICCD 2004)
  • ISSN
    1063-6404
  • Print_ISBN
    0-7695-2231-9
  • Type
    conf
  • DOI
    10.1109/ICCD.2004.1347934
  • Filename
    1347934