• DocumentCode
    2736764
  • Title
    Partitional clustering of tick data to reduce storage space
  • Author
    Nagy, Gabor I.; Buza, Krisztian
  • Author_Institution
    Dept. of Telecommun. & Media Inf., Budapest Univ. of Technol., Budapest, Hungary
  • fYear
    2012
  • fDate
    13-15 June 2012
  • Firstpage
    555
  • Lastpage
    560
  • Abstract
    Tick data is one of the most prominent types of temporal data, as it can be used to represent data in various domains such as geophysics or finance. Storage of tick data is a challenging problem because two criteria have to be fulfilled simultaneously: the storage structure should allow fast execution of queries, and the data should not occupy too much space on the hard disk or in main memory. In this paper, we present a clustering-based solution and introduce a new clustering algorithm, SOPAC, that is designed to support the storage of tick data. Our approach is based on the search for a partitional clustering that optimizes storage space. We evaluate our algorithm both on publicly available real-world datasets and on real-world tick data from the financial domain. We also investigate, on task-specific benchmarks, how well our approach estimates the optimum. Our experiments show that, for the tick data storage problem, our algorithm substantially outperforms state-of-the-art clustering algorithms, both in terms of statistical significance and practical relevance.
  • Keywords
    file organisation; optimisation; pattern clustering; query processing; SOPAC; clustering-based solution; financial domain; geophysics; partitional clustering; queries execution; storage space optimization; storage space reduction; temporal data; tick data; Algorithm design and analysis; Benchmark testing; Clustering algorithms; Matrix decomposition; Partitioning algorithms; Switches; Temperature measurement
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Intelligent Engineering Systems (INES), 2012 IEEE 16th International Conference on
  • Conference_Location
    Lisbon
  • Print_ISBN
    978-1-4673-2694-0
  • Electronic_ISBN
    978-1-4673-2693-3
  • Type
    conf
  • DOI
    10.1109/INES.2012.6249896
  • Filename
    6249896