• DocumentCode
    2081227
  • Title

    Estimating the compression fraction of an index using sampling

  • Author

    Idreos, Stratos ; Kaushik, Raghav ; Narasayya, Vivek ; Ramamurthy, Ravishankar

  • Author_Institution
    CWI, Amsterdam, Netherlands
  • fYear
    2010
  • fDate
    1-6 March 2010
  • Firstpage
    441
  • Lastpage
    444
  • Abstract
    Data compression techniques such as null suppression and dictionary compression are commonly used in today's database systems. In order to effectively leverage compression, it is necessary to have the ability to efficiently and accurately estimate the size of an index if it were to be compressed. Such an analysis is critical if automated physical design tools are to be extended to handle compression. Several database systems today provide estimators for this problem based on random sampling. While this approach is efficient, there is no previous work that analyses its accuracy. In this paper, we analyse the problem of estimating the compressed size of an index from the point of view of worst-case guarantees. We show that the simple estimator implemented by several database systems has several "good" cases even though the estimator itself is agnostic to the internals of the specific compression algorithm.
  • Keywords
    data analysis; data compression; database management systems; dictionaries; estimation theory; sampling methods; compression algorithm; data compression techniques; database systems; dictionary compression; estimators; index compression fraction estimation; null suppression; random sampling; Algorithm design and analysis; Capacity planning; Costs; Data analysis; Data compression; Database systems; Indexes; Performance analysis; Sampling methods; Yield estimation
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Engineering (ICDE), 2010 IEEE 26th International Conference on
  • Conference_Location
    Long Beach, CA
  • Print_ISBN
    978-1-4244-5445-7
  • Electronic_ISBN
    978-1-4244-5444-0
  • Type

    conf

  • DOI
    10.1109/ICDE.2010.5447871
  • Filename
    5447871