• DocumentCode
    2973253
  • Title

    Multi-dimensional Histograms with Tight Bounds for the Error

  • Author

    Baltrunas, Linas ; Mazeika, Arturas ; Böhlen, Michael

  • Author_Institution
    Free Univ. of Bozen-Bolzano
  • fYear
    2006
  • fDate
    Dec. 2006
  • Firstpage
    105
  • Lastpage
    112
  • Abstract
    Histograms are being used as non-parametric selectivity estimators for one-dimensional data. For high-dimensional data it is common to either compute one-dimensional histograms for each attribute or to compute a multi-dimensional equi-width histogram for a set of attributes. This either yields small low-quality or large high-quality histograms. In this paper we introduce HIRED (high-dimensional histograms with dimensionality reduction): small high-quality histograms for multi-dimensional data. HIRED histograms are adaptive, and they are based on the shape error and directional splits. The shape error permits a precise control of the estimation error of the histogram and, together with directional splits, yields a memory complexity that does not depend on the number of uniform attributes in the dataset. We provide extensive experimental results with synthetic and real world datasets. The experiments confirm that our method is as precise as state-of-the-art techniques and uses orders of magnitude less memory
  • Keywords
    data handling; statistical analysis; directional split; memory complexity; multidimensional histograms; nonparametric selectivity estimator; shape error; Data engineering; Data mining; Data structures; Databases; Error correction; Estimation error; Histograms; Multidimensional systems; Query processing; Shape control;
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Titel
    Database Engineering and Applications Symposium, 2006. IDEAS '06. 10th International
  • Conference_Location
    Delhi
  • ISSN
    1098-8068
  • Print_ISBN
    0-7695-2577-6
  • Type

    conf

  • DOI
    10.1109/IDEAS.2006.31
  • Filename
    4041609