• DocumentCode
    3144645
  • Title

    A novel probabilistic pruning approach to speed up similarity queries in uncertain databases

  • Author

    Bernecker, Thomas ; Emrich, Tobias ; Kriegel, Hans-Peter ; Mamoulis, Nikos ; Renz, Matthias ; Züfle, Andreas

  • Author_Institution
    Dept. of Comput. Sci., Ludwig-Maximilians-Univ. Munchen, Munich, Germany
  • fYear
    2011
  • fDate
    11-16 April 2011
  • Firstpage
    339
  • Lastpage
    350
  • Abstract
    In this paper, we propose a novel, effective and efficient probabilistic pruning criterion for probabilistic similarity queries on uncertain data. Our approach supports a general uncertainty model using continuous probabilistic density functions to describe the (possibly correlated) uncertain attributes of objects. In a nutshell, the problem to be solved is to compute the PDF of the random variable denoted by the probabilistic domination count: Given an uncertain database object B, an uncertain reference object R and a set D of uncertain database objects in a multi-dimensional space, the probabilistic domination count denotes the number of uncertain objects in D that are closer to R than B. This domination count can be used to answer a wide range of probabilistic similarity queries. Specifically, we propose a novel geometric pruning filter and introduce an iterative filter-refinement strategy for conservatively and progressively estimating the probabilistic domination count in an efficient way while keeping correctness according to the possible world semantics. In an experimental evaluation, we show that our proposed technique allows to acquire tight probability bounds for the probabilistic domination count quickly, even for large uncertain databases.
  • Keywords
    iterative methods; probability; query processing; uncertainty handling; very large databases; iterative filter-refinement strategy; large uncertain databases; probabilistic density functions; probabilistic domination count; probabilistic pruning approach; random variable; semantics; similarity queries; Approximation methods; Databases; Handheld computers; Nearest neighbor searches; Probabilistic logic; Random variables; Uncertainty;
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Titel
    Data Engineering (ICDE), 2011 IEEE 27th International Conference on
  • Conference_Location
    Hannover
  • ISSN
    1063-6382
  • Print_ISBN
    978-1-4244-8959-6
  • Electronic_ISBN
    1063-6382
  • Type

    conf

  • DOI
    10.1109/ICDE.2011.5767908
  • Filename
    5767908