• DocumentCode
    769298
  • Title

    Access Structures for Angular Similarity Queries

  • Author

    Apaydin, Tan ; Ferhatosmanoglu, Hakan

  • Author_Institution
    Dept. of Comput. Sci. & Eng., Ohio State Univ., Columbus, OH
  • Volume
    18
  • Issue
    11
  • fYear
    2006
  • Firstpage
    1512
  • Lastpage
    1525
  • Abstract
    Angular similarity measures have been utilized by several database applications to define semantic similarity between various data types such as text documents, time-series, images, and scientific data. Although similarity searches based on Euclidean distance have been extensively studied in the database community, processing of angular similarity searches has been relatively untouched. Problems due to a mismatch in the underlying geometry, as well as the high dimensionality of the data, make current techniques either inapplicable or poor in performance. This brings up the need for effective indexing methods for angular similarity queries. We first discuss how to efficiently process such queries and propose effective access structures suited to angular similarity measures. In particular, we propose two classes of access structures, namely, angular-sweep and cone-shell, which perform different types of quantization based on the angular orientation of the data objects. We also develop query processing algorithms that utilize these structures as dense indices. The proposed techniques are shown to be scalable with respect to both dimensionality and the size of the data. Our experimental results on real data sets from various applications show two to three orders of magnitude of speedup over the current techniques.
  • Keywords
    database indexing; query processing; angular similarity queries; angular-sweep access structure; cone-shell access structure; database applications; indexing methods; query processing algorithms; query search; semantic similarity; Computer graphics; Euclidean distance; Extraterrestrial measurements; Geometry; Image databases; Indexing; Light sources; Quantization; Query processing; Spatial databases; Angular query; angular similarity measures; high-dimensional data; indexing; performance
  • fLanguage
    English
  • Journal_Title
    Knowledge and Data Engineering, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1041-4347
  • Type

    jour

  • DOI
    10.1109/TKDE.2006.165
  • Filename
    1704803