Title of article :
Slicing the metric space to provide quick indexing of complex data in the main memory
Author/Authors :
Caio César Mori Carélo (author), Ives Renê Venturini Pola (author), Ricardo Rodrigues Ciferri (author), Agma Juci Machado Traina (author), Caetano Traina Jr. (author), Cristina Dutra de Aguiar Ciferri (author)
Issue Information :
Journal with serial number, year 2011
Abstract :
Searching a dataset for elements similar to a given query element is a core problem in applications that manage complex data, and it has been supported by metric access methods (MAMs). A growing number of applications require indices that can be built quickly and repeatedly while also answering similarity queries faster. The growth in main memory capacity and its falling cost further motivate the use of memory-based MAMs. In this paper, we propose the Onion-tree, a new and robust dynamic memory-based MAM that slices the metric space into disjoint subspaces to provide quick indexing of complex data. It introduces three major characteristics: (i) a partitioning method that controls the number of disjoint subspaces generated at each node; (ii) a replacement technique that can change the leaf node pivots during insertion operations; and (iii) extended range and k-NN query algorithms that support the new partitioning method, including a new visit order of the subspaces in k-NN queries. Performance tests with both real-world and synthetic datasets showed that the Onion-tree is very compact. Comparisons with the MM-tree and a memory-based version of the Slim-tree showed that the Onion-tree was always the fastest to build its index. The experiments also showed that the Onion-tree significantly improved range and k-NN query processing performance and was the most efficient MAM, followed by the MM-tree, which in turn outperformed the Slim-tree in almost all the tests.
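The abstract does not spell out the partitioning rule, so the following is only a minimal sketch of the general idea of slicing a metric space into disjoint subspaces with pivots. It assumes a two-pivot rule in which the distance r between the pivots defines a ball around each pivot, giving four disjoint regions, and it prunes regions in a range query through the triangle inequality. The names `region`, `candidate_regions`, and `range_query` are hypothetical, the Euclidean distance is just a stand-in metric, and the Onion-tree's controlled number of subspaces, pivot replacement, and k-NN visit order are not reproduced here.

```python
import math


def euclidean(a, b):
    """Stand-in metric; any function obeying the metric axioms would do."""
    return math.dist(a, b)


def region(x, p1, p2, r, dist=euclidean):
    """Assumed rule: return which of four disjoint regions x falls into."""
    in1 = dist(x, p1) < r  # inside the ball of radius r around p1
    in2 = dist(x, p2) < r  # inside the ball of radius r around p2
    if in1 and in2:
        return 1
    if in1:
        return 2
    if in2:
        return 3
    return 4


def candidate_regions(q, radius, p1, p2, r, dist=euclidean):
    """Regions that the query ball (q, radius) may intersect.

    By the triangle inequality, the query ball can contain a point with
    d(x, p) < r only if d(q, p) - radius < r, and a point with
    d(x, p) >= r only if d(q, p) + radius >= r.
    """
    d1, d2 = dist(q, p1), dist(q, p2)
    inside1, outside1 = d1 - radius < r, d1 + radius >= r
    inside2, outside2 = d2 - radius < r, d2 + radius >= r
    regions = set()
    if inside1 and inside2:
        regions.add(1)
    if inside1 and outside2:
        regions.add(2)
    if outside1 and inside2:
        regions.add(3)
    if outside1 and outside2:
        regions.add(4)
    return regions


def range_query(data, q, radius, p1, p2, dist=euclidean):
    """Bucket the data by region, then scan only the candidate regions."""
    r = dist(p1, p2)
    buckets = {1: [], 2: [], 3: [], 4: []}
    for x in data:
        buckets[region(x, p1, p2, r, dist)].append(x)
    hits = []
    for reg in candidate_regions(q, radius, p1, p2, r, dist):
        hits.extend(x for x in buckets[reg] if dist(x, q) <= radius)
    return hits
```

In this sketch a single level of partitioning is built on every call for brevity; a tree-based method would apply the same test recursively, descending only into the regions that survive pruning.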
Keywords :
Metric access method , Complex data , Similarity search , Memory-based indexing
Journal title :
Information Systems