The Onion-Tree: Quick Indexing of Complex Data in the Main Memory

  • Authors:
  • Caio César Carélo; Ives Renê Pola; Ricardo Rodrigues Ciferri; Agma Juci Traina; Caetano Traina-Jr.; Cristina Dutra Aguiar Ciferri

  • Affiliations:
  • Departamento de Ciências de Computação, Universidade de São Paulo, São Carlos, Brazil 13.560-970; Departamento de Ciências de Computação, Universidade de São Paulo, São Carlos, Brazil 13.560-970; Departamento de Computação, Universidade Federal de São Carlos, São Carlos, Brazil 13.565-905; Departamento de Ciências de Computação, Universidade de São Paulo, São Carlos, Brazil 13.560-970; Departamento de Ciências de Computação, Universidade de São Paulo, São Carlos, Brazil 13.560-970; Departamento de Ciências de Computação, Universidade de São Paulo, São Carlos, Brazil 13.560-970

  • Venue:
  • ADBIS '09 Proceedings of the 13th East European Conference on Advances in Databases and Information Systems
  • Year:
  • 2009

Abstract

Searching a dataset for elements that are similar to a given query element is a core problem in applications that use complex data, and it is typically carried out with the aid of a metric access method (MAM). A growing number of these applications require indices that can be built quickly and repeatedly, in addition to providing short response times for similarity queries. Moreover, the growth in main memory capacity and its falling cost motivate the use of memory-based MAMs. In this paper, we propose the Onion-tree, a new and robust dynamic memory-based MAM that performs a hierarchical division of the metric space into disjoint subspaces. The Onion-tree is very compact, requiring a small fraction of the main memory (e.g., at most 4.8%). Comparisons of the Onion-tree, a memory-based version of the Slim-tree, and the memory-based MM-tree showed that the Onion-tree always produced the smallest elapsed time to build the index. Our experiments also showed that the Onion-tree produced the best query performance results, followed by the MM-tree, which in turn outperformed the Slim-tree. Compared with the MM-tree, its closest competitor, the Onion-tree reduced the number of distance calculations by 1% to 11% in range queries and by 16% to 64% in k-NN queries. The Onion-tree also significantly reduced the elapsed time, by 12% to 39% in range query processing and by 40% to 70% in k-NN query processing. The Onion-tree source code is available at http://gbd.dc.ufscar.br/download/Onion-tree.
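
To make the idea of a hierarchical division of the metric space into disjoint subspaces more concrete, the following is a minimal Python sketch of a much simpler memory-based structure. It is not the Onion-tree, and it is not taken from the authors' source code: it is a generic single-pivot ball partition, in which each node splits its region into an "inside" and an "outside" part, and a range query uses the triangle inequality to skip regions that cannot contain answers. The names Node, insert, build and range_query, the rule that the first arriving element fixes a node's radius, and the use of the Euclidean distance are all assumptions made for illustration.

```python
import math


class Node:
    """One pivot plus a radius split the node's region into two disjoint parts."""

    def __init__(self, pivot):
        self.pivot = pivot
        self.radius = None    # fixed by the first element inserted below this node
        self.inside = None    # elements x with dist(x, pivot) <= radius
        self.outside = None   # elements x with dist(x, pivot) > radius


def insert(root, x, dist=math.dist):
    """Insert x dynamically and return the (possibly new) root."""
    if root is None:
        return Node(x)
    node = root
    while True:
        d = dist(x, node.pivot)
        if node.radius is None:
            node.radius = d   # assumption: illustrative choice of radius
        side = "inside" if d <= node.radius else "outside"
        child = getattr(node, side)
        if child is None:
            setattr(node, side, Node(x))
            return root
        node = child


def build(points, dist=math.dist):
    """Build the index by inserting the points one at a time."""
    root = None
    for p in points:
        root = insert(root, p, dist)
    return root


def range_query(root, q, r, dist=math.dist):
    """Return (elements within distance r of q, number of distance calculations)."""
    found, calcs, stack = [], 0, [root]
    while stack:
        node = stack.pop()
        if node is None:
            continue
        d = dist(q, node.pivot)
        calcs += 1
        if d <= r:
            found.append(node.pivot)
        if node.radius is None:
            continue          # leaf: no subregions to explore
        # Triangle inequality: inside elements are at least d - radius from q.
        if d - r <= node.radius:
            stack.append(node.inside)
        # Triangle inequality: outside elements are more than radius - d from q.
        if d + r > node.radius:
            stack.append(node.outside)
    return found, calcs
```

Calling build(points) followed by range_query(root, q, r) returns the same elements as a linear scan, while the second returned value counts how many distance calculations were needed. That count is the kind of measure the abstract reports when comparing the Onion-tree with the MM-tree and the Slim-tree.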