The Onion-Tree: Quick Indexing of Complex Data in the Main Memory

Authors:
Caio César Carélo;Ives Renê Pola;Ricardo Rodrigues Ciferri;Agma Juci Traina;Caetano Traina-Jr.;Cristina Dutra Aguiar Ciferri
Affiliations:
Departamento de Ciências de Computação, Universidade de São Paulo, São Carlos, Brazil 13.560-970;Departamento de Ciências de Computação, Universidade de São Paulo, São Carlos, Brazil 13.560-970;Departamento de Computação, Universidade Federal de São Carlos, São Carlos, Brazil 13.565-905;Departamento de Ciências de Computação, Universidade de São Paulo, São Carlos, Brazil 13.560-970;Departamento de Ciências de Computação, Universidade de São Paulo, São Carlos, Brazil 13.560-970;Departamento de Ciências de Computação, Universidade de São Paulo, São Carlos, Brazil 13.560-970
Venue:
ADBIS '09 Proceedings of the 13th East European Conference on Advances in Databases and Information Systems
Year:
2009

Abstract

Searching a dataset for elements that are similar to a given query element is a core problem in applications that handle complex data, and it is usually supported by a metric access method (MAM). A growing number of these applications require indices that can be built quickly and rebuilt many times, in addition to providing short response times for similarity queries. Moreover, the growing capacity and falling cost of main memory motivate the use of memory-based MAMs. In this paper, we propose the Onion-tree, a new and robust dynamic memory-based MAM that performs a hierarchical division of the metric space into disjoint subspaces. The Onion-tree is very compact, requiring a small fraction of the main memory (e.g., at most 4.8%). Comparisons of the Onion-tree, a memory-based version of the Slim-tree, and the memory-based MM-tree showed that the Onion-tree always required the shortest elapsed time to build the index. Our experiments also showed that the Onion-tree delivered the best query performance, followed by the MM-tree, which in turn outperformed the Slim-tree. Compared with the MM-tree, its closest competitor, the Onion-tree reduced the number of distance calculations by 1% to 11% in range queries and by 16% to 64% in k-NN queries. It also significantly reduced the elapsed time, by 12% to 39% in range query processing and by 40% to 70% in k-NN query processing. The Onion-tree source code is available at http://gbd.dc.ufscar.br/download/Onion-tree.
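
The abstract reports results for two kinds of similarity query (range and k-NN) and measures cost in distance calculations. As a rough illustration of those terms only, the sketch below runs both queries by linear scan while counting calls to the distance function. It is not the Onion-tree algorithm: the names `CountingMetric`, `range_query`, and `knn_query`, the sample points, and the choice of Euclidean distance are all assumptions made for this example. A linear scan pays one distance calculation per stored element, which is the cost a MAM tries to reduce by pruning whole subspaces.

```python
import heapq
import math


class CountingMetric:
    """Wraps a distance function and counts how many times it is called."""

    def __init__(self, dist):
        self.dist = dist
        self.calls = 0

    def __call__(self, a, b):
        self.calls += 1
        return self.dist(a, b)


def range_query(data, query, radius, metric):
    """Return every element within `radius` of `query` (linear scan)."""
    return [x for x in data if metric(query, x) <= radius]


def knn_query(data, query, k, metric):
    """Return the k elements closest to `query` (linear scan)."""
    return heapq.nsmallest(k, data, key=lambda x: metric(query, x))


# Hypothetical 2-D sample data; Euclidean distance stands in for any metric.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 4.0), (6.0, 8.0)]
metric = CountingMetric(math.dist)

in_range = range_query(points, (0.0, 0.0), 2.0, metric)
nearest = knn_query(points, (0.0, 0.0), 2, metric)
print(in_range, nearest, metric.calls)
```

Each query here evaluates the distance once for every element, so the counter grows by the dataset size per query. An index that divides the space into disjoint subspaces aims to answer the same queries with fewer evaluations.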