Efficient bulk-loading on dynamic metric access methods

  • Authors:
  • Thiago G. Vespa;Caetano Traina, Jr;Agma J. Traina

  • Affiliations:
  • ICMC - Institute of Mathematics and Computer Sciences, USP - University of São Paulo, Avenida do Trabalhador Sãocarlense, 400, Postal Code: 13566-590 - São Carlos, SP, Brazil (all three authors)

  • Venue:
  • Information Systems
  • Year:
  • 2010

Abstract

This paper presents a new technique and two algorithms for bulk-loading data into multi-way dynamic metric access methods, based on the covering radius of the representative elements used to organize data in hierarchical data structures. The proposed algorithms are sample-based, and they always build a valid, height-balanced tree. We compare the proposed algorithms with existing ones, showing their behavior when bulk-loading data into the Slim-tree metric access method. After identifying the worst case of our first algorithm, we describe suitable countermeasures, which lead to the second algorithm. Experiments show that our bulk-loading methods build trees faster than sequential insertion, and that the resulting trees also significantly improve search performance.
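
To make the role of representatives and covering radii concrete, the following is a minimal Python sketch of a generic sample-based bulk load into a metric tree. It is not either of the paper's two algorithms, and unlike them it does not guarantee a height-balanced tree. The names `Node`, `bulk_load`, `range_query`, and `euclidean`, the `capacity` parameter, and the even-chunk fallback for duplicate elements are all assumptions introduced for illustration only.

```python
import random
from dataclasses import dataclass, field
from typing import List, Tuple

Point = Tuple[float, ...]


def euclidean(a: Point, b: Point) -> float:
    """Example metric; any function obeying the triangle inequality works."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


@dataclass
class Node:
    rep: Point      # representative element of this subtree
    radius: float   # covering radius: max distance from rep to any element below
    children: List["Node"] = field(default_factory=list)
    elements: List[Point] = field(default_factory=list)  # filled only in leaves


def bulk_load(data, capacity=8, dist=euclidean, rng=None, rep=None) -> Node:
    """Illustrative sample-based bulk load (hypothetical, NOT the paper's method).

    Assumes capacity >= 2 and non-empty data.
    """
    data = list(data)
    if rng is None:
        rng = random.Random(0)
    if rep is None:
        rep = data[0]  # arbitrary representative for the root
    radius = max(dist(rep, x) for x in data)
    if len(data) <= capacity:
        return Node(rep, radius, elements=data)

    # Sample representatives, then assign each element to the nearest one.
    reps = rng.sample(data, capacity)
    groups = [[] for _ in reps]
    for x in data:
        nearest = min(range(capacity), key=lambda i: dist(reps[i], x))
        groups[nearest].append(x)

    # Assumption: if duplicates send every element to one group, fall back
    # to even chunks so that the recursion always makes progress.
    if max(len(g) for g in groups) == len(data):
        step = -(-len(data) // capacity)  # ceiling division
        groups = [data[i:i + step] for i in range(0, len(data), step)]
        reps = [g[0] for g in groups]

    children = [bulk_load(g, capacity, dist, rng, r)
                for r, g in zip(reps, groups) if g]
    return Node(rep, radius, children=children)


def range_query(node: Node, q: Point, r: float, dist=euclidean) -> List[Point]:
    """Return all elements within distance r of q, pruning by covering radius."""
    # Triangle inequality: if d(q, rep) > r + radius, no element of this
    # subtree can lie within r of q, so the whole subtree is skipped.
    if dist(q, node.rep) > r + node.radius:
        return []
    if not node.children:
        return [x for x in node.elements if dist(q, x) <= r]
    hits: List[Point] = []
    for child in node.children:
        hits.extend(range_query(child, q, r, dist))
    return hits
```

A call such as `range_query(bulk_load(points), (0.5, 0.5), 0.2)` should return the same elements as a linear scan; the covering radius only determines how many subtrees are skipped along the way. Tighter radii and less overlap between sibling subtrees generally mean fewer distance computations per query.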