Parallel Dynamic Batch Loading in the M-tree

  • Authors: Jakub Lokoc
  • Venue: SISAP '09: Proceedings of the 2009 Second International Workshop on Similarity Search and Applications
  • Year: 2009

Abstract

Although metric access methods (MAMs) have proved capable of efficient similarity search, further performance improvements are needed because of the extreme growth of data volumes. Since multi-core processors have become widely available, it is justified to exploit parallelism. However, taking Gustafson's law into account, it is necessary to find tasks suitable for parallelization. Such a task could be M-tree construction. Unfortunately, parallelism within a single object insertion in hierarchical index structures is limited by node capacity. Running several independent insertions in parallel is much less restrictive; however, synchronization problems occur whenever a node is about to split. In this paper we present a new technique for M-tree construction. The technique postpones the splitting of overfull nodes and thus allows simple parallelization of M-tree construction. We also utilize an adaptation of the recently introduced re-inserting technique for the M-tree. Our experiments confirm that the new technique significantly speeds up M-tree construction and also improves the quality of the index.
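
The idea of postponing splits can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes a flat set of leaves rather than a full M-tree, a 1-D metric, and a simple median split, and the names `Leaf`, `CAPACITY`, `insert_no_split`, `split_overfull`, and `batch_load` are all hypothetical. It only shows why deferring splits makes the parallel phase simple: while a batch is being inserted the set of leaves and their pivots never change, so each insertion needs just a lock on the one leaf it appends to.

```python
from concurrent.futures import ThreadPoolExecutor
from threading import Lock

CAPACITY = 4  # hypothetical node capacity, chosen only for illustration


class Leaf:
    """A flat stand-in for an M-tree leaf: a pivot plus stored objects."""

    def __init__(self, pivot):
        self.pivot = pivot
        self.objects = []
        self.lock = Lock()  # guards only this leaf's object list


def dist(a, b):
    """Metric on 1-D points (an assumption; any metric could be used)."""
    return abs(a - b)


def insert_no_split(leaves, obj):
    """Add obj to the nearest leaf, letting the leaf grow past CAPACITY."""
    leaf = min(leaves, key=lambda candidate: dist(candidate.pivot, obj))
    with leaf.lock:
        leaf.objects.append(obj)


def split_overfull(leaves):
    """Sequential phase: split overfull leaves until every leaf fits."""
    result = []
    pending = list(leaves)
    while pending:
        leaf = pending.pop()
        if len(leaf.objects) <= CAPACITY:
            result.append(leaf)
            continue
        # Simplistic median split; a real M-tree uses a pivot-selection policy.
        ordered = sorted(leaf.objects)
        mid = len(ordered) // 2
        for part in (ordered[:mid], ordered[mid:]):
            child = Leaf(pivot=part[len(part) // 2])
            child.objects = part
            pending.append(child)
    return result


def batch_load(leaves, batch, workers=4):
    """Insert a batch in parallel, then resolve the postponed splits."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # The leaf list is read-only here, so no structural locking is needed.
        list(pool.map(lambda obj: insert_no_split(leaves, obj), batch))
    return split_overfull(leaves)
```

Calling `batch_load([Leaf(0.0), Leaf(100.0)], list(range(50)))` returns a list of leaves, each holding at most `CAPACITY` objects. In this sketch the split phase runs sequentially after the batch; how the paper schedules splits and re-insertions is not stated in the abstract and is not modeled here.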