The ND-tree: a dynamic indexing technique for multidimensional non-ordered discrete data spaces

  • Authors:
  • Gang Qian;Qiang Zhu;Qiang Xue;Sakti Pramanik

  • Affiliations:
  • Department of Computer Science and Engineering, Michigan State University, East Lansing, MI;Department of Computer and Information Science, The University of Michigan - Dearborn, Dearborn, MI;Department of Computer Science and Engineering, Michigan State University, East Lansing, MI;Department of Computer Science and Engineering, Michigan State University, East Lansing, MI

  • Venue:
  • VLDB '03 Proceedings of the 29th international conference on Very large data bases - Volume 29
  • Year:
  • 2003

Quantified Score

Hi-index 0.00

Visualization

Abstract

Similarity searches in multidimensional Nonordered Discrete Data Spaces (NDDS) are becoming increasingly important for application areas such as genome sequence databases. Existing indexing methods developed for multidimensional (ordered) Continuous Data Spaces (CDS) such as R-tree cannot be directly applied to an NDDS. This is because some essential geometric concepts/properties such as the minimum bounding region and the area of a region in a CDS are no longer valid in an NDDS. On the other hand, indexing methods based on metric spaces such as M-tree are too general to effectively utilize the data distribution characteristics in an NDDS. Therefore, their retrieval performance is not optimized. To support efficient similarity searches in an NDDS, we propose a new dynamic indexing technique, called the ND-tree. The key idea is to extend the relevant geometric concepts as well as some indexing strategies used in CDSs to NDDSs. Efficient algorithms for ND-tree construction are presented. Our experimental results on synthetic and genomic sequence data demonstrate that the performance of the ND-tree is significantly better than that of the linear scan and M-tree in high dimensional NDDSs.