Dynamic indexing for multidimensional non-ordered discrete data spaces using a data-partitioning approach

Authors:
Gang Qian; Qiang Zhu; Qiang Xue; Sakti Pramanik
Affiliations:
University of Central Oklahoma, Edmond, OK; The University of Michigan-Dearborn, Dearborn, MI; Michigan State University, East Lansing, MI; Michigan State University, East Lansing, MI
Venue:
ACM Transactions on Database Systems (TODS)
Year:
2006

Citing 22
Cited 7

The R*-tree: an efficient and robust access method for points and rectangles

SIGMOD '90 Proceedings of the 1990 ACM SIGMOD international conference on Management of data
Nearest neighbor queries

SIGMOD '95 Proceedings of the 1995 ACM SIGMOD international conference on Management of data
Distance-based indexing for high-dimensional metric spaces

SIGMOD '97 Proceedings of the 1997 ACM SIGMOD international conference on Management of data
The SR-tree: an index structure for high-dimensional nearest neighbor queries

SIGMOD '97 Proceedings of the 1997 ACM SIGMOD international conference on Management of data
The string B-tree: a new data structure for string search in external memory and its applications

Journal of the ACM (JACM)
Prefix B-trees

ACM Transactions on Database Systems (TODS)
Searching in metric spaces

ACM Computing Surveys (CSUR)
The K-D-B-tree: a search structure for large multidimensional dynamic indexes

SIGMOD '81 Proceedings of the 1981 ACM SIGMOD international conference on Management of data
R-trees: a dynamic index structure for spatial searching

SIGMOD '84 Proceedings of the 1984 ACM SIGMOD international conference on Management of data
Fast Indexing and Visualization of Metric Data Sets using Slim-Trees

IEEE Transactions on Knowledge and Data Engineering
Similarity Indexing with the SS-tree

ICDE '96 Proceedings of the Twelfth International Conference on Data Engineering
The LSDh-Tree: An Access Structure for Feature Vectors

ICDE '98 Proceedings of the Fourteenth International Conference on Data Engineering
M-tree: An Efficient Access Method for Similarity Search in Metric Spaces

VLDB '97 Proceedings of the 23rd International Conference on Very Large Data Bases
A Quantitative Analysis and Performance Study for Similarity-Search Methods in High-Dimensional Spaces

VLDB '98 Proceedings of the 24th International Conference on Very Large Data Bases
Content-Based Image Indexing

VLDB '94 Proceedings of the 20th International Conference on Very Large Data Bases
Near Neighbor Search in Large Metric Spaces

VLDB '95 Proceedings of the 21st International Conference on Very Large Data Bases
The X-tree: An Index Structure for High-Dimensional Data

VLDB '96 Proceedings of the 22nd International Conference on Very Large Data Bases
M+-tree: a new dynamical multidimensional index for metric spaces

ADC '03 Proceedings of the 14th Australasian database conference - Volume 17
The Hybrid Tree: An Index Structure for High Dimensional Feature Spaces

ICDE '99 Proceedings of the 15th International Conference on Data Engineering
D-Index: Distance Searching Index for Metric Data Sets

Multimedia Tools and Applications
Efficient similarity search based on data distribution properties in high dimensions

Efficient similarity search based on data distribution properties in high dimensions
A space-partitioning-based indexing method for multidimensional non-ordered discrete data spaces

ACM Transactions on Information Systems (TOIS)

A space-partitioning-based indexing method for multidimensional non-ordered discrete data spaces

ACM Transactions on Information Systems (TOIS)
Space-Partitioning-Based Bulk-Loading for the NSP-Tree in Non-ordered Discrete Data Spaces

DEXA '08 Proceedings of the 19th international conference on Database and Expert Systems Applications
The C-ND tree: a multidimensional index for hybrid continuous and non-ordered discrete data spaces

Proceedings of the 12th International Conference on Extending Database Technology: Advances in Database Technology
Efficient k-nearest neighbor searching in nonordered discrete data spaces

ACM Transactions on Information Systems (TOIS)
Reducing non-determinism of k-NN searching in non-ordered discrete data spaces

Information Processing Letters
RSM-based gossip on P2P network

ICA3PP'07 Proceedings of the 7th international conference on Algorithms and architectures for parallel processing
Bulk-loading the ND-tree in non-ordered discrete data spaces

DASFAA'08 Proceedings of the 13th international conference on Database systems for advanced applications

Abstract

Similarity searches in multidimensional Non-ordered Discrete Data Spaces (NDDS) are becoming increasingly important in application areas such as bioinformatics, biometrics, data mining, and e-commerce. Efficient similarity searches require robust indexing techniques. Unfortunately, existing indexing methods developed for multidimensional (ordered) Continuous Data Spaces (CDS), such as the R-tree, cannot be directly applied to an NDDS, because some essential geometric concepts and properties of a CDS, such as the minimum bounding region and the area of a region, are no longer valid in an NDDS. Other indexing methods based on metric spaces, such as the M-tree and the Slim-tree, are too general to exploit the special characteristics of NDDSs, resulting in suboptimal performance. In this article, we propose a new dynamic data-partitioning-based indexing technique, called the ND-tree, to support efficient similarity searches in an NDDS. The key idea is to extend the relevant geometric concepts, as well as some indexing strategies used in CDSs, to NDDSs. We present efficient algorithms for ND-tree construction, along with techniques for related issues such as handling dimensions with different alphabets in an NDDS. Our experimental results on synthetic data and real genome sequence data demonstrate that the ND-tree outperforms the linear scan, the M-tree, and the Slim-tree for similarity searches in multidimensional NDDSs. A theoretical model is also developed to predict the performance of the ND-tree for random data.
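
The abstract does not spell out how geometric concepts carry over to an NDDS, so the following Python sketch is only an illustration under stated assumptions, not the ND-tree itself. It assumes that similarity is measured by Hamming distance, that a "discrete rectangle" is a per-dimension set of letters, and that its "area" is the product of those set sizes. All function names are hypothetical. It also includes the linear-scan range query that serves as the baseline in the abstract.

```python
from math import prod


def hamming(a, b):
    """Number of dimensions in which two equal-length vectors differ."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    return sum(x != y for x, y in zip(a, b))


def bounding_rectangle(vectors):
    """Assumed discrete analogue of a bounding box: for each dimension,
    the set of letters that occur in that dimension."""
    return [set(column) for column in zip(*vectors)]


def rectangle_area(rect):
    """Assumed discrete analogue of area: how many distinct vectors
    the rectangle covers (product of the per-dimension set sizes)."""
    return prod(len(letters) for letters in rect)


def distance_to_rectangle(query, rect):
    """Lower bound on the Hamming distance from query to any vector
    inside rect: count dimensions whose letter falls outside the set."""
    return sum(letter not in letters for letter, letters in zip(query, rect))


def range_query_linear_scan(data, query, radius):
    """Baseline: compare the query against every vector in the data set."""
    return [v for v in data if hamming(v, query) <= radius]


# Toy genome-like data over the alphabet {A, C, G, T}.
data = ["ACGT", "ACGA", "TCGA", "GGTT"]
rect = bounding_rectangle(data[:3])

print(rectangle_area(rect))                      # 4
print(distance_to_rectangle("GGTT", rect))       # 3
print(range_query_linear_scan(data, "ACGT", 1))  # ['ACGT', 'ACGA']
```

Under these assumptions, a tree index could skip any subtree whose rectangle lies farther from the query than the search radius, because the rectangle distance never exceeds the distance to any vector the rectangle contains.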