Effectiveness of NAQ-tree as index structure for similarity search in high-dimensional metric space

Authors:
Ming Zhang;Reda Alhajj
Affiliations:
University of Calgary, Department of Computer Science, Calgary, AB, Canada;University of Calgary, Department of Computer Science, Calgary, AB, Canada and Global University, Department of Computer Science, Beirut, Lebanon
Venue:
Knowledge and Information Systems
Year:
2010

Abstract

Similarity search (e.g., k-nearest neighbor search) in high-dimensional metric space is a key operation in many applications, including multimedia databases, image retrieval and object recognition. The high dimensionality and large size of the data sets require an index structure to facilitate the search. State-of-the-art index structures are built by partitioning the data set based on distances to one or more reference points, so that a search can be confined to a small number of partitions. However, these methods either ignore the data distribution (e.g., VP-tree and its variants) or produce non-disjoint partitions (e.g., M-tree and its variants, DBM-tree); both shortcomings greatly reduce search efficiency. In this paper, we study the effectiveness of a new index structure, called Nested-Approximate-eQuivalence-class tree (NAQ-tree), which overcomes these disadvantages. NAQ-tree is constructed by recursively dividing the data set into nested approximate equivalence classes. The analysis and the comparative test results reported here demonstrate that NAQ-tree significantly improves search efficiency.
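
The general idea of reference-point partitioning mentioned in the abstract can be illustrated with a minimal sketch. This is not the NAQ-tree construction, which the abstract does not specify in detail. It assumes a single reference point (pivot), Euclidean distance, and fixed ring boundaries, and the names `build_rings` and `range_search` are hypothetical. Points are grouped into disjoint rings by their distance to the pivot, and the triangle inequality is used to confine a range query to the rings that could contain a match.

```python
import math
from bisect import bisect_right


def euclidean(a, b):
    """Euclidean distance; any metric obeying the triangle inequality would do."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_rings(points, pivot, boundaries, dist=euclidean):
    """Partition points into disjoint rings by distance to the pivot.

    Ring i holds points p with boundaries[i-1] <= d(p, pivot) < boundaries[i];
    the first and last rings are open-ended. `boundaries` must be sorted.
    """
    rings = [[] for _ in range(len(boundaries) + 1)]
    for p in points:
        d = dist(p, pivot)
        # Store the precomputed pivot distance alongside the point.
        rings[bisect_right(boundaries, d)].append((d, p))
    return rings


def range_search(rings, pivot, boundaries, query, radius, dist=euclidean):
    """Return (hits, rings_visited) for all points within `radius` of `query`."""
    dq = dist(query, pivot)
    # Triangle inequality: a match p must satisfy |d(p, pivot) - dq| <= radius,
    # so only rings overlapping [dq - radius, dq + radius] need to be scanned.
    lo = bisect_right(boundaries, dq - radius)
    hi = bisect_right(boundaries, dq + radius)
    hits, visited = [], 0
    for ring in rings[lo:hi + 1]:
        visited += 1
        for d, p in ring:
            # Cheap filter on stored distances before the real distance call.
            if abs(d - dq) <= radius and dist(p, query) <= radius:
                hits.append(p)
    return hits, visited


# Illustrative data (assumed, not from the paper).
points = [(0.5, 0.0), (1.5, 0.0), (2.5, 0.0), (3.5, 0.0), (0.0, 1.5)]
pivot = (0.0, 0.0)
boundaries = [1.0, 2.0, 3.0]
rings = build_rings(points, pivot, boundaries)
hits, visited = range_search(rings, pivot, boundaries, (1.4, 0.0), 0.2)
```

In this toy setup the query touches only one of the four rings. Note that the point `(0.0, 1.5)` falls in the same ring as the query's true neighbor even though it is far from the query, which shows how a partitioning based only on distance to one reference point can still leave irrelevant candidates to be filtered out.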