Approximate nearest neighbors: towards removing the curse of dimensionality
STOC '98 Proceedings of the thirtieth annual ACM symposium on Theory of computing
Join synopses for approximate query answering
SIGMOD '99 Proceedings of the 1999 ACM SIGMOD international conference on Management of data
Making B+- trees cache conscious in main memory
SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data
Optimizing multidimensional index trees for main memory access
SIGMOD '01 Proceedings of the 2001 ACM SIGMOD international conference on Management of data
ACM Computing Surveys (CSUR)
Approximate Nearest Neighbor Searching in Multimedia Databases
Proceedings of the 17th International Conference on Data Engineering
VLDB '98 Proceedings of the 24rd International Conference on Very Large Data Bases
The A-tree: An Index Structure for High-Dimensional Spaces Using Relative Approximation
VLDB '00 Proceedings of the 26th International Conference on Very Large Data Bases
Contrast Plots and P-Sphere Trees: Space vs. Time in Nearest Neighbour Searches
VLDB '00 Proceedings of the 26th International Conference on Very Large Data Bases
Local Dimensionality Reduction: A New Approach to Indexing High Dimensional Spaces
VLDB '00 Proceedings of the 26th International Conference on Very Large Data Bases
Indexing the Distance: An Efficient Method to KNN Processing
Proceedings of the 27th International Conference on Very Large Data Bases
Contorting high dimensional data for efficient main memory KNN processing
Proceedings of the 2003 ACM SIGMOD international conference on Management of data
LDC: Enabling Search By Partial Distance In A Hyper-Dimensional Space
ICDE '04 Proceedings of the 20th International Conference on Data Engineering
Adaptive stream filters for entity-based queries with non-value tolerance
VLDB '05 Proceedings of the 31st international conference on Very large data bases
Distributed computation of the knn graph for large high-dimensional point sets
Journal of Parallel and Distributed Computing
Towards solving similarity search problems using fuzzy concept for multi-dimensional data
Proceedings of the 47th Annual Southeast Regional Conference
Towards improving a similarity search approach
Proceedings of the 48th Annual Southeast Regional Conference
Hi-index | 0.00 |
In this paper, we develop a novel index structure to support efficient approximate k-nearest neighbor (KNN) query in high-dimensional databases. In high-dimensional spaces, the computational cost of the distance (e.g., Euclidean distance) between two points contributes a dominant portion of the overall query response time for memory processing. To reduce the distance computation, we first propose a structure (BID) using BIt-Difference to answer approximate KNN query. The BID employs one bit to represent each feature vector of point and the number of bit-difference is used to prune the further points. To facilitate real dataset which is typically skewed, we enhance the BID mechanism with clustering, cluster adapted bitcoder and dimensional weight, named the BID+. Extensive experiments are conducted to show that our proposed method yields significant performance advantages over the existing index structures on both real life and synthetic high-dimensional datasets.