Indexing high-dimensional data for efficient nearest-neighbor search poses interesting research challenges. It is well known that when the data dimensionality is high, the search time of an index can exceed the time required for a linear scan of the entire dataset. To alleviate this curse of dimensionality, indexing schemes such as locality-sensitive hashing (LSH) and M-trees have been proposed to perform approximate searches. In this paper, we propose a hypersphere indexer, named Hydex, to perform such searches. Hydex partitions the data space using concentric hyperspheres and exploits geometric properties of this partitioning to prune the search effectively. Our empirical study shows that, at the same level of search accuracy, Hydex enjoys three advantages over competing schemes. First, it requires fewer seek operations. Second, it maintains sequential disk access most of the time. Third, it requires fewer distance computations.
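The abstract does not spell out how Hydex builds its partitions or which geometric properties it uses, so the following is only a minimal sketch of the general idea of concentric-hypersphere partitioning, not the Hydex algorithm itself. It assumes a single hypothetical center, equal-population shells, Euclidean distance, and exact (not approximate) search; the function names `build_shells` and `nearest` are illustrative. Pruning relies on the reverse triangle inequality: if the query lies at distance r_q from the center and a shell spans radii [lo, hi], then every point in that shell is at least max(0, lo - r_q, r_q - hi) away from the query.

```python
import math
import random


def dist(a, b):
    """Euclidean distance between two equal-length point tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_shells(points, center, num_shells):
    """Sort points by distance to `center` and cut them into concentric shells.

    Each shell is (inner_radius, outer_radius, points). Shells hold roughly
    equal numbers of points; this is an assumption made for the sketch.
    """
    if not points:
        return []
    ranked = sorted((dist(p, center), p) for p in points)
    size = math.ceil(len(ranked) / num_shells)
    shells = []
    for i in range(0, len(ranked), size):
        chunk = ranked[i:i + size]
        shells.append((chunk[0][0], chunk[-1][0], [p for _, p in chunk]))
    return shells


def nearest(shells, center, query):
    """Return (nearest_point, distance, shells_scanned) for `query`."""
    rq = dist(query, center)

    def lower_bound(shell):
        lo, hi, _ = shell
        # Reverse triangle inequality: |d(p, c) - d(q, c)| <= d(p, q).
        return max(0.0, lo - rq, rq - hi)

    best, best_d, scanned = None, math.inf, 0
    # Visit the most promising shells first so the bound tightens quickly.
    for shell in sorted(shells, key=lower_bound):
        if lower_bound(shell) >= best_d:
            break  # every remaining shell is at least this far away
        scanned += 1
        for p in shell[2]:
            d = dist(p, query)
            if d < best_d:
                best, best_d = p, d
    return best, best_d, scanned


if __name__ == "__main__":
    random.seed(1)
    data = [tuple(random.random() for _ in range(4)) for _ in range(1000)]
    center = tuple(sum(c) / len(data) for c in zip(*data))
    shells = build_shells(data, center, 20)
    q = tuple(random.random() for _ in range(4))
    point, d, scanned = nearest(shells, center, q)
    print(f"nearest distance {d:.4f}, scanned {scanned} of {len(shells)} shells")
```

Storing each shell contiguously is what would make a scan of a shell a sequential read, and skipping shells whose lower bound exceeds the current best distance is what would save distance computations. How much this sketch actually prunes depends heavily on the data: in high dimensions, distances to a single center tend to concentrate, which is presumably why a practical scheme needs more than this one bound.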