Hypersphere indexer

Authors:
Navneet Panda; Edward Y. Chang; Arun Qamra
Affiliations:
University of California, Santa Barbara, CA; University of California, Santa Barbara, CA; University of California, Santa Barbara, CA
Venue:
DEXA'06 Proceedings of the 17th international conference on Database and Expert Systems Applications
Year:
2006

Abstract

Indexing high-dimensional data for efficient nearest-neighbor search poses interesting research challenges. It is well known that when the data dimensionality is high, the search time can exceed the time required to perform a linear scan of the entire dataset. To alleviate this curse of dimensionality, indexing schemes such as locality-sensitive hashing (LSH) and M-trees have been proposed to perform approximate searches. In this paper, we propose a hypersphere indexer, named Hydex, to perform such searches. Hydex partitions the data space using concentric hyperspheres. By exploiting geometric properties, Hydex can perform effective pruning. Our empirical study shows that, for the same level of search accuracy, Hydex enjoys three advantages over competing schemes. First, it requires fewer seek operations. Second, it can maintain sequential disk accesses most of the time. Third, it requires fewer distance computations.
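
The abstract does not spell out how Hydex builds its hyperspheres or which geometric properties it uses for pruning, so the following Python sketch is an illustration of the general idea only, not the authors' method. It assumes a single fixed center, shells of equal width, and pruning by the triangle inequality; the names `build_shells`, `nearest`, `center`, and `width` are hypothetical. It also runs an exact in-memory search, whereas the paper targets approximate search on disk.

```python
import math

def build_shells(points, center, width):
    """Group point indices into concentric shells around `center`.

    Shell k holds the points whose distance to the center lies in
    [k * width, (k + 1) * width). Equal-width shells are an assumption
    made for this sketch.
    """
    shells = {}
    for i, p in enumerate(points):
        k = int(math.dist(p, center) // width)
        shells.setdefault(k, []).append(i)
    return shells

def nearest(points, shells, center, width, q):
    """Return (index, distance, evaluated) for the nearest neighbor of q.

    `evaluated` counts how many point-to-query distances were computed,
    to make the effect of pruning visible.
    """
    dq = math.dist(q, center)
    home = int(dq // width)
    best_i, best_d, evaluated = None, math.inf, 0
    # Visit the shell containing the query first, then move outward.
    for k in sorted(shells, key=lambda s: abs(s - home)):
        lo, hi = k * width, (k + 1) * width
        # Triangle inequality: every point p in shell k satisfies
        # dist(q, p) >= |dq - dist(p, center)|, which is at least `bound`.
        if dq < lo:
            bound = lo - dq
        elif dq >= hi:
            bound = dq - hi
        else:
            bound = 0.0
        if bound >= best_d:
            continue  # no point in this shell can beat the current best
        for i in shells[k]:
            d = math.dist(points[i], q)
            evaluated += 1
            if d < best_d:
                best_i, best_d = i, d
    return best_i, best_d, evaluated

# Small usage example with made-up 2-D data.
points = [(0.5, 0.0), (1.5, 0.0), (2.5, 0.0), (3.5, 0.0), (0.0, 9.5)]
center = (0.0, 0.0)
shells = build_shells(points, center, width=1.0)
index, distance, evaluated = nearest(points, shells, center, 1.0, (3.4, 0.0))
print(index, distance, evaluated)
```

Storing each shell contiguously is one plausible way a scheme like this could keep disk reads sequential, but that connection is an assumption here; the abstract states the benefit without describing the layout.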