In this paper, we present a novel index structure, called the Δ-tree, to speed up the processing of high-dimensional K-nearest neighbor (KNN) queries in main-memory environments. The Δ-tree is a multi-level structure in which each level represents the data space at a different dimensionality: the number of dimensions increases towards the leaf level, which contains the data at their full dimensionality. The reduced dimensions are obtained using Principal Component Analysis (PCA), which has the desirable property that the first few dimensions capture most of the information in the dataset. Each level of the tree serves to prune the search space more efficiently, as the reduced dimensions can better exploit the small cache line size. Moreover, distance computation at lower dimensionality is less expensive. We also propose an extension, called the Δ+-tree, that globally clusters the data space and then further partitions the clusters into small regions to reduce the search space. We conducted extensive experiments to evaluate the proposed structures against existing techniques on different kinds of datasets. Our results show that the Δ+-tree is superior in most cases.
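The pruning idea in the abstract rests on a general property: after projecting onto an orthonormal PCA basis, the distance computed over the first few coordinates never exceeds the full-dimensional distance, so it can safely discard candidates early. The following is a minimal sketch of that property only, not the Δ-tree itself: it uses a flat scan with no tree levels or clustering, and the function names (`pca_basis`, `project`, `knn_prefix_filter`), the power-iteration PCA, and the prefix lengths `[2, 4, 8]` are all assumptions made for illustration.

```python
import heapq
import math
import random


def _orthonormalize(v, basis):
    """Remove the components of v along each basis vector, then scale to unit length."""
    for u in basis:
        dot = sum(a * b for a, b in zip(v, u))
        v = [a - dot * b for a, b in zip(v, u)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def pca_basis(points, iters=100, seed=0):
    """Return (mean, components): an orthonormal basis ordered by roughly
    decreasing variance, found by power iteration on the covariance matrix."""
    n, d = len(points), len(points[0])
    mean = [sum(p[j] for p in points) / n for j in range(d)]
    centered = [[x - m for x, m in zip(p, mean)] for p in points]
    cov = [[sum(c[a] * c[b] for c in centered) / n for b in range(d)]
           for a in range(d)]
    rng = random.Random(seed)
    comps = []
    for _ in range(d):
        v = _orthonormalize([rng.gauss(0, 1) for _ in range(d)], comps)
        for _step in range(iters):
            w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
            if math.sqrt(sum(x * x for x in w)) < 1e-12:
                break  # no variance left in this direction; keep v as is
            v = _orthonormalize(w, comps)
        comps.append(v)
    return mean, comps


def project(point, mean, comps):
    """Express a point in the PCA basis (all d coordinates are kept)."""
    centered = [x - m for x, m in zip(point, mean)]
    return [sum(a * b for a, b in zip(centered, u)) for u in comps]


def knn_prefix_filter(proj_points, proj_query, k, levels):
    """Exact KNN by flat scan with prefix-distance pruning.

    `levels` is an increasing list of prefix lengths ending at the full
    dimensionality (hypothetical values). Returns (neighbors, survivors),
    where neighbors is a list of (distance, index) sorted by distance and
    survivors counts candidates that were never pruned.
    """
    heap = []  # max-heap of the best k so far, stored as (-squared_distance, index)
    survivors = 0
    for idx, p in enumerate(proj_points):
        bound = -heap[0][0] if len(heap) == k else math.inf
        dist2, start, pruned = 0.0, 0, False
        for end in levels:
            for j in range(start, end):
                diff = p[j] - proj_query[j]
                dist2 += diff * diff
            start = end
            if dist2 > bound:  # the partial sum is a lower bound on the full distance
                pruned = True
                break
        if pruned:
            continue
        survivors += 1
        if len(heap) < k:
            heapq.heappush(heap, (-dist2, idx))
        else:
            heapq.heapreplace(heap, (-dist2, idx))
    neighbors = sorted((math.sqrt(-neg), idx) for neg, idx in heap)
    return neighbors, survivors


if __name__ == "__main__":
    rng = random.Random(42)
    # Synthetic 8-d data driven by 2 latent factors, so that the first
    # principal components carry most of the variance (an assumed setup).
    data = []
    for _ in range(500):
        a, b = rng.gauss(0, 1), rng.gauss(0, 1)
        data.append([a * (j + 1) + b * (8 - j) + rng.gauss(0, 0.1)
                     for j in range(8)])
    mean, comps = pca_basis(data)
    projected = [project(p, mean, comps) for p in data]
    query = project(data[0], mean, comps)
    print(knn_prefix_filter(projected, query, k=3, levels=[2, 4, 8]))
```

Because the squared distance is accumulated coordinate by coordinate, every prefix sum is at most the final sum, so pruning never drops a true neighbor. How many candidates are pruned at the low-dimensional prefixes depends on how much variance the leading components capture in the dataset at hand.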