Distance-based indexing for high-dimensional metric spaces
SIGMOD '97 Proceedings of the 1997 ACM SIGMOD international conference on Management of data
The SR-tree: an index structure for high-dimensional nearest neighbor queries
SIGMOD '97 Proceedings of the 1997 ACM SIGMOD international conference on Management of data
Optimal multi-step k-nearest neighbor search
SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data
Data structures and algorithms for nearest neighbor search in general metric spaces
SODA '93 Proceedings of the fourth annual ACM-SIAM Symposium on Discrete algorithms
The Grid File: An Adaptable, Symmetric Multikey File Structure
ACM Transactions on Database Systems (TODS)
Distance browsing in spatial databases
ACM Transactions on Database Systems (TODS)
Multidimensional binary search trees used for associative searching
Communications of the ACM
The Earth Mover's Distance as a Metric for Image Retrieval
International Journal of Computer Vision
The K-D-B-tree: a search structure for large multidimensional dynamic indexes
SIGMOD '81 Proceedings of the 1981 ACM SIGMOD international conference on Management of data
Searching in metric spaces with user-defined and approximate distances
ACM Transactions on Database Systems (TODS)
Efficient Retrieval of Similar Time Sequences Under Time Warping
ICDE '98 Proceedings of the Fourteenth International Conference on Data Engineering
Similarity Indexing with the SS-tree
ICDE '96 Proceedings of the Twelfth International Conference on Data Engineering
Similarity Search without Tears: The OMNI Family of All-purpose Access Methods
Proceedings of the 17th International Conference on Data Engineering
M-tree: An Efficient Access Method for Similarity Search in Metric Spaces
VLDB '97 Proceedings of the 23rd International Conference on Very Large Data Bases
Fast Nearest Neighbor Search in Medical Image Databases
VLDB '96 Proceedings of the 22nd International Conference on Very Large Data Bases
The X-tree: An Index Structure for High-Dimensional Data
VLDB '96 Proceedings of the 22nd International Conference on Very Large Data Bases
An Index-Based Approach for Similarity Search Supporting Time Warping in Large Sequence Databases
Proceedings of the 17th International Conference on Data Engineering
A Metric for Distributions with Applications to Image Databases
ICCV '98 Proceedings of the Sixth International Conference on Computer Vision
Distinctive Image Features from Scale-Invariant Keypoints
International Journal of Computer Vision
Indexing High-Dimensional Data for Efficient In-Memory Similarity Search
IEEE Transactions on Knowledge and Data Engineering
iDistance: An adaptive B+-tree based indexing method for nearest neighbor search
ACM Transactions on Database Systems (TODS)
Approximation Techniques for Indexing the Earth Mover's Distance in Multimedia Databases
ICDE '06 Proceedings of the 22nd International Conference on Data Engineering
Reference-based indexing of sequence databases
VLDB '06 Proceedings of the 32nd international conference on Very large data bases
Earth mover distance over high-dimensional spaces
Proceedings of the nineteenth annual ACM-SIAM symposium on Discrete algorithms
Optimal incremental multi-step nearest-neighbor search
Proceedings of the 16th ACM SIGSPATIAL international conference on Advances in geographic information systems
Generalizing the optimality of multi-step k-nearest neighbor query processing
SSTD'07 Proceedings of the 10th international conference on Advances in spatial and temporal databases
The development of techniques that facilitate effective similarity search is important for many applications, including multimedia databases, content-based image retrieval, molecular biology, medical imaging, and object recognition. Two common operations in this context are range queries and k-nearest neighbor search in high-dimensional space. However, the distance measures used to determine the dissimilarity between high-dimensional feature vectors are often expensive to compute. To reduce the number of expensive distance calculations in the search process, Korn, Sidiropoulos, Faloutsos, Siegel, and Protopapas (1996) proposed a multi-step algorithm with two stages: filtering and refinement. It uses an easily computable lower-bound distance measure to select a candidate set in the filtering stage and confines the expensive distance computation to that small candidate set in the refinement stage. Seidl and Kriegel (1998) later improved this algorithm so that the filtering stage produces a candidate set of optimal size; the improved algorithm is said to be filtering optimal. However, the improved algorithm cannot produce results incrementally in the refinement stage: it can start to output results only after the whole search process stops, which is a disadvantage in real applications. In this paper, we experimentally demonstrate the applicability and effectiveness of an extended version of the algorithm that produces the nearest neighbors incrementally and optimally, in the sense that a nearest neighbor is output as soon as it can be determined from the existing information; thus, nearest neighbors are produced in order. Our algorithm is both filtering optimal and refinement optimal, and it serves real applications well.
We have already proved the optimality of the proposed extended algorithm (Zhang, Alhajj, & Rokne, 2008); here we empirically demonstrate its independence of the number of nearest neighbors and its effectiveness in retrieving results early compared with the previous algorithm.
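
The incremental filter-and-refine idea described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `incremental_multistep_nn`, the in-memory heap standing in for an index-based ranking, and the toy distance functions are all assumptions made for illustration. The only property relied on is that the cheap measure never exceeds the exact distance. A refined candidate is output as soon as its exact distance is no larger than the lower bound of every object not yet refined.

```python
import heapq
import math
from itertools import islice


def incremental_multistep_nn(objects, query, lower_bound, exact_dist):
    """Yield (exact distance, object) pairs in ascending exact distance.

    Hypothetical helper for illustration. Assumes that
    lower_bound(q, o) <= exact_dist(q, o) holds for every object o.
    """
    # Filtering stage: rank objects by the cheap lower bound. A real system
    # would obtain this ranking incrementally from an index; a heap stands in.
    ranking = [(lower_bound(query, o), i) for i, o in enumerate(objects)]
    heapq.heapify(ranking)
    refined = []  # min-heap of (exact distance, index) for refined candidates

    while ranking:
        lb, i = heapq.heappop(ranking)
        # Every object not yet refined has a lower bound >= lb, hence an exact
        # distance >= lb, so refined candidates within lb are final answers.
        while refined and refined[0][0] <= lb:
            d, j = heapq.heappop(refined)
            yield d, objects[j]
        # Refinement stage: pay for the expensive distance only now.
        heapq.heappush(refined, (exact_dist(query, objects[i]), i))

    # The ranking is exhausted: the remaining candidates come out in order.
    while refined:
        d, j = heapq.heappop(refined)
        yield d, objects[j]


# Toy setup (assumed): Euclidean distance plays the "expensive" measure and
# the difference in the first coordinate is a valid lower bound for it.
stats = {"exact_calls": 0}


def exact(q, o):
    stats["exact_calls"] += 1
    return math.dist(q, o)


def first_coordinate_gap(q, o):
    return abs(q[0] - o[0])


points = [(0.0, 1.0), (5.0, 0.0), (1.0, 1.0), (9.0, 9.0), (0.5, 3.0)]
query = (0.0, 0.0)

# Because the function is a generator, the caller decides when to stop.
first_two = list(islice(
    incremental_multistep_nn(points, query, first_coordinate_gap, exact), 2))
print([obj for _, obj in first_two], stats["exact_calls"])
```

In this toy run the two nearest points are reported after three exact distance computations rather than five. The caller does not pass k in advance; it simply stops consuming the generator, which is the sense in which the search is incremental.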