A cost model for nearest neighbor search in high-dimensional data space
PODS '97 Proceedings of the sixteenth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems
We propose two new tree-based search algorithms for vector quantizers that use an additively weighted distance measure, such as ECVQ (Chou, 1989). Both algorithms are based on a recursive space division technique and use a bounding object at each node of the tree to quickly eliminate subsets of the codebook during the search. The structure is more general than the k-d tree, and the algorithm performs an optimal search similar to the one analyzed in (Berchtold, 1997). We prove a theorem that defines the necessary and sufficient condition for any set of points to be a valid bounding object, i.e., to define a lossless pruning rule for the additively weighted Euclidean distance. The first algorithm uses rectangles as bounding objects, and the second uses spheres. We experimentally compare our approach with another recent one (Johnson, 1996) and show that the new algorithm using bounding rectangles performs significantly better for medium- and high-bitrate coding (0.4 bits/sample) of a Gaussian process. This algorithm uses approximately 29 times fewer multiplications than a full codebook search at 1 bit/sample.
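To make the idea of a lossless pruning rule concrete, the following is a minimal sketch of a branch-and-bound search with bounding rectangles. It is not the algorithm from the paper. It assumes the weighted distance has the form ||x - c||^2 + w, takes each node's rectangle to be the bounding box of its codewords, and prunes a node when the squared distance from the query to the box plus the smallest weight in the node cannot beat the best cost found so far. The names `Node`, `tree_search`, `full_search`, the median split, and `leaf_size` are all hypothetical choices made for illustration.

```python
import random

def cost(x, c, w):
    """Additively weighted squared Euclidean distance: ||x - c||^2 + w."""
    return sum((a - b) ** 2 for a, b in zip(x, c)) + w

class Node:
    """Tree node holding a bounding rectangle and the smallest weight below it."""

    def __init__(self, idx, codebook, weights, leaf_size=4):
        self.idx = idx
        dims = range(len(codebook[0]))
        self.lo = [min(codebook[i][d] for i in idx) for d in dims]
        self.hi = [max(codebook[i][d] for i in idx) for d in dims]
        self.wmin = min(weights[i] for i in idx)
        self.children = []
        if len(idx) > leaf_size:
            # Assumed split rule: median along the widest side of the rectangle.
            d = max(dims, key=lambda k: self.hi[k] - self.lo[k])
            order = sorted(idx, key=lambda i: codebook[i][d])
            mid = len(order) // 2
            self.children = [Node(order[:mid], codebook, weights, leaf_size),
                             Node(order[mid:], codebook, weights, leaf_size)]

    def lower_bound(self, x):
        """No codeword in this node can cost less than this value."""
        gap2 = sum(max(lo - a, 0.0, a - hi) ** 2
                   for a, lo, hi in zip(x, self.lo, self.hi))
        return gap2 + self.wmin

def tree_search(root, x, codebook, weights):
    """Return (best index, best cost, number of codewords evaluated)."""
    best_i, best_cost, evaluated = -1, float("inf"), 0
    stack = [root]
    while stack:
        node = stack.pop()
        if node.lower_bound(x) >= best_cost:
            continue  # lossless pruning: nothing here can beat the current best
        if node.children:
            # Push the less promising child first so the better one is popped next.
            stack.extend(sorted(node.children,
                                key=lambda n: n.lower_bound(x), reverse=True))
        else:
            for i in node.idx:
                evaluated += 1
                c = cost(x, codebook[i], weights[i])
                if c < best_cost:
                    best_i, best_cost = i, c
    return best_i, best_cost, evaluated

def full_search(x, codebook, weights):
    """Exhaustive reference search; returns (best cost, best index)."""
    return min((cost(x, c, w), i)
               for i, (c, w) in enumerate(zip(codebook, weights)))

# Usage on synthetic data (Gaussian codewords, arbitrary nonnegative weights).
rng = random.Random(0)
codebook = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(256)]
weights = [rng.uniform(0, 1) for _ in range(256)]
root = Node(list(range(256)), codebook, weights)
query = [rng.gauss(0, 1) for _ in range(4)]
index, best, evaluated = tree_search(root, query, codebook, weights)
```

Because the bound never exceeds the true cost of any codeword in the node, the pruned search returns the same minimum cost as the exhaustive search while evaluating at most as many codewords. How much work is saved depends on the codebook, the weights, and the dimension.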