Trading Quality for Time with Nearest Neighbor Search

Authors:
Roger Weber;Klemens Böhm
Venue:
EDBT '00 Proceedings of the 7th International Conference on Extending Database Technology: Advances in Database Technology
Year:
2000

Abstract

In many situations, users would readily accept an approximate query result if the query were evaluated faster. In this article, we investigate approximate evaluation techniques based on the VA-File for nearest-neighbor search (NN-search). The VA-File contains approximations of feature points. These approximations frequently suffice to eliminate the vast majority of points in a first phase. A second phase then identifies the nearest neighbor by computing the exact distances of all remaining points. To develop approximate query-evaluation techniques, we proceed in two steps. First, we derive an analytic model for VA-File-based NN-search in order to investigate the relationship between approximation granularity, effectiveness of the filtering step, and search performance. In more detail, we develop formulae for the distribution of the error of the bounds and for the duration of the different phases of query evaluation. Second, based on these results, we develop different approximate query-evaluation techniques: the first adapts the bounds to filter more rigidly, and the second skips the computation of exact distances. Experiments show that these techniques have the desired effect: for instance, when allowing for a small but specific reduction of result quality, we observed a speedup by a factor of 7 in 50-NN search.
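The two-phase search and the two approximate variants described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not specify the quantization scheme, the way the bounds are adapted, or the ranking used when exact distances are skipped, so the following are assumptions made only for illustration: a uniform grid with `bits` bits per dimension over [0, 1), a hypothetical parameter `alpha` that shrinks each bound interval from both ends, and ranking by lower bound when `skip_exact` is set. All function and parameter names are hypothetical.

```python
import heapq
import math


def build_va_file(points, bits=4):
    """Approximate each point by the grid cell that contains it.

    Assumes coordinates in [0, 1) and a uniform grid (an illustrative choice).
    """
    cells = 1 << bits
    return [tuple(min(int(x * cells), cells - 1) for x in p) for p in points]


def cell_bounds(query, approx, bits=4):
    """Lower and upper bound on the Euclidean distance to any point in a cell."""
    width = 1.0 / (1 << bits)
    lo2 = hi2 = 0.0
    for q, c in zip(query, approx):
        left, right = c * width, (c + 1) * width
        near = left - q if q < left else q - right if q > right else 0.0
        far = max(abs(q - left), abs(q - right))
        lo2 += near * near
        hi2 += far * far
    return math.sqrt(lo2), math.sqrt(hi2)


def va_knn(query, points, va, k, bits=4, alpha=0.0, skip_exact=False):
    """k-NN search over a VA-file style approximation (requires k <= len(points)).

    alpha=0.0 and skip_exact=False give an exact search.
    alpha in (0, 0.5] tightens both bounds (hypothetical adaptation rule),
    which filters more points but may lose true neighbors.
    skip_exact=True returns the k candidates with the smallest lower bounds
    without visiting the exact points (hypothetical ranking rule).
    """
    scored = []
    for idx, approx in enumerate(va):
        lo, hi = cell_bounds(query, approx, bits)
        gap = hi - lo
        scored.append((lo + alpha * gap, hi - alpha * gap, idx))

    # Phase 1: drop every point whose lower bound exceeds the k-th smallest upper bound.
    kth_upper = heapq.nsmallest(k, (hi for _, hi, _ in scored))[-1]
    candidates = sorted((lo, idx) for lo, _, idx in scored if lo <= kth_upper)
    if skip_exact:
        return [idx for _, idx in candidates[:k]]

    # Phase 2: visit candidates by increasing lower bound and compute exact distances.
    best = []  # max-heap of the k best so far, stored as (-distance, index)
    for lo, idx in candidates:
        if len(best) == k and lo > -best[0][0]:
            break  # no remaining candidate can beat the current k-th distance
        d = math.dist(query, points[idx])
        if len(best) < k:
            heapq.heappush(best, (-d, idx))
        elif d < -best[0][0]:
            heapq.heapreplace(best, (-d, idx))
    return [idx for _, idx in sorted((-nd, idx) for nd, idx in best)]
```

With `alpha=0.0` the filter is lossless, because every true neighbor has a lower bound no larger than the k-th smallest upper bound. Raising `alpha` or setting `skip_exact=True` trades result quality for less work, which is the kind of trade-off the abstract refers to; the speedup figures reported there come from the authors' own techniques and experiments, not from this sketch.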