In many situations, users would readily accept an approximate query result if the query can be evaluated faster. In this article, we investigate approximate query-evaluation techniques for nearest-neighbor search (NN-search) based on the VA-File. The VA-File contains approximations of feature points. In a first phase, these approximations frequently suffice to eliminate the vast majority of points. A second phase then identifies the nearest neighbors by computing the exact distances of all remaining points. We develop the approximate techniques in two steps. First, we derive an analytic model for VA-File-based NN-search in order to investigate the relationship between approximation granularity, effectiveness of the filtering step, and search performance. Specifically, we develop formulae for the distribution of the error of the bounds and for the duration of the different phases of query evaluation. Second, based on these results, we develop two approximate query-evaluation techniques: the first adapts the bounds to obtain more rigorous filtering, and the second skips the computation of exact distances altogether. Experiments show that these techniques have the desired effect: for instance, when allowing for a small but specific reduction of result quality, we observed a speedup by a factor of 7 in 50-NN search.
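The two-phase search and the two approximate variants described above can be illustrated with a minimal sketch. It is not the authors' implementation: the uniform grid over the unit cube, the names `approximate`, `bounds` and `va_knn`, and the parameters `BITS`, `shrink` and `skip_exact` are all assumptions made for illustration. In particular, the rule used to tighten the bounds (moving the lower bound a fraction `shrink` of the way toward the upper bound) is a stand-in, not the formula derived in the article.

```python
import heapq
import math

BITS = 3            # assumed bits per dimension; sets the approximation granularity
CELLS = 1 << BITS   # grid cells per dimension over the unit interval


def approximate(point):
    """Quantize each coordinate in [0, 1] to the index of its grid cell."""
    return tuple(min(int(x * CELLS), CELLS - 1) for x in point)


def bounds(query, approx):
    """Lower and upper bound on the Euclidean distance from query to the cell."""
    lo2 = hi2 = 0.0
    for q, c in zip(query, approx):
        left, right = c / CELLS, (c + 1) / CELLS
        near = left - q if q < left else (q - right if q > right else 0.0)
        far = max(abs(q - left), abs(q - right))
        lo2 += near * near
        hi2 += far * far
    return math.sqrt(lo2), math.sqrt(hi2)


def dist(a, b):
    """Exact Euclidean distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def va_knn(points, approxs, query, k, shrink=0.0, skip_exact=False):
    """Return indices of (approximate) k nearest neighbors, nearest first.

    shrink=0.0 and skip_exact=False give the exact two-phase search.
    shrink in (0, 1] raises every lower bound toward its upper bound, so
    phase 1 filters more points (hypothetical adaptation rule).
    skip_exact=True ranks candidates by their bound and never reads points.
    """
    # Phase 1: scan the approximations only and filter by the bounds.
    all_bounds = [bounds(query, a) for a in approxs]
    kth_upper = heapq.nsmallest(k, (hi for _, hi in all_bounds))[-1]
    candidates = []
    for i, (lo, hi) in enumerate(all_bounds):
        lo_adj = lo + shrink * (hi - lo)
        if lo_adj <= kth_upper:
            candidates.append((lo_adj, i))
    candidates.sort()

    if skip_exact:
        return [i for _, i in candidates[:k]]

    # Phase 2: visit candidates by increasing lower bound, compute exact
    # distances, and stop once no remaining candidate can improve the result.
    best = []  # max-heap of the k best so far, stored as (-distance, index)
    for lo_adj, i in candidates:
        if len(best) == k and lo_adj > -best[0][0]:
            break
        d = dist(points[i], query)
        if len(best) < k:
            heapq.heappush(best, (-d, i))
        elif d < -best[0][0]:
            heapq.heapreplace(best, (-d, i))
    return [i for _, i in sorted((-nd, i) for nd, i in best)]
```

With the default arguments the function returns the same neighbors as a linear scan over the exact points, because every true neighbor has a lower bound no larger than the k-th smallest upper bound. Setting `shrink` above zero or `skip_exact=True` trades result quality for fewer exact distance computations, which is the kind of trade-off the abstract describes.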