The Active Vertice method: a performant filtering approach to high-dimensional indexing

Authors:
Sören Balko; Ingo Schmitt; Gunter Saake
Affiliations:
Department of Computer Science, University of Magdeburg, Universitätsplatz 2, D-39106 Magdeburg, Germany (all authors)
Venue:
Data & Knowledge Engineering
Year:
2004

Abstract

The problem of finding nearest neighbors has emerged as an important foundation of feature-based similarity search in multimedia databases. Most spatial index structures based on the R-tree fail to support nearest neighbor search efficiently in arbitrarily distributed high-dimensional data sets. In contrast, the so-called filtering principle, as represented by the popular VA-file, has turned out to be a more promising approach. Query processing is based on a flat file of compact vector approximations. In a first stage, these approximations are sequentially scanned and filtered, so that in a second stage the nearest neighbors can be determined from a relatively small fraction of the data set.

In this paper, we propose the Active Vertice method as a novel filtering approach. As opposed to the VA-file, approximation regions are arranged in a quad-tree-like structure. High-dimensional feature vectors are assigned to ellipsoidal approximation regions on different levels of the tree. The compact approximation of a vector corresponds to the path within the index from the root to the respective tree node. Compared to the VA-file, our method enhances the discriminatory power of the approximations while maintaining their compactness in terms of storage consumption. To demonstrate its effectiveness, we conduct extensive experiments with synthetic as well as real-life data and show the superiority of our method over existing filtering approaches.
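
The two-stage filtering principle mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the Active Vertice method and not a faithful VA-file implementation: it assumes vectors in the unit cube, a uniform grid with the same number of bits per dimension, and Euclidean distance. The function names (approximate, bounds, knn_filter_refine) and all parameters are hypothetical and chosen only for this illustration.

```python
import heapq
import math
import random


def approximate(vec, bits):
    """Map each coordinate in [0, 1) to a grid cell index (assumed uniform grid)."""
    cells = 1 << bits
    return tuple(min(int(x * cells), cells - 1) for x in vec)


def bounds(query, approx, bits):
    """Lower and upper bound on the distance from query to any point in the cell."""
    width = 1.0 / (1 << bits)
    lo_sq = hi_sq = 0.0
    for q, c in zip(query, approx):
        low, high = c * width, (c + 1) * width
        if q < low:
            near = low - q
        elif q > high:
            near = q - high
        else:
            near = 0.0
        far = max(abs(q - low), abs(q - high))
        lo_sq += near * near
        hi_sq += far * far
    return math.sqrt(lo_sq), math.sqrt(hi_sq)


def knn_filter_refine(query, vectors, approxs, bits, k):
    """Return the k nearest (distance, index) pairs and the candidate count.

    Assumes 1 <= k <= len(vectors).
    """
    # Stage 1 (filter): scan only the compact approximations.
    scored = [(*bounds(query, a, bits), i) for i, a in enumerate(approxs)]
    # At least k vectors lie within the k-th smallest upper bound, so any
    # vector whose lower bound exceeds it cannot be among the k nearest.
    kth_upper = heapq.nsmallest(k, (hi for _, hi, _ in scored))[-1]
    candidates = sorted((lo, i) for lo, _, i in scored if lo <= kth_upper)

    # Stage 2 (refine): read exact vectors in order of increasing lower bound.
    best = []  # max-heap of the k best so far, stored as (-distance, index)
    for lo, i in candidates:
        if len(best) == k and lo > -best[0][0]:
            break  # no remaining candidate can improve the result
        d = math.dist(query, vectors[i])
        if len(best) < k:
            heapq.heappush(best, (-d, i))
        elif d < -best[0][0]:
            heapq.heapreplace(best, (-d, i))
    return sorted((-nd, i) for nd, i in best), len(candidates)


# Usage with synthetic data: 500 random 8-dimensional vectors, 4 bits per dimension.
rng = random.Random(0)
data = [[rng.random() for _ in range(8)] for _ in range(500)]
approxs = [approximate(v, 4) for v in data]
q = [rng.random() for _ in range(8)]
result, n_cand = knn_filter_refine(q, data, approxs, 4, 5)
```

In this sketch the approximations play the role of the flat file that is scanned sequentially, and n_cand reports how many vectors survive the filter stage. How small that fraction becomes depends on the data distribution, the dimensionality, and the number of bits per dimension.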