Multiple Similarity Queries: A Basic DBMS Operation for Mining in Metric Databases

Authors:
Bernhard Braunmüller;Martin Ester;Hans-Peter Kriegel;Jörg Sander
Affiliations:
-;-;-;-
Venue:
IEEE Transactions on Knowledge and Data Engineering
Year:
2001

Citing 23
Cited 6

The design and analysis of spatial data structures

The design and analysis of spatial data structures
The R*-tree: an efficient and robust access method for points and rectangles

SIGMOD '90 Proceedings of the 1990 ACM SIGMOD international conference on Management of data
Numerical recipes in C (2nd ed.): the art of scientific computing

Numerical recipes in C (2nd ed.): the art of scientific computing
Efficient and effective querying by image content

Journal of Intelligent Information Systems - Special issue: advances in visual information management systems
Fast parallel similarity search in multimedia databases

SIGMOD '97 Proceedings of the 1997 ACM SIGMOD international conference on Management of data
From data mining to knowledge discovery: an overview

Advances in knowledge discovery and data mining
A cost model for nearest neighbor search in high-dimensional data space

PODS '97 Proceedings of the sixteenth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems
Cost-based optimization of decision support queries using transient-views

SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data
Multidimensional access methods

ACM Computing Surveys (CSUR)
Efficient and extensible algorithms for multi query optimization

SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data
Machine Learning

Machine Learning
R-trees: a dynamic index structure for spatial searching

SIGMOD '84 Proceedings of the 1984 ACM SIGMOD international conference on Management of data
Finding Aggregate Proximity Relationships and Commonalities in Spatial Data Mining

IEEE Transactions on Knowledge and Data Engineering
Efficient Color Histogram Indexing for Quadratic Form Distance Functions

IEEE Transactions on Pattern Analysis and Machine Intelligence
M-tree: An Efficient Access Method for Similarity Search in Metric Spaces

VLDB '97 Proceedings of the 23rd International Conference on Very Large Data Bases
A Quantitative Analysis and Performance Study for Similarity-Search Methods in High-Dimensional Spaces

VLDB '98 Proceedings of the 24rd International Conference on Very Large Data Bases
The X-tree: An Index Structure for High-Dimensional Data

VLDB '96 Proceedings of the 22th International Conference on Very Large Data Bases
Efficient User-Adaptable Similarity Search in Large Multimedia Databases

VLDB '97 Proceedings of the 23rd International Conference on Very Large Data Bases
Database Issues for Data Visualization: Supporting Interactive Database Explorartion

Proceedings of the IEEE Visualization '95 Workshop on Database Issues for Data Visualization
Discovery of Spatial Association Rules in Geographic Information Databases

SSD '95 Proceedings of the 4th International Symposium on Advances in Spatial Databases
Ranking in Spatial Databases

SSD '95 Proceedings of the 4th International Symposium on Advances in Spatial Databases
Efficiently Supporting Multiple Similarity Queries for Mining in Metric Databases

ICDE '00 Proceedings of the 16th International Conference on Data Engineering
Efficient Query Refinement in Multimedia Databases

ICDE '00 Proceedings of the 16th International Conference on Data Engineering

Document Image Recognition Based on Template Matching of Component Block Projections

IEEE Transactions on Pattern Analysis and Machine Intelligence
Robust Identification of Fuzzy Duplicates

ICDE '05 Proceedings of the 21st International Conference on Data Engineering
Distributed computation of the knn graph for large high-dimensional point sets

Journal of Parallel and Distributed Computing
A Query-Driven Approach to the Design and Management of Flexible Database Systems

Journal of Management Information Systems
The use of semantic-based predicates implication to improve horizontal multimedia database fragmentation

Workshop on multimedia information retrieval on The many faces of multimedia semantics
Towards multimedia fragmentation

ADBIS'06 Proceedings of the 10th East European conference on Advances in Databases and Information Systems

Quantified Score

Hi-index	0.00

Visualization

Abstract

Metric databases are databases where a metric distance function is defined for pairs of database objects. In such databases, similarity queries in the form of range queries or k-nearest-neighbor queries are the most important query types. In traditional query processing, single queries are issued independently by different users. In many data mining applications, however, the database is typically explored by iteratively asking similarity queries for answers of previous similarity queries. In this paper, we introduce a generic scheme for such data mining algorithms and we investigate two orthogonal approaches, reducing I/O cost as well as CPU cost, to speed-up the processing of multiple similarity queries. The proposed techniques apply to any type of similarity query and to an implementation based on an index or using a sequential scan. Parallelization yields an additional impressive speed-up. An extensive performance evaluation confirms the efficiency of our approach.