Novel database applications, such as multimedia, data mining, e-commerce, and many others, make intensive use of similarity queries in order to retrieve the objects that best fit a user request. Since the effectiveness of such queries improves when the user is allowed to personalize the similarity criterion according to which database objects are evaluated and ranked, the development of access methods able to efficiently support user-defined similarity queries becomes a basic requirement. In this article we introduce the first index structure, called the QIC-M-tree, that can process user-defined queries in generic metric spaces, that is, where the only information about indexed objects is their relative distances. The QIC-M-tree is a metric access method that can deal with several distinct distances at a time: (1) a query (user-defined) distance, (2) an index distance (used to build the tree), and (3) a comparison (approximate) distance (used to quickly discard uninteresting parts of the tree from the search). We develop an analytical cost model that accurately characterizes the performance of the QIC-M-tree and validate this model through extensive experimentation on real metric data sets. In particular, our analysis is able to predict the best evaluation strategy (i.e., which distances to use) under a variety of configurations, by properly taking into account relevant factors such as the distribution of distances, the cost of computing distances, and the actual index structure. We also prove that the overall saving in CPU search costs when using an approximate distance can be estimated by using information on the data set only (so this measure is independent of the underlying access method) and show that performance results are closely related to a novel "indexing" error measure.
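The role of a cheap comparison distance next to a more expensive query distance can be illustrated with a minimal filter-and-refine sketch. This is not the QIC-M-tree itself: it is a linear scan with no index distance and no tree, and every name in it (query_distance, comparison_distance, range_query) is hypothetical. It also assumes that the comparison distance never exceeds the query distance, a condition the abstract does not state; here the Chebyshev distance stands in as the approximation and the Euclidean distance as the user-defined query distance.

```python
import math


def query_distance(x, y):
    """Stand-in for the (expensive) user-defined query distance: Euclidean."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def comparison_distance(x, y):
    """Stand-in for the cheap approximate distance: Chebyshev.

    Assumption: comparison_distance(x, y) <= query_distance(x, y) for all
    x, y, which holds for this Chebyshev/Euclidean pair.
    """
    return max(abs(a - b) for a, b in zip(x, y))


def range_query(objects, q, radius):
    """Return the objects within `radius` of `q` under the query distance.

    The comparison distance is evaluated first. Because it is assumed to
    lower-bound the query distance, any object it places outside the radius
    can be discarded without computing the query distance.
    """
    results = []
    expensive_calls = 0
    for obj in objects:
        if comparison_distance(q, obj) > radius:
            continue  # safely pruned by the lower bound
        expensive_calls += 1
        if query_distance(q, obj) <= radius:
            results.append(obj)
    return results, expensive_calls


points = [(0.0, 0.0), (1.0, 1.0), (3.0, 0.5), (0.9, 0.9), (5.0, 5.0)]
hits, calls = range_query(points, (0.0, 0.0), 1.3)
print(hits, calls)
```

In this toy run the query distance is evaluated for three of the five points, since the comparison distance discards the other two. How much is saved in practice depends on how tightly the approximation tracks the query distance and on the relative cost of computing each one.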