Unified framework for fast exact and approximate search in dissimilarity spaces

Authors:
Tomáš Skopal
Affiliations:
Charles University in Prague, Czech Republic
Venue:
ACM Transactions on Database Systems (TODS)
Year:
2007

Abstract

In multimedia systems we usually need to retrieve database (DB) objects based on their similarity to a query object, where the similarity assessment is provided by a measure that defines a (dis)similarity score for every pair of DB objects. In most existing applications, the similarity measure is required to be a metric, so that the triangle inequality can be used to speed up the search for relevant objects by means of metric access methods (MAMs), for example, the M-tree. Recent research has shown, however, that nonmetric measures are more appropriate for similarity modeling because of their robustness and the ease with which they model a made-to-measure similarity. Unfortunately, because they lack the triangle inequality, nonmetric measures cannot be used directly by MAMs. From another point of view, some sophisticated similarity measures may be available only in a black-box nonanalytic form (e.g., as an algorithm or even a hardware device), where no information about their topological properties is provided, so we have to treat them as nonmetric measures as well. From yet another point of view, the concept of similarity measuring itself is inherently imprecise, and we often prefer fast but approximate retrieval over exact but slower retrieval. To date, these aspects of similarity retrieval have been addressed separately, that is, exact versus approximate search or metric versus nonmetric search. In this article we introduce a similarity retrieval framework that incorporates both aspects into a single unified model. Based on the framework, we show that for any dissimilarity measure (either metric or nonmetric) we are able to change the “amount” of triangle inequality, and so obtain an approximate or full metric that can be used for MAM-based retrieval. By varying the “amount” of triangle inequality, the measure is modified in a way suitable for either exact but slower retrieval or approximate but faster retrieval.
Additionally, we introduce the TriGen algorithm, which automatically constructs the desired modification of any black-box distance using just a small fraction of the database.
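
The idea of changing the "amount" of triangle inequality can be illustrated with a minimal Python sketch. It is not the TriGen algorithm as published; the abstract does not specify the modifier or the search procedure, so everything below is an assumption made for illustration: the concave power modifier d^(1/(1+w)), the function names (`triangle_error`, `power_modifier`, `find_weight`), the bisection over the weight w, and the use of squared Euclidean distance as a stand-in for a black-box nonmetric measure. The sketch measures the fraction of sampled triplets that violate the triangle inequality and looks for the smallest weight that brings that fraction down to a chosen tolerance.

```python
import itertools

def squared_l2(a, b):
    """Squared Euclidean distance: a simple measure that violates the triangle inequality."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def power_modifier(w):
    """Hypothetical concave modifier d -> d**(1/(1+w)); w = 0 leaves the distance unchanged."""
    return lambda d: d ** (1.0 / (1.0 + w))

def triangle_error(sample, dist, modifier, eps=1e-9):
    """Fraction of triplets in the sample whose modified distances violate the triangle inequality."""
    violations = total = 0
    for a, b, c in itertools.combinations(sample, 3):
        # Sort the three pairwise distances; only "two shortest sum >= longest" needs checking.
        x, y, z = sorted(modifier(dist(p, q)) for p, q in ((a, b), (b, c), (a, c)))
        total += 1
        if x + y < z - eps:
            violations += 1
    return violations / total

def find_weight(sample, dist, tolerance, w_max=16.0, steps=20):
    """Bisect for the smallest weight in [0, w_max] whose triangle error is <= tolerance."""
    if triangle_error(sample, dist, power_modifier(w_max)) > tolerance:
        raise ValueError("w_max is too small for the requested tolerance")
    lo, hi = 0.0, w_max
    for _ in range(steps):
        mid = (lo + hi) / 2.0
        if triangle_error(sample, dist, power_modifier(mid)) <= tolerance:
            hi = mid  # mid is concave enough; try a smaller weight
        else:
            lo = mid  # still too many violating triplets
    return hi

# A small sample standing in for "a small fraction of the database" (collinear points).
sample = [(float(i),) for i in range(8)]
error_before = triangle_error(sample, squared_l2, power_modifier(0.0))
weight = find_weight(sample, squared_l2, tolerance=0.0)
error_after = triangle_error(sample, squared_l2, power_modifier(weight))
```

With a tolerance of zero the modified measure satisfies the triangle inequality on every sampled triplet, which corresponds to the exact setting; a tolerance above zero accepts some violating triplets and yields a less concave modifier, which corresponds to the approximate setting. The bisection relies on the violation rate not increasing as w grows, which holds for this particular power modifier.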