On nonmetric similarity search problems in complex domains

Authors:
Tomáš Skopal; Benjamin Bustos
Affiliations:
Charles University in Prague, Czech Republic; University of Chile, Chile
Venue:
ACM Computing Surveys (CSUR)
Year:
2011

Abstract

Similarity search is widely used in many areas of computing, including multimedia databases, data mining, bioinformatics, and social networks. Retrieving semantically unstructured data entities requires some form of aggregated qualification that selects the entities relevant to a query, and similarity querying is a popular mechanism of this kind. For a long time, database-oriented applications of similarity search restricted the definition of similarity to metric distances. Owing to its topological properties, a metric similarity can be used to index a database, which can then be queried efficiently by so-called metric access methods. However, as data entities across various domains have grown more complex, many similarities have appeared in recent years that are not metrics; we call them nonmetric similarity functions. In this article we survey domains that employ nonmetric functions for effective similarity search, as well as methods for efficient nonmetric similarity search. First, we show that ongoing research in many of these domains requires complex representations of data entities. Such complex representations, in turn, make it possible to model complex and computationally expensive similarity functions (often realized by various matching algorithms). However, the more complex a similarity function is, the more likely it is to be nonmetric. Second, we review state-of-the-art techniques for efficient (fast) nonmetric similarity search, covering both exact and approximate search. Finally, we discuss some open problems and possible future research trends.
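
To make the metric/nonmetric distinction concrete, the following minimal sketch (an illustration, not taken from the article) checks the triangle inequality, the metric postulate that metric access methods typically rely on for pruning. The function names `euclidean`, `squared_euclidean`, and `triangle_violations` are hypothetical, and squared Euclidean distance is chosen here only as a simple, well-known example of a nonmetric dissimilarity.

```python
from itertools import product


def euclidean(x, y):
    """Euclidean (L2) distance: a metric."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) ** 0.5


def squared_euclidean(x, y):
    """Squared L2 distance: symmetric and non-negative, but not a metric."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def triangle_violations(dist, points, eps=1e-12):
    """Return every triple (x, y, z) with dist(x, z) > dist(x, y) + dist(y, z).

    eps is a small tolerance (an assumption of this sketch) that guards
    against floating-point noise.
    """
    return [
        (x, y, z)
        for x, y, z in product(points, repeat=3)
        if dist(x, z) > dist(x, y) + dist(y, z) + eps
    ]


# Three collinear points on the real line.
points = [(0.0,), (1.0,), (2.0,)]

metric_violations = triangle_violations(euclidean, points)
nonmetric_violations = triangle_violations(squared_euclidean, points)

print("Euclidean violations:", metric_violations)
print("Squared Euclidean violations:", nonmetric_violations)
```

For the squared distance, going from 0 to 2 directly costs 4, while the detour through 1 costs only 1 + 1 = 2, so the triangle inequality fails. An index that prunes candidates using triangle-inequality bounds could therefore discard true answers under such a function, which is one reason nonmetric search needs the dedicated techniques the article surveys.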