Complex data types (such as images, documents, and DNA sequences) are becoming increasingly important in modern database applications. A typical query in many of these applications seeks objects that are similar to some target object, where (dis)similarity is defined by a distance function. Often, the cost of evaluating the distance between two objects is very high. Thus, the number of distance evaluations should be kept to a minimum, while (ideally) maintaining the quality of the result. One way to approach this goal is to embed the data objects in a vector space so that the distances between the embedded objects approximate the actual distances. Queries can then be performed (for the most part) on the embedded objects. In this paper, we examine whether the embedding methods ensure that no relevant objects are left out (i.e., there are no false dismissals and, hence, the correct result is reported). Particular attention is paid to the SparseMap, FastMap, and MetricMap embedding methods. SparseMap is a variant of Lipschitz embeddings, while FastMap and MetricMap are inspired by dimension reduction methods for Euclidean spaces (using KLT or the related PCA and SVD). We show that, in general, none of these embedding methods guarantees that queries on the embedded objects have no false dismissals, and we identify the limited cases in which the guarantee does hold. Moreover, we describe a variant of SparseMap that allows queries with no false dismissals. In addition, we show that with FastMap and MetricMap, the distances between the embedded objects can be much greater than the actual distances. This makes it impossible (or at least impractical) to modify FastMap and MetricMap to guarantee no false dismissals.
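The link between embeddings and false dismissals can be illustrated with a small sketch. It is not the SparseMap variant described in the paper; it is a basic Lipschitz-style embedding with single reference objects, chosen here as an assumption because its guarantee is easy to verify. Each object is mapped to its vector of distances to a few reference objects. By the triangle inequality, every coordinate difference is at most the true distance, so the Chebyshev (maximum-coordinate) distance between embedded objects never exceeds the actual distance. Filtering on the embedded distance therefore cannot discard a true answer. The names `edit_distance`, `embed`, `chebyshev`, and `range_query` are hypothetical, and edit distance stands in for an expensive metric.

```python
import random


def edit_distance(a, b):
    """Levenshtein distance on strings; a metric standing in for a costly distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # delete ca
                           cur[j - 1] + 1,              # insert cb
                           prev[j - 1] + (ca != cb)))   # substitute or match
        prev = cur
    return prev[-1]


def embed(obj, refs, dist):
    """Map an object to its vector of distances to the reference objects."""
    return [dist(obj, r) for r in refs]


def chebyshev(u, v):
    """Maximum coordinate difference; a lower bound on the true distance here."""
    return max(abs(x - y) for x, y in zip(u, v))


def range_query(query, objects, vectors, refs, dist, radius):
    """Filter on the embedding, then refine with the true distance.

    Returns the exact answers and the number of candidates that needed
    a true distance evaluation in the refinement step.
    """
    qv = embed(query, refs, dist)  # costs len(refs) true distance evaluations
    candidates = [o for o, v in zip(objects, vectors)
                  if chebyshev(qv, v) <= radius]
    return [o for o in candidates if dist(query, o) <= radius], len(candidates)


# Illustrative data: random DNA-like strings, with the first three as references.
rng = random.Random(0)
objects = ["".join(rng.choice("ACGT") for _ in range(8)) for _ in range(40)]
refs = objects[:3]
vectors = [embed(o, refs, edit_distance) for o in objects]

hits, refined = range_query("ACGTACGT", objects, vectors, refs, edit_distance, 3)
print(len(hits), "answers after refining", refined, "of", len(objects), "objects")
```

The refinement step removes false hits, so the result is exact; the savings come from the objects that the filter rules out without a true distance evaluation. An embedding that can overestimate distances, which the abstract reports for FastMap and MetricMap, would not support this filter safely, because a true answer could be rejected before refinement.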