The metric space model abstracts many proximity or similarity problems. The most frequently considered primitives are range and k-nearest neighbor search, which leaves out the similarity join, an extremely important primitive. In fact, despite the great attention this primitive has received in traditional and even multidimensional databases, little has been done for general metric databases. We solve two variants of the similarity join problem: (1) range joins: given two sets of objects and a distance threshold r, find all the object pairs (one from each set) at distance at most r; and (2) k-closest pair joins: find the k closest object pairs (one from each set). To this end, we devise a new metric index, coined List of Twin Clusters (LTC), which indexes both sets jointly, instead of the natural approach of indexing one or both sets independently. Finally, we show how to use the LTC to solve classical range queries. Our results show significant speedups over the basic quadratic-time naive alternative for both join variants, and show that the LTC is competitive with the original list of clusters when solving range queries. Furthermore, we show that our technique has great potential for improvement.
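To make the two join variants concrete, the following is a minimal sketch of the quadratic-time naive alternative that the abstract uses as its baseline. It is not the List of Twin Clusters, whose construction is not described here. The function names `range_join` and `k_closest_pairs`, the `dist` parameter, and the sample data are assumptions introduced purely for illustration.

```python
import heapq
from typing import Callable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")
# Hypothetical alias: any function satisfying the metric axioms can be passed.
Metric = Callable[[T, T], float]


def range_join(a: Sequence[T], b: Sequence[T], dist: Metric, r: float) -> List[Tuple[T, T]]:
    """Naive range join: every pair (x, y), x from a and y from b, with dist(x, y) <= r.

    Evaluates the distance for all len(a) * len(b) pairs.
    """
    return [(x, y) for x in a for y in b if dist(x, y) <= r]


def k_closest_pairs(a: Sequence[T], b: Sequence[T], dist: Metric, k: int) -> List[Tuple[T, T, float]]:
    """Naive k-closest pair join: the k pairs (one object from each set) at smallest distance.

    Ties are broken by position in a, then position in b (an arbitrary choice
    made for this sketch so that results are deterministic).
    """
    scored = (
        (dist(x, y), i, j)
        for i, x in enumerate(a)
        for j, y in enumerate(b)
    )
    best = heapq.nsmallest(k, scored)
    return [(a[i], b[j], d) for d, i, j in best]


if __name__ == "__main__":
    # Illustrative data: points on the real line under absolute difference.
    A = [0.0, 5.0, 10.0]
    B = [1.0, 6.0, 20.0]
    metric = lambda x, y: abs(x - y)

    print(range_join(A, B, metric, 1.0))
    print(k_closest_pairs(A, B, metric, 2))
```

Both functions compute every pairwise distance, which is the cost a joint index over the two sets aims to reduce.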