Similarity join algorithms find pairs of objects that lie within a given distance ε of each other. Algorithms adapted from spatial join techniques are designed primarily for data in a vector space and often employ some form of multidimensional index. When the data lies in a metric space, the usual way to apply these algorithms is to embed the data in a vector space and then use a multidimensional index. This approach has a number of drawbacks when the data is high dimensional, since we must eventually find the most discriminating dimensions, which is not trivial. In addition, although the maximum distance between objects increases with dimension, the ability to discriminate between objects in each dimension does not. These drawbacks are overcome by a new method called Quickjoin, which does not require a multidimensional index and instead adapts techniques from distance-based indexing in a way that is conceptually similar to the Quicksort algorithm. A formal analysis of the Quickjoin method is provided. Experiments show that Quickjoin significantly outperforms two existing techniques.
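As a rough illustration of the Quicksort-like idea, the following Python sketch partitions the objects around a randomly chosen pivot and radius, recurses on each side, and separately joins the two "window" bands that lie within ε of the partition boundary. It is a minimal sketch under stated assumptions, not the authors' implementation: the pivot selection, the window partitions, the `small` cutoff, the fallback to a nested loop when a split makes no progress, and all function names are choices made here for illustration. It relies only on the distance function being a metric (the triangle inequality guarantees that any qualifying pair split across the boundary falls inside the two windows).

```python
import random


def _brute_self(items, eps, dist, out):
    # Nested-loop join of a small set with itself.
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            if dist(items[i][1], items[j][1]) <= eps:
                a, b = items[i][0], items[j][0]
                out.add((min(a, b), max(a, b)))


def _brute_cross(xs, ys, eps, dist, out):
    # Nested-loop join between two disjoint sets.
    for ix, px in xs:
        for iy, py in ys:
            if dist(px, py) <= eps:
                out.add((min(ix, iy), max(ix, iy)))


def _partition(items, pivot, r, eps, dist):
    # Split by distance to the pivot; also collect the bands within eps of r.
    low, high, low_win, high_win = [], [], [], []
    for item in items:
        d = dist(item[1], pivot)
        if d < r:
            low.append(item)
            if d >= r - eps:
                low_win.append(item)
        else:
            high.append(item)
            if d <= r + eps:
                high_win.append(item)
    return low, high, low_win, high_win


def _quickjoin(items, eps, dist, out, rng, small):
    if len(items) <= small:
        _brute_self(items, eps, dist, out)
        return
    p1, p2 = rng.sample(items, 2)
    r = dist(p1[1], p2[1])
    low, high, low_win, high_win = _partition(items, p1[1], r, eps, dist)
    if not low or not high:
        # Degenerate split (e.g. all duplicates): fall back to a nested loop.
        _brute_self(items, eps, dist, out)
        return
    _quickjoin(low, eps, dist, out, rng, small)
    _quickjoin(high, eps, dist, out, rng, small)
    _quickjoin_win(low_win, high_win, eps, dist, out, rng, small)


def _quickjoin_win(xs, ys, eps, dist, out, rng, small):
    # Join two disjoint sets, reporting only pairs with one object from each.
    if not xs or not ys:
        return
    total = len(xs) + len(ys)
    if total <= small:
        _brute_cross(xs, ys, eps, dist, out)
        return
    p1, p2 = rng.sample(xs + ys, 2)
    r = dist(p1[1], p2[1])
    x_low, x_high, x_lw, x_hw = _partition(xs, p1[1], r, eps, dist)
    y_low, y_high, y_lw, y_hw = _partition(ys, p1[1], r, eps, dist)
    for a, b in ((x_low, y_low), (x_high, y_high), (x_lw, y_hw), (x_hw, y_lw)):
        if len(a) + len(b) < total:
            _quickjoin_win(a, b, eps, dist, out, rng, small)
        else:
            # No progress was made by this split: use a nested loop instead.
            _brute_cross(a, b, eps, dist, out)


def similarity_join(points, eps, dist, seed=0, small=8):
    """Return the set of index pairs (i, j), i < j, with dist <= eps."""
    out = set()
    items = list(enumerate(points))  # keep indices so duplicates stay distinct
    _quickjoin(items, eps, dist, out, random.Random(seed), small)
    return out
```

A call such as `similarity_join(points, 3, manhattan)`, where `manhattan` is any user-supplied metric on the objects, returns the qualifying index pairs. No embedding or multidimensional index is needed, since only pairwise distances are evaluated.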