The similarity join has become an important database primitive for supporting similarity searches and data mining. A similarity join combines two sets of complex objects such that the result contains all pairs of similar objects. Two types of similarity join are well known: the distance range join, in which the user defines a distance threshold for the join, and the closest pair query (or k-distance join), which retrieves the k most similar pairs. In this paper, we propose a third important similarity join operation, the k-nearest neighbour join, which combines each point of one point set with its k nearest neighbours in the other set. We find that many standard algorithms of Knowledge Discovery in Databases (KDD), such as k-means and k-medoid clustering, nearest neighbour classification, data cleansing, and postprocessing of sampling-based data mining, can be implemented on top of the k-nn join operation to achieve performance improvements without affecting the quality of their results. We propose a new algorithm to compute the k-nearest neighbour join using the multipage index (MuX), a specialised index structure for the similarity join. To reduce both CPU and I/O costs, we develop optimal loading and processing strategies.
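
To make the difference between the three join types concrete, here is a minimal brute-force sketch in Python. It is not the MuX-based algorithm proposed in the paper: it is a naive nested-loop illustration of the definitions only. The function names (`distance_range_join`, `k_closest_pairs`, `knn_join`), the use of Euclidean distance, and the representation of points as tuples are all assumptions made for this example.

```python
import heapq
import math
from itertools import product


def distance_range_join(R, S, eps):
    """All pairs (r, s) whose Euclidean distance is at most eps."""
    return [(r, s) for r, s in product(R, S) if math.dist(r, s) <= eps]


def k_closest_pairs(R, S, k):
    """The k pairs (r, s) with the smallest distance over the whole cross product."""
    return heapq.nsmallest(k, product(R, S), key=lambda pair: math.dist(*pair))


def knn_join(R, S, k):
    """For every point r in R, its k nearest neighbours in S (hypothetical helper).

    Naive O(|R| * |S|) scan; an index-based method would avoid this cost.
    """
    return {r: heapq.nsmallest(k, S, key=lambda s: math.dist(r, s)) for r in R}
```

Note that the k-nn join is asymmetric and always yields k partners for each point of the first set (provided the second set holds at least k points), whereas the result size of a distance range join depends on the threshold, and the closest pair query returns k pairs in total rather than k per point.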