A data object is broad if it is one of the k-Nearest Neighbors (k-NN) of many data objects. We introduce a new database primitive called Generalized Nearest Neighbor (GNN) to express data broadness. We also develop three strategies to answer GNN queries efficiently for large datasets of multidimensional objects. The R*-Tree based search algorithm generates candidate pages and ranks them by distance. Our first algorithm, Fetch All (FA), fetches as many candidate pages as possible. Our second algorithm, Fetch One (FO), fetches one candidate page at a time. Our third algorithm, Fetch Dynamic (FD), dynamically decides how many pages need to be fetched. We also propose three optimizations, Column Filter, Row Filter and Adaptive Filter, to eliminate pages from each dataset. Column Filter prunes the pages that are guaranteed to be non-broad. Row Filter prunes the pages whose removal does not change the broadness of any data point. Adaptive Filter prunes the search space dynamically along each dimension to eliminate unpromising objects. Our experiments show that FA is the fastest when the buffer size is large and FO is the fastest when the buffer size is small. FD is always either the fastest or very close to the faster of FA and FO. FD is also significantly faster than the existing methods adapted to the GNN problem.
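To make the notion of broadness concrete, the following is a minimal brute-force sketch in Python. It is not the FA, FO or FD algorithm and uses no R*-Tree or page filtering; it only counts, for each object, how many other objects include it among their k nearest neighbors. The function names, the single-dataset setting (each object excluded from its own neighbor list), the index-based tie-breaking, and the `threshold` that turns "many" into a number are all assumptions made for illustration.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def broadness_counts(points, k):
    """Return, for each point, the number of other points that have it
    among their k nearest neighbors (brute force, O(n^2 log n))."""
    counts = [0] * len(points)
    for i, p in enumerate(points):
        # Rank every other point by distance to p; ties are broken by index
        # (an arbitrary choice made here for determinism).
        others = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: (dist(p, points[j]), j),
        )
        # Each of the k closest points gains one unit of broadness.
        for j in others[:k]:
            counts[j] += 1
    return counts


def broad_objects(points, k, threshold):
    """Indices of points whose broadness count reaches the (hypothetical)
    threshold standing in for 'many' in the definition."""
    counts = broadness_counts(points, k)
    return [i for i, c in enumerate(counts) if c >= threshold]


if __name__ == "__main__":
    pts = [(0, 0), (1, 0), (2, 0), (10, 0)]
    print(broadness_counts(pts, k=1))
    print(broad_objects(pts, k=1, threshold=2))
```

A brute-force pass like this touches every pair of objects, which is the cost that the index-based fetching strategies and the page filters described above are meant to avoid on large datasets.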