A common query pattern is to find approximate matches to a given query object in a large database. This kind of query processing is referred to as similarity search in a metric space. In this paper, we propose a new metric index, the MB+tree (Metric B+tree), which supports near neighbour search in a generic metric space. The MB+tree aims to reduce both the number of I/O accesses and the number of distance calculations for similarity search in large databases, while allowing dynamic data updates. We show that a B+tree, together with an auxiliary tree, can be used as a metric index. Unlike other multi-dimensional (spatial) access methods, our approach partitions the data into disjoint partitions while building and maintaining the metric index, which can lead to a significant cost reduction because fewer metric sub-spaces need to be searched. To use the MB+tree, we propose a slicing value. Using the slicing value together with the space division information, a near neighbour search can be systematically converted into a range search in the B+tree. Different schemes for the slicing value are considered, namely the one-focus-point scheme and the two-focus-point scheme. We also conducted extensive experimental studies using synthetic data, and the results are reported in this paper.
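The idea of converting a near neighbour search into a range search over one-dimensional keys can be illustrated with a minimal sketch. The abstract does not define the slicing value, so the sketch makes its own assumptions: the key of each object is simply its distance to a single focus point, a sorted Python list stands in for the B+tree, and the auxiliary tree and disjoint partitioning are left out. The names `OneFocusIndex`, `insert` and `range_search` are hypothetical and do not come from the paper. The filtering step relies only on the triangle inequality: any object within radius r of the query q must have a key in [d(q, p) - r, d(q, p) + r], where p is the focus point.

```python
import bisect
import math


def euclidean(a, b):
    """Euclidean distance; any function satisfying the metric axioms would do."""
    return math.dist(a, b)


class OneFocusIndex:
    """Illustrative stand-in for a B+tree keyed on distance to one focus point.

    Assumption: the key is d(object, focus). This is not necessarily the
    slicing value defined in the paper.
    """

    def __init__(self, focus, dist=euclidean):
        self.focus = focus
        self.dist = dist
        self.keys = []      # sorted one-dimensional keys (the B+tree leaf order)
        self.objects = []   # objects stored in the same order as their keys

    def insert(self, obj):
        # Dynamic update: compute the key once and insert in sorted position.
        key = self.dist(obj, self.focus)
        pos = bisect.bisect_left(self.keys, key)
        self.keys.insert(pos, key)
        self.objects.insert(pos, obj)

    def range_search(self, query, radius):
        """Return (matches, number of candidates whose distance was computed)."""
        dq = self.dist(query, self.focus)
        # Triangle inequality: |d(o, focus) - d(q, focus)| <= d(o, q) <= radius,
        # so only keys in [dq - radius, dq + radius] can belong to matches.
        lo = bisect.bisect_left(self.keys, dq - radius)
        hi = bisect.bisect_right(self.keys, dq + radius)
        candidates = self.objects[lo:hi]
        # Refinement: discard false positives with real distance computations.
        matches = [o for o in candidates if self.dist(o, query) <= radius]
        return matches, len(candidates)
```

A caller would create `OneFocusIndex(focus)`, call `insert` for each object, and then call `range_search(query, radius)`. The second return value shows how many distance computations the key range filter left to perform, which is the cost such an index tries to reduce. With a single focus point the filter can still admit many false positives, which is one plausible motivation for considering more than one focus point.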