PK-tree: a spatial index structure for high dimensional point data

Authors:
Wei Wang;Jiong Yang;Richard Muntz
Affiliations:
IBM T.J. Watson Research Center, Yorktown Heights, NY;IBM T.J. Watson Research Center, Yorktown Heights, NY;Univ. of California, Los Angeles, Los Angeles
Venue:
Information organization and databases
Year:
2000

Abstract

In this chapter we present the PK-tree, an index structure for high-dimensional point data. The proposed structure can be viewed as combining aspects of the PR-quadtree or K-D tree, but with unnecessary nodes eliminated. These unnecessary nodes are typically the result of skew in the point distribution, and we show that eliminating them makes the performance of the resulting index robust to skewed data distributions. The index structure is formally defined and efficiently updatable, and bounds on the number of nodes and the mean height of the tree can be proved. Bounds on the expected height of the tree can be given under certain mild constraints on the spatial distribution of points. Empirical evidence on both real and generated data sets shows that the PK-tree outperforms recently proposed spatial indexes based on the R-tree and X-tree by a wide margin. Significantly, the relative performance advantage of the PK-tree grows with the dimensionality of the data set.
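
The idea of eliminating unnecessary nodes from a PR-quadtree-style decomposition can be illustrated with a small sketch. This is not the chapter's algorithm or its formal definition: the abstract does not state the elimination criterion, so the threshold `k` used here (an internal node with fewer than `k` children is removed and its children are promoted to its parent) is an assumption made for illustration. The names `Node`, `build_pr_quadtree`, `eliminate`, `height`, and `collect` are likewise hypothetical. The sketch is two-dimensional and assumes distinct points in the unit square [0, 1) x [0, 1).

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Point = Tuple[float, float]


@dataclass
class Node:
    lo: Point                                   # lower-left corner of the cell
    hi: Point                                   # upper-right corner (exclusive)
    points: List[Point] = field(default_factory=list)    # filled only in leaves
    children: List["Node"] = field(default_factory=list)

    def is_leaf(self) -> bool:
        return not self.children


def build_pr_quadtree(points: List[Point],
                      lo: Point = (0.0, 0.0),
                      hi: Point = (1.0, 1.0)) -> Node:
    """Split a cell into quadrants until each cell holds at most one point.

    Assumes the points are distinct and lie inside [lo, hi); empty
    quadrants are simply not created.
    """
    node = Node(lo, hi)
    if len(points) <= 1:
        node.points = list(points)
        return node
    mx, my = (lo[0] + hi[0]) / 2, (lo[1] + hi[1]) / 2
    quadrants = [((lo[0], lo[1]), (mx, my)), ((mx, lo[1]), (hi[0], my)),
                 ((lo[0], my), (mx, hi[1])), ((mx, my), (hi[0], hi[1]))]
    for qlo, qhi in quadrants:
        inside = [p for p in points
                  if qlo[0] <= p[0] < qhi[0] and qlo[1] <= p[1] < qhi[1]]
        if inside:
            node.children.append(build_pr_quadtree(inside, qlo, qhi))
    return node


def eliminate(node: Node, k: int, is_root: bool = True) -> List[Node]:
    """Return the nodes that should replace `node` in its parent.

    Works bottom-up. Assumed criterion (not taken from the abstract): a
    non-root internal node with fewer than k children is unnecessary, so
    its children are promoted. Leaves and the root are always kept.
    """
    if node.is_leaf():
        return [node]
    promoted: List[Node] = []
    for child in node.children:
        promoted.extend(eliminate(child, k, is_root=False))
    node.children = promoted
    if not is_root and len(promoted) < k:
        return promoted
    return [node]


def height(node: Node) -> int:
    """Number of edges on the longest root-to-leaf path."""
    return 0 if node.is_leaf() else 1 + max(height(c) for c in node.children)


def collect(node: Node) -> List[Point]:
    """All points stored in the subtree rooted at `node`."""
    if node.is_leaf():
        return list(node.points)
    return [p for c in node.children for p in collect(c)]
```

With the skewed input `[(0.01, 0.01), (0.02, 0.02), (0.9, 0.9)]`, the plain decomposition needs a chain of single-child cells before the two nearby points are separated, giving height 6. Calling `eliminate(root, 2)` removes that chain and leaves a tree of height 2 holding the same three points.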