Spatial indexing of high-dimensional data based on relative approximation

Authors:
Yasushi Sakurai; Masatoshi Yoshikawa; Shunsuke Uemura; Haruhiko Kojima
Affiliations:
NTT Cyber Space Laboratories, 1-1 Hikari-no-oka, Yokosuka, Kanagawa 239-0847, Japan (e-mail: sakurai.yasushi@lab.ntt.co.jp, kojima@aether.hil.ntt.co.jp); Nara Institute of Science and Technology, 8916-5 Takayama, Ikoma, Nara 630-0101, Japan (e-mail: {yosikawa, uemura}@is.aist-nara.ac.jp); Nara Institute of Science and Technology, 8916-5 Takayama, Ikoma, Nara 630-0101, Japan (e-mail: {yosikawa, uemura}@is.aist-nara.ac.jp); NTT Cyber Space Laboratories, 1-1 Hikari-no-oka, Yokosuka, Kanagawa 239-0847, Japan (e-mail: sakurai.yasushi@lab.ntt.co.jp, kojima@aether.hil.ntt.co.jp)
Venue:
The VLDB Journal — The International Journal on Very Large Data Bases
Year:
2002

Abstract

We propose a novel index structure, the A-tree (approximation tree), for similarity searches in high-dimensional data. The basic idea of the A-tree is the introduction of virtual bounding rectangles (VBRs) which contain and approximate MBRs or data objects. VBRs can be represented quite compactly and thus affect the tree configuration both quantitatively and qualitatively. First, since tree nodes can contain a large number of VBR entries, fanout becomes large, which increases search speed. More importantly, we have a free hand in arranging MBRs and VBRs in the tree nodes. Each A-tree node contains an MBR and its children VBRs. Therefore, by fetching an A-tree node, we can obtain information on the exact position of a parent MBR and the approximate position of its children. We have performed experiments using both synthetic and real data sets. For the real data sets, the A-tree outperforms the SR-tree and the VA-file in all dimensionalities up to 64 dimensions, which is the highest dimension in our experiments. Additionally, we propose a cost model for the A-tree. We verify the validity of the cost model for synthetic and real data sets.
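
The idea of a VBR that "contains and approximates" a child MBR relative to its parent can be illustrated with a small sketch. The code below is not taken from the paper: it assumes a uniform grid of 2^bits cells per dimension laid over the parent MBR, and the names `encode_vbr`, `decode_vbr`, and `contains` are hypothetical helpers chosen for illustration. The lower edge is rounded down and the upper edge rounded up, so the decoded rectangle always encloses the original one.

```python
import math
from typing import List, Tuple

Rect = Tuple[List[float], List[float]]  # (lower corner, upper corner)
Code = Tuple[List[int], List[int]]      # per-dimension grid cell indices


def encode_vbr(parent: Rect, child: Rect, bits: int = 4) -> Code:
    """Quantize a child rectangle relative to its parent MBR.

    Assumption: a uniform grid with 2**bits cells per dimension.
    The child must lie inside the parent.
    """
    levels = 1 << bits
    (p_lo, p_hi), (c_lo, c_hi) = parent, child
    lo_codes, hi_codes = [], []
    for pl, ph, cl, ch in zip(p_lo, p_hi, c_lo, c_hi):
        width = ph - pl
        if width == 0.0:
            # Degenerate parent extent: a single cell suffices.
            lo_codes.append(0)
            hi_codes.append(0)
            continue
        # Round the lower edge down and the upper edge up (conservative).
        lo = math.floor((cl - pl) * levels / width)
        hi = math.ceil((ch - pl) * levels / width) - 1
        lo = min(max(lo, 0), levels - 1)
        hi = min(max(hi, lo), levels - 1)
        lo_codes.append(lo)
        hi_codes.append(hi)
    return lo_codes, hi_codes


def decode_vbr(parent: Rect, code: Code, bits: int = 4) -> Rect:
    """Recover the approximate rectangle from the parent MBR and the codes."""
    levels = 1 << bits
    p_lo, p_hi = parent
    lo_codes, hi_codes = code
    lower, upper = [], []
    for pl, ph, lo, hi in zip(p_lo, p_hi, lo_codes, hi_codes):
        cell = (ph - pl) / levels
        lower.append(pl + lo * cell)
        upper.append(pl + (hi + 1) * cell)
    return lower, upper


def contains(outer: Rect, inner: Rect) -> bool:
    """True if `outer` encloses `inner` in every dimension."""
    return all(
        ol <= il and iu <= ou
        for ol, il, iu, ou in zip(outer[0], inner[0], inner[1], outer[1])
    )


parent = ([0.0, 0.0], [16.0, 16.0])
child = ([3.2, 5.0], [4.7, 9.5])
code = encode_vbr(parent, child, bits=4)   # ([3, 5], [4, 9])
vbr = decode_vbr(parent, code, bits=4)     # ([3.0, 5.0], [5.0, 10.0])
print(code, vbr, contains(vbr, child))
```

In this sketch each coordinate is stored in 4 bits instead of, say, a 32-bit float (an assumed baseline), which is the kind of saving that lets a node hold many more entries. Decoding needs the parent MBR, which matches the abstract's point that a node stores an exact MBR together with its children's VBRs. The sketch ignores floating-point rounding, which a careful implementation would have to guard against to keep the containment guarantee.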