The convex polyhedra technique: an index structure for high-dimensional space

Authors:
Jiyuan An;Hanxiong Chen;Kazutaka Furuse;Masahiro Ishikawa;Nobuo Ohbo
Affiliations:
University of Tsukuba, 1-1-1 Tennoudai, Tsukuba City, Ibaraki Japan;University of Tsukuba, 1-1-1 Tennoudai, Tsukuba City, Ibaraki Japan;University of Tsukuba, 1-1-1 Tennoudai, Tsukuba City, Ibaraki Japan;National Institute of Agrobiological Sciences;University of Tsukuba, 1-1-1 Tennoudai, Tsukuba City, Ibaraki Japan
Venue:
ADC '02 Proceedings of the 13th Australasian database conference - Volume 5
Year:
2002

Citing 9
Cited 4

Introduction to statistical pattern recognition (2nd ed.)
The R*-tree: an efficient and robust access method for points and rectangles

SIGMOD '90 Proceedings of the 1990 ACM SIGMOD international conference on Management of data
FastMap: a fast algorithm for indexing, data-mining and visualization of traditional and multimedia datasets

SIGMOD '95 Proceedings of the 1995 ACM SIGMOD international conference on Management of data
The SR-tree: an index structure for high-dimensional nearest neighbor queries

SIGMOD '97 Proceedings of the 1997 ACM SIGMOD international conference on Management of data
A cost model for nearest neighbor search in high-dimensional data space

PODS '97 Proceedings of the sixteenth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems
The pyramid-technique: towards breaking the curse of dimensionality

SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data
Fast algorithms for projected clustering

SIGMOD '99 Proceedings of the 1999 ACM SIGMOD international conference on Management of data
Finding generalized projected clusters in high dimensional spaces

SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data
A Quantitative Analysis and Performance Study for Similarity-Search Methods in High-Dimensional Spaces

VLDB '98 Proceedings of the 24th International Conference on Very Large Data Bases

C2VA: Trim High Dimensional Indexes

WAIM '02 Proceedings of the Third International Conference on Advances in Web-Age Information Management
CVA file: an index structure for high-dimensional datasets

Knowledge and Information Systems
DDR: an index method for large time-series datasets

Information Systems
A new indexing method for high dimensional dataset

DASFAA'05 Proceedings of the 10th international conference on Database Systems for Advanced Applications

Abstract

This paper proposes a new dimensionality reduction technique and an indexing mechanism for high-dimensional data sets in which data points are not uniformly distributed. The proposed technique decomposes the data space into convex polyhedra, and the dimensionality of each data point is reduced according to which polyhedron contains it. One advantage of the technique is that it reduces the dimensionality locally. This local dimensionality reduction helps improve indexing mechanisms for non-uniformly distributed data sets.

To show the applicability and effectiveness of the proposed technique, this paper describes a new indexing mechanism called the CVA-file (Compact VA-File), a revised version of the VA-file. With the proposed dimensionality reduction technique, the size of the data points stored in index files can be reduced. Furthermore, upper and lower bounds for each entry in the index files can be estimated by using geometric properties of convex polyhedra. Results from experimental simulations show that the CVA-file outperforms the VA-file on non-uniformly distributed real data sets.
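
For readers unfamiliar with the baseline, the following is a minimal sketch of the general VA-file idea that the CVA-file revises: each coordinate is quantized to a small cell index, and lower and upper distance bounds are derived from the cell alone. It is not the CVA-file itself; the polyhedron-based local dimensionality reduction is not reproduced here. The function names `quantize` and `distance_bounds`, the uniform grid, and the Euclidean metric are assumptions made for illustration.

```python
import math


def quantize(point, lo, hi, bits):
    """Map each coordinate to a cell index in [0, 2**bits - 1] on a uniform grid.

    lo and hi give the assumed per-dimension range of the data space.
    """
    cells = 1 << bits
    approx = []
    for x, a, b in zip(point, lo, hi):
        idx = int((x - a) / (b - a) * cells)
        # Clamp so that x == hi falls into the last cell.
        approx.append(min(max(idx, 0), cells - 1))
    return approx


def distance_bounds(query, approx, lo, hi, bits):
    """Return (lower, upper) Euclidean distance bounds from the query
    to any point that lies inside the cell described by approx."""
    cells = 1 << bits
    lower_sq = 0.0
    upper_sq = 0.0
    for q, idx, a, b in zip(query, approx, lo, hi):
        width = (b - a) / cells
        cell_lo = a + idx * width
        cell_hi = cell_lo + width
        # Nearest possible coordinate in the cell gives the lower bound.
        if q < cell_lo:
            near = cell_lo - q
        elif q > cell_hi:
            near = q - cell_hi
        else:
            near = 0.0
        # Farthest cell edge gives the upper bound.
        far = max(abs(q - cell_lo), abs(q - cell_hi))
        lower_sq += near * near
        upper_sq += far * far
    return math.sqrt(lower_sq), math.sqrt(upper_sq)
```

In a nearest-neighbor search of this kind, a candidate whose lower bound exceeds the best upper bound seen so far can be skipped without reading the full vector, so tighter bounds or smaller approximations translate directly into less I/O.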