Introduction to statistical pattern recognition (2nd ed.)
Introduction to statistical pattern recognition (2nd ed.)
The R*-tree: an efficient and robust access method for points and rectangles
SIGMOD '90 Proceedings of the 1990 ACM SIGMOD international conference on Management of data
SIGMOD '95 Proceedings of the 1995 ACM SIGMOD international conference on Management of data
The SR-tree: an index structure for high-dimensional nearest neighbor queries
SIGMOD '97 Proceedings of the 1997 ACM SIGMOD international conference on Management of data
A cost model for nearest neighbor search in high-dimensional data space
PODS '97 Proceedings of the sixteenth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems
The pyramid-technique: towards breaking the curse of dimensionality
SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data
Fast algorithms for projected clustering
SIGMOD '99 Proceedings of the 1999 ACM SIGMOD international conference on Management of data
Finding generalized projected clusters in high dimensional spaces
SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data
VLDB '98 Proceedings of the 24rd International Conference on Very Large Data Bases
C2VA: Trim High Dimensional Indexes
WAIM '02 Proceedings of the Third International Conference on Advances in Web-Age Information Management
CVA file: an index structure for high-dimensional datasets
Knowledge and Information Systems
DDR: an index method for large time-series datasets
Information Systems
A new indexing method for high dimensional dataset
DASFAA'05 Proceedings of the 10th international conference on Database Systems for Advanced Applications
Hi-index | 0.00 |
This paper proposes a new dimensionality reduction technique and an indexing mechanism for high dimensional data sets in which data points are not uniformly distributed. The proposed technique decomposes a data space into convex polyhedra, and the dimensionality of each data point is reduced according to which polyhedron includes the data point. One of the advantages of the proposed technique is that it reduces the dimensionality locally. This local dimensionality reduction contributes to improve indexing mechanisms for non-uniformly distributed data sets.To show the applicability and the effectiveness of the proposed technique, this paper describes a new indexing mechanism called CVA-file (Compact VA-File) which is a revised version of the VA-file. With the proposed dimensionality reduction technique, the size of data points stored in index files can be reduced. Furthermore, it can estimate upper and lower bounds of each entry in index files by using geographic properties of convex polyhedra. Results from experimental simulations show that the CVA-file is better than the VA-file for non-uniformly distributed real data sets.