SubSpace Projection: A unified framework for a class of partition-based dimension reduction techniques

  • Authors:
  • Hao Cheng;Khanh Vu;Kien A. Hua

  • Affiliations:
  • School of Electrical Engineering and Computer Science, University of Central Florida, Orlando, FL 32816, USA;School of Electrical Engineering and Computer Science, University of Central Florida, Orlando, FL 32816, USA;School of Electrical Engineering and Computer Science, University of Central Florida, Orlando, FL 32816, USA

  • Venue:
  • Information Sciences: an International Journal
  • Year:
  • 2009

Quantified Score

Hi-index 0.07

Visualization

Abstract

Similarity search in high dimensional space is a nontrivial problem due to the so-called curse of dimensionality. Recent techniques such as Piecewise Aggregate Approximation (PAA), Segmented Means (SMEAN) and Mean-Standard deviation (MS) prove to be very effective in reducing data dimensionality by partitioning dimensions into subsets and extracting aggregate values from each dimension subset. These partition-based techniques have many advantages including very efficient multi-phased approximation while being simple to implement. They, however, are not adaptive to the different characteristics of data in diverse applications. We propose SubSpace Projection (SSP) as a unified framework for these partition-based techniques. SSP projects data onto subspaces and computes a fixed number of salient features with respect to a reference vector. A study of the relationships between query selectivity and the corresponding space partitioning schemes uncovers indicators that can be used to predict the performance of the partitioning configuration. Accordingly, we design a greedy algorithm to efficiently determine a good partitioning of the data dimensions. The results of our extensive experiments indicate that the proposed method consistently outperforms state-of-the-art techniques.