Elements of information theory
Beyond uniformity and independence: analysis of R-trees using the concept of fractal dimension
PODS '94 Proceedings of the thirteenth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems
Selection of relevant features and examples in machine learning
Artificial Intelligence - Special issue on relevance
Self-spacial join selectivity estimation using fractal concepts
ACM Transactions on Information Systems (TOIS)
Finding generalized projected clusters in high dimensional spaces
SIGMOD '00 Proceedings of the 2000 ACM SIGMOD international conference on Management of data
Using the fractal dimension to cluster datasets
Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining
Feature Selection for Knowledge Discovery and Data Mining
Estimating the Intrinsic Dimension of Data with a Fractal-Based Method
IEEE Transactions on Pattern Analysis and Machine Intelligence
Laplacian Eigenmaps for dimensionality reduction and data representation
Neural Computation
Computing Clusters of Correlation Connected objects
SIGMOD '04 Proceedings of the 2004 ACM SIGMOD international conference on Management of data
CURLER: finding and visualizing nonlinear correlation clusters
Proceedings of the 2005 ACM SIGMOD international conference on Management of data
Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining
An Algorithm for Finding Intrinsic Dimensionality of Data
IEEE Transactions on Computers
CARE: Finding Local Linear Correlations in High Dimensional Data
ICDE '08 Proceedings of the 2008 IEEE 24th International Conference on Data Engineering
Searching for interacting features
IJCAI'07 Proceedings of the 20th international joint conference on Artificial intelligence
Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining
Finding latent patterns in high-dimensional data is an important research problem with numerous applications. The best-known approaches to high-dimensional data analysis are feature selection and dimensionality reduction. Widely used in practice, these methods aim to capture global patterns and are typically performed in the full feature space. In many emerging applications, however, scientists are interested in local latent patterns held by feature subspaces, which may be invisible under any global transformation. In this paper, we investigate the problem of finding strong linear and nonlinear correlations hidden in feature subspaces of high-dimensional data. We formalize this problem as identifying reducible subspaces in the full-dimensional space. Intuitively, a reducible subspace is a feature subspace whose intrinsic dimensionality is smaller than its number of features. We present an effective algorithm, REDUS, for finding reducible subspaces. Its two key components are finding the overall reducible subspace and uncovering the individual reducible subspaces within it. A broad experimental evaluation demonstrates the effectiveness of our algorithm.
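To make the notion of a reducible subspace concrete, the following is a minimal sketch of the idea that a feature subspace is "reducible" when its intrinsic dimensionality is smaller than its number of features. Note the hedging: REDUS itself relies on a fractal (correlation-dimension) estimate of intrinsic dimensionality, which also captures nonlinear correlations; the PCA-style effective rank below is only a simplified linear proxy for illustration, and `effective_rank` is a hypothetical helper, not part of the paper's algorithm.

```python
import numpy as np

def effective_rank(X, var_threshold=0.99):
    """Linear proxy for intrinsic dimensionality: the number of
    principal components needed to explain `var_threshold` of the
    variance. (REDUS uses a fractal correlation-dimension estimate
    instead, which also detects nonlinear correlations.)"""
    Xc = X - X.mean(axis=0)                      # center the features
    s = np.linalg.svd(Xc, compute_uv=False)     # singular values
    var = s**2 / np.sum(s**2)                    # explained-variance ratios
    return int(np.searchsorted(np.cumsum(var), var_threshold) + 1)

rng = np.random.default_rng(0)
t = rng.uniform(-1, 1, 500)

# Three features generated by one latent variable t: intrinsic
# dimensionality 1 < 3 features, so this subspace is reducible.
reducible = np.column_stack([t, 2 * t + 1, -0.5 * t])

# Three independent features: intrinsic dimensionality equals the
# number of features, so this subspace is not reducible.
irreducible = rng.uniform(-1, 1, (500, 3))

print(effective_rank(reducible))    # 1
print(effective_rank(irreducible))  # 3
```

A linear check like this would miss, say, points on a circle (intrinsic dimensionality 1 inside 2 features), which is precisely why a fractal-dimension estimate is used in the paper for the nonlinear case.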