Predictive Subspace Clustering
ICMLA '11 Proceedings of the 2011 10th International Conference on Machine Learning and Applications and Workshops - Volume 01
In several application domains, high-dimensional observations are collected and then analysed in search of naturally occurring clusters that might provide further insight into the nature of the problem. In this paper we describe a new approach for partitioning such high-dimensional data. Our assumption is that, within each cluster, the data can be approximated well by a linear subspace estimated by means of a principal component analysis (PCA). The proposed algorithm, Predictive Subspace Clustering (PSC), partitions the data into clusters while simultaneously estimating cluster-wise PCA parameters. The algorithm minimises an objective function that depends upon a new measure of influence for PCA models. A penalised version of the algorithm is also described for carrying out simultaneous subspace clustering and variable selection. The convergence of PSC is discussed in detail, and extensive simulation results and comparisons to competing methods are presented. The comparative performance of PSC has been assessed on six real gene expression data sets, on which PSC often provides state-of-the-art results.
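The general idea of partitioning data while fitting one PCA model per cluster can be illustrated with a minimal sketch. This is not the PSC algorithm: the abstract does not define its influence measure, so the sketch below substitutes plain squared reconstruction error as the assignment criterion (a K-subspaces-style alternation). The function names `fit_pca`, `residuals` and `subspace_cluster`, and all parameters, are hypothetical and chosen only for illustration.

```python
import numpy as np


def fit_pca(X, n_components):
    """Return (mean, basis) of a PCA fitted to the rows of X."""
    mean = X.mean(axis=0)
    # Rows of Vt are principal directions, ordered by decreasing singular value.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:n_components]


def residuals(X, mean, basis):
    """Squared reconstruction error of each row of X under one PCA model."""
    centred = X - mean
    projected = centred @ basis.T @ basis
    return ((centred - projected) ** 2).sum(axis=1)


def subspace_cluster(X, n_clusters, n_components, n_iter=50, seed=0):
    """Alternate between cluster-wise PCA fits and reassignment of points.

    Assumption: points are assigned by reconstruction error, a stand-in
    for the influence-based criterion mentioned in the abstract.
    """
    rng = np.random.default_rng(seed)
    labels = rng.integers(n_clusters, size=len(X))
    for _ in range(n_iter):
        models = []
        for k in range(n_clusters):
            members = X[labels == k]
            if len(members) <= n_components:
                # Too few points to fit a subspace: refit on a random subset.
                idx = rng.choice(len(X), size=n_components + 1, replace=False)
                members = X[idx]
            models.append(fit_pca(members, n_components))
        # One column of errors per cluster model; pick the best model per point.
        errors = np.column_stack([residuals(X, m, B) for m, B in models])
        new_labels = errors.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # assignments are stable
        labels = new_labels
    return labels, models
```

Calling `subspace_cluster(X, n_clusters=2, n_components=1)` on an `(n, d)` array returns one label per row and a list of `(mean, basis)` pairs. Like other alternating schemes of this kind, the result depends on the random initialisation, so several restarts with different seeds would normally be compared.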