Elements of information theory
Principal component neural networks: theory and applications
Inducing Features of Random Fields
IEEE Transactions on Pattern Analysis and Machine Intelligence
Adjustment Learning and Relevant Component Analysis
ECCV '02 Proceedings of the 7th European Conference on Computer Vision-Part IV
Sufficient dimensionality reduction
The Journal of Machine Learning Research
A comparison of algorithms for maximum entropy parameter estimation
COLING-02 proceedings of the 6th conference on Natural language learning - Volume 20
Separating Style and Content with Bilinear Models
Neural Computation
A scalable supervised algorithm for dimensionality reduction on streaming data
Information Sciences: an International Journal
Unsupervised dimensionality reduction of stochastic variables that preserves their most relevant characteristics is fundamental to the analysis of complex data. Unfortunately, the problem is ill-defined, since natural datasets inherently contain alternative underlying structures. In this paper we address this problem by extending the recently introduced "Sufficient Dimensionality Reduction" feature extraction method [7] to use "side information" about irrelevant structures in the data. The use of such irrelevance information was recently demonstrated successfully in the context of clustering via the Information Bottleneck method [1]. Here we use this side-information framework to identify continuous features whose measurements are maximally informative about the main dataset but carry as little information as possible about the irrelevance dataset. In statistical terms, this can be understood as extracting statistics that are maximally sufficient for the main dataset while simultaneously maximally ancillary for the irrelevance dataset. We formulate this as a tradeoff optimization problem and describe its analytic and algorithmic solutions. We demonstrate the method on a synthetic example and on a real-world application to face images, where it outperforms other methods such as Oriented Principal Component Analysis.
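The tradeoff described in the abstract can be illustrated with a deliberately simplified, discrete sketch. This is not the paper's algorithm: the paper works with continuous features and its own information measure, whereas the code below merely scores a candidate discrete feature by its mutual information with a main variable minus a weighted mutual information with an irrelevance variable. The function names, the joint-table inputs, and the weight `gamma` are all hypothetical choices made for illustration.

```python
from math import log2


def mutual_information(joint):
    """Mutual information in bits of a joint table joint[t][y].

    The table may hold probabilities or raw counts; it is normalized here.
    """
    total = sum(sum(row) for row in joint)
    p = [[v / total for v in row] for row in joint]
    p_t = [sum(row) for row in p]          # marginal of the feature value t
    p_y = [sum(col) for col in zip(*p)]    # marginal of the variable y
    mi = 0.0
    for t, row in enumerate(p):
        for y, v in enumerate(row):
            if v > 0:                      # 0 * log(0) is taken as 0
                mi += v * log2(v / (p_t[t] * p_y[y]))
    return mi


def tradeoff_score(joint_main, joint_side, gamma=1.0):
    """Illustrative objective: information kept about the main variable
    minus gamma times information leaked about the irrelevance variable.
    gamma is an assumed tradeoff weight, not a value from the paper."""
    return mutual_information(joint_main) - gamma * mutual_information(joint_side)


# Feature A determines the main variable and is independent of the side variable.
feature_a = tradeoff_score([[0.5, 0.0], [0.0, 0.5]],
                           [[0.25, 0.25], [0.25, 0.25]])

# Feature B is the reverse: it captures only the irrelevant structure.
feature_b = tradeoff_score([[0.25, 0.25], [0.25, 0.25]],
                           [[0.5, 0.0], [0.0, 0.5]])

print(feature_a, feature_b)
```

Under this toy objective, feature A scores higher than feature B, which is the qualitative behavior the abstract asks of a good feature: informative about the main dataset, uninformative about the irrelevance dataset.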