Sufficient dimensionality reduction with irrelevance statistics

Authors:
Amir Globerson;Gal Chechik;Naftali Tishby
Affiliations:
School of Computer Science and Engineering and Interdisciplinary Center for Neural Computation, The Hebrew University, Jerusalem, Israel;School of Computer Science and Engineering and Interdisciplinary Center for Neural Computation, The Hebrew University, Jerusalem, Israel;School of Computer Science and Engineering and Interdisciplinary Center for Neural Computation, The Hebrew University, Jerusalem, Israel
Venue:
UAI'03 Proceedings of the Nineteenth conference on Uncertainty in Artificial Intelligence
Year:
2003

Citing 7

Elements of information theory
Principal component neural networks: theory and applications
Inducing Features of Random Fields. IEEE Transactions on Pattern Analysis and Machine Intelligence
Adjustment Learning and Relevant Component Analysis. ECCV '02 Proceedings of the 7th European Conference on Computer Vision-Part IV
Sufficient dimensionality reduction. The Journal of Machine Learning Research
A comparison of algorithms for maximum entropy parameter estimation. COLING-02 proceedings of the 6th conference on Natural language learning - Volume 20
Separating Style and Content with Bilinear Models. Neural Computation

Cited 1

A scalable supervised algorithm for dimensionality reduction on streaming data. Information Sciences: an International Journal

Abstract

Unsupervised dimensionality reduction of stochastic variables, while preserving their most relevant characteristics, is fundamental to the analysis of complex data. Unfortunately, the problem is ill-defined, since natural datasets inherently contain several alternative underlying structures. In this paper we address this difficulty by extending the recently introduced "Sufficient Dimensionality Reduction" feature extraction method [7] to use "side information" about irrelevant structures in the data. The use of such irrelevance information was recently demonstrated in the context of clustering via the Information Bottleneck method [1]. Here we use this side-information framework to identify continuous features whose measurements are maximally informative for the main dataset but carry as little information as possible about the irrelevance dataset. In statistical terms, this amounts to extracting statistics that are maximally sufficient for the main dataset while being maximally ancillary for the irrelevance dataset. We formulate this as a tradeoff optimization problem and describe its analytic and algorithmic solutions. The method is demonstrated on a synthetic example and on a real-world application to face images, where it outperforms other methods such as Oriented Principal Component Analysis.
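The tradeoff described in the abstract can be illustrated with a toy sketch. This is not the paper's algorithm: the paper extracts continuous features, whereas the example below scores a discrete grouping of the values of X, which is closer in spirit to the clustering setting the abstract mentions. The function names, the weight `gamma`, and the two small joint distributions are all hypothetical and chosen only for illustration. The score is the information a grouping keeps about the main variable minus `gamma` times the information it keeps about the irrelevance variable.

```python
from math import log2


def mutual_information(joint):
    """Mutual information I(A;B) in bits for a joint table joint[a][b]."""
    pa = [sum(row) for row in joint]
    pb = [sum(col) for col in zip(*joint)]
    mi = 0.0
    for a, row in enumerate(joint):
        for b, p in enumerate(row):
            if p > 0:
                mi += p * log2(p / (pa[a] * pb[b]))
    return mi


def compress(joint, assignment, n_groups):
    """Merge the rows of joint[x][y] so that row x is added to group assignment[x]."""
    n_cols = len(joint[0])
    merged = [[0.0] * n_cols for _ in range(n_groups)]
    for x, row in enumerate(joint):
        for y, p in enumerate(row):
            merged[assignment[x]][y] += p
    return merged


def tradeoff_score(joint_main, joint_irrelevant, assignment, n_groups, gamma=1.0):
    """Information kept about the main variable minus gamma times the
    information kept about the irrelevance variable (illustrative objective)."""
    i_main = mutual_information(compress(joint_main, assignment, n_groups))
    i_irr = mutual_information(compress(joint_irrelevant, assignment, n_groups))
    return i_main - gamma * i_irr


# Hypothetical data: X takes four values. The main variable depends on
# whether x is in {0, 1} or {2, 3}; the irrelevance variable depends on
# whether x is in {0, 2} or {1, 3}.
joint_main = [[0.25, 0.0], [0.25, 0.0], [0.0, 0.25], [0.0, 0.25]]
joint_irrelevant = [[0.25, 0.0], [0.0, 0.25], [0.25, 0.0], [0.0, 0.25]]

relevant_split = [0, 0, 1, 1]    # follows the main structure
irrelevant_split = [0, 1, 0, 1]  # follows the irrelevance structure

score_relevant = tradeoff_score(joint_main, joint_irrelevant, relevant_split, 2)
score_irrelevant = tradeoff_score(joint_main, joint_irrelevant, irrelevant_split, 2)
print(score_relevant, score_irrelevant)
```

In this toy setting both groupings retain one bit about some variable, so an objective that looks at a single dataset cannot prefer one over the other. Subtracting the irrelevance term separates them: the grouping aligned with the main structure scores higher.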