Feature selection and dualities in maximum entropy discrimination

Authors:
Tony Jebara; Tommi Jaakkola
Affiliations:
MIT Media Lab, Massachusetts Institute of Technology, Cambridge, MA; MIT Media Lab, Massachusetts Institute of Technology, Cambridge, MA
Venue:
UAI'00 Proceedings of the Sixteenth conference on Uncertainty in artificial intelligence
Year:
2000

Abstract

Incorporating feature selection into a classification or regression method often carries a number of advantages. In this paper we formalize feature selection specifically from a discriminative perspective of improving classification/regression accuracy. The feature selection method is developed as an extension to the recently proposed maximum entropy discrimination (MED) framework. We describe MED as a flexible (Bayesian) regularization approach that subsumes, e.g., support vector classification, regression and exponential family models. For brevity, we restrict ourselves primarily to feature selection in the context of linear classification/regression methods and demonstrate that the proposed approach indeed yields substantial improvements in practice. Moreover, we discuss and develop various extensions of feature selection, including the problem of dealing with example-specific but unobserved degrees of freedom such as alignments or invariants.