A predictive deviance criterion for selecting a generative model in semi-supervised classification

  • Authors:
  • Vincent Vandewalle; Christophe Biernacki; Gilles Celeux; Gérard Govaert

  • Affiliations:
  • INRIA, France and UMR 8524, CNRS & Université Lille 1, 59655 Villeneuve d'Ascq, France and EA 2694, Université Lille 2, 59045 Lille, France; INRIA, France and EA 2694, Université Lille 2, 59045 Lille, France; INRIA, France; UMR 7253, CNRS & Université de Technologie de Compiègne, 60205 Compiègne, France

  • Venue:
  • Computational Statistics & Data Analysis
  • Year:
  • 2013

Abstract

Semi-supervised classification can improve generative classifiers by taking into account the information carried by the unlabeled data points, especially when there are far more unlabeled data than labeled data. The aim is to select a generative classification model using both unlabeled and labeled data. A predictive deviance criterion, AIC_cond, is proposed for selecting a parsimonious and relevant generative classifier in the semi-supervised context. In contrast to standard information criteria such as AIC and BIC, AIC_cond is focused on the classification task, since it attempts to measure the predictive power of a generative model by approximating its predictive deviance. At the same time, it avoids the computational cost of cross-validation criteria, which make repeated use of the EM algorithm. AIC_cond is proved to have consistency properties that ensure its parsimony when compared with the Bayesian Entropy Criterion (BEC), whose focus is similar to that of AIC_cond. Numerical experiments on both simulated and real data sets show that the behavior of AIC_cond with respect to variable and model selection is encouraging when compared with the competing criteria.
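The criteria discussed in the abstract are all computed from a generative classifier fitted by EM on labeled plus unlabeled data. As a minimal sketch of that setting (not the paper's AIC_cond, whose penalty is specific to the article), the snippet below fits a two-class univariate Gaussian classifier by semi-supervised EM and scores it with the standard AIC and BIC; the simulated data, class means, and function names are illustrative assumptions.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

# Illustrative simulated data: two Gaussian classes (means -2 and +2),
# few labeled points, many unlabeled points.
n_lab, n_unlab = 30, 300
z_lab = rng.integers(0, 2, n_lab)
x_lab = rng.normal(loc=np.where(z_lab == 0, -2.0, 2.0), scale=1.0)
z_hidden = rng.integers(0, 2, n_unlab)              # never shown to the model
x_unlab = rng.normal(loc=np.where(z_hidden == 0, -2.0, 2.0), scale=1.0)

def fit_semisup_em(x_lab, z_lab, x_unlab, n_iter=100):
    """Semi-supervised EM for a two-component Gaussian mixture classifier."""
    # Initialise the parameters from the labeled points only.
    pi = np.array([np.mean(z_lab == 0), np.mean(z_lab == 1)])
    mu = np.array([x_lab[z_lab == k].mean() for k in (0, 1)])
    sd = np.array([x_lab[z_lab == k].std() + 1e-3 for k in (0, 1)])
    x_all = np.concatenate([x_lab, x_unlab])
    for _ in range(n_iter):
        # E-step: responsibilities for the unlabeled points.
        dens = np.stack([pi[k] * norm.pdf(x_unlab, mu[k], sd[k]) for k in (0, 1)])
        resp = dens / dens.sum(axis=0)
        # M-step: labeled points enter with their known (hard) labels.
        for k in (0, 1):
            w = np.concatenate([(z_lab == k).astype(float), resp[k]])
            nk = w.sum()
            pi[k] = nk / len(x_all)
            mu[k] = (w * x_all).sum() / nk
            sd[k] = np.sqrt((w * (x_all - mu[k]) ** 2).sum() / nk)
    return pi, mu, sd

def loglik(pi, mu, sd, x_lab, z_lab, x_unlab):
    """Observed-data log-likelihood: labels known for x_lab, marginal for x_unlab."""
    ll = sum(np.log(pi[k] * norm.pdf(x_lab[z_lab == k], mu[k], sd[k])).sum()
             for k in (0, 1))
    mix = sum(pi[k] * norm.pdf(x_unlab, mu[k], sd[k]) for k in (0, 1))
    return ll + np.log(mix).sum()

pi, mu, sd = fit_semisup_em(x_lab, z_lab, x_unlab)
ll = loglik(pi, mu, sd, x_lab, z_lab, x_unlab)
nu = 5  # free parameters: 1 mixing proportion, 2 means, 2 standard deviations
n = n_lab + n_unlab
aic = -2 * ll + 2 * nu           # standard AIC penalty
bic = -2 * ll + nu * np.log(n)   # standard BIC penalty
print(f"AIC = {aic:.1f}, BIC = {bic:.1f}")
```

Comparing such scores across candidate models (e.g. common versus class-specific variances) selects the model with the smallest criterion value; AIC_cond replaces the full observed-data likelihood with an approximation of the model's predictive deviance, so that selection targets classification performance rather than density fit.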