Interpretation of hybrid generative/discriminative algorithms

  • Authors:
  • Jing-Hao Xue; D. Michael Titterington

  • Affiliations:
  • Jing-Hao Xue: Department of Statistics, University of Glasgow, Glasgow G12 8QQ, UK, and Department of Statistical Science, University College London, London WC1E 6BT, UK
  • D. Michael Titterington: Department of Statistics, University of Glasgow, Glasgow G12 8QQ, UK

  • Venue:
  • Neurocomputing
  • Year:
  • 2009

Abstract

In discriminant analysis, probabilistic generative and discriminative approaches represent two paradigms of statistical modelling and learning. To exploit the best of both worlds, hybrid modelling and learning techniques have recently attracted much research interest, one example being the hybrid generative/discriminative algorithm proposed by Raina et al. [Classification with hybrid generative/discriminative models, in: NIPS, 2003] and its multi-class extension [A. Fujino, N. Ueda, K. Saito, A hybrid generative/discriminative approach to text classification with additional information, Inf. Process. Manage. 43(2) (2007) 379-392]. In this paper, we interpret this hybrid algorithm from three perspectives: the class-conditional probabilities, the class-posterior probabilities and the loss functions underlying the model. We suggest that the hybrid algorithm is by nature a generative model whose parameters are learnt through both generative and discriminative approaches, in the sense that it assumes a scaled data-generation process and uses scaled class-posterior probabilities to perform discrimination. This suggestion also applies to the multi-class extension. In addition, using simulated and real-world data, we compare the performance of the normalised hybrid algorithm as a classifier with that of the naive Bayes classifier and of linear logistic regression. Our simulation studies suggest the following general pattern: when the covariance matrices are diagonal, the naive Bayes classifier performs best; when the covariance matrices are full, linear logistic regression performs best. Our studies also suggest that the hybrid algorithm may perform worse than either the naive Bayes classifier or linear logistic regression alone.
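The diagonal-versus-full-covariance contrast in the abstract can be illustrated with a minimal sketch (not the authors' experimental code; the means, covariances and sample sizes below are illustrative assumptions): Gaussian naive Bayes, which assumes feature independence within each class, is compared against linear logistic regression on two-class Gaussian data whose shared covariance is either diagonal or full.

```python
# Illustrative sketch, NOT the paper's setup: compare Gaussian naive Bayes
# with linear logistic regression on simulated two-class Gaussian data,
# once with a diagonal covariance and once with a full (correlated) one.
import numpy as np

rng = np.random.default_rng(0)

def simulate(n, cov):
    """Two Gaussian classes sharing covariance `cov`, with means -mu and +mu."""
    mu = np.array([1.0, 1.0])
    x0 = rng.multivariate_normal(-mu, cov, n)
    x1 = rng.multivariate_normal(mu, cov, n)
    return np.vstack([x0, x1]), np.r_[np.zeros(n), np.ones(n)]

def fit_naive_bayes(X, y):
    """Generative fit: per-class, per-feature means and variances
    (the naive independence assumption) plus class priors."""
    return {c: (X[y == c].mean(0), X[y == c].var(0), (y == c).mean())
            for c in (0, 1)}

def predict_naive_bayes(params, X):
    scores = []
    for c in (0, 1):
        m, v, prior = params[c]
        # log p(x|c) under independent Gaussians, plus log prior
        loglik = -0.5 * (np.log(2 * np.pi * v) + (X - m) ** 2 / v).sum(1)
        scores.append(loglik + np.log(prior))
    return (scores[1] > scores[0]).astype(float)

def fit_logistic(X, y, lr=0.1, iters=2000):
    """Discriminative fit: plain gradient ascent on the logistic log-likelihood."""
    Xb = np.hstack([X, np.ones((len(X), 1))])   # append intercept column
    w = np.zeros(Xb.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-Xb @ w))
        w += lr * Xb.T @ (y - p) / len(y)
    return w

def predict_logistic(w, X):
    Xb = np.hstack([X, np.ones((len(X), 1))])
    return (Xb @ w > 0).astype(float)

for name, cov in [("diagonal", np.diag([1.0, 2.0])),
                  ("full", np.array([[1.0, 0.8], [0.8, 2.0]]))]:
    Xtr, ytr = simulate(500, cov)
    Xte, yte = simulate(500, cov)
    nb = fit_naive_bayes(Xtr, ytr)
    w = fit_logistic(Xtr, ytr)
    acc_nb = (predict_naive_bayes(nb, Xte) == yte).mean()
    acc_lr = (predict_logistic(w, Xte) == yte).mean()
    print(f"{name} covariance: naive Bayes acc={acc_nb:.3f}, logistic acc={acc_lr:.3f}")
```

In the full-covariance case the naive independence assumption is misspecified, while logistic regression only models the (linear) class boundary, which matches the general pattern reported in the abstract; exact accuracies depend on the simulated parameters.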