A non-parametric semi-supervised discretization method

  • Authors: Alexis Bondu; Marc Boullé; Vincent Lemaire

  • Affiliations: EDF R&D (ICAME/SOAD), 1 av. Général de Gaulle, 92140, Clamart, France; ORANGE LABS (TECH/EASY/TSI), 2 av. Pierre Marzin, 22300, Lannion, France; ORANGE LABS (TECH/EASY/TSI), 2 av. Pierre Marzin, 22300, Lannion, France

  • Venue: Knowledge and Information Systems
  • Year: 2010

Abstract

Semi-supervised classification methods aim to exploit both labeled and unlabeled examples to train a predictive model. Most of these approaches make assumptions about the distribution of classes. This article first proposes a new semi-supervised discretization method that adopts a very weakly informative prior on the data. The method discretizes the numerical domain of a continuous input variable while preserving the information relevant to predicting the classes. An in-depth comparison of this semi-supervised method with the original supervised MODL approach is then presented. We demonstrate that the semi-supervised approach is asymptotically equivalent to the supervised approach, improved by a post-optimization of the location of the interval bounds.
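
To make the setting concrete, the following minimal Python sketch shows what a discretization of one continuous variable looks like when only some examples carry a class label. It is an illustration only and not the method of the article: the function name `interval_class_counts`, the use of `None` to mark unlabeled examples, and the fixed cut point are all assumptions made for this example. The article's method chooses the intervals itself; here they are given by hand.

```python
from bisect import bisect_right
from collections import Counter


def interval_class_counts(values, labels, cut_points):
    """Summarize a discretization of one continuous variable.

    cut_points must be sorted and define len(cut_points) + 1 intervals;
    a value equal to a cut point falls into the interval on its right.
    A label of None marks an unlabeled example (a convention assumed here).
    Returns per-interval class counts and per-interval unlabeled counts.
    """
    n_intervals = len(cut_points) + 1
    labeled = [Counter() for _ in range(n_intervals)]
    unlabeled = [0] * n_intervals
    for x, y in zip(values, labels):
        i = bisect_right(cut_points, x)  # index of the interval containing x
        if y is None:
            unlabeled[i] += 1
        else:
            labeled[i][y] += 1
    return labeled, unlabeled


# Toy data: seven values, three of them unlabeled.
values = [0.1, 0.4, 0.5, 0.9, 1.3, 1.7, 2.2]
labels = ["a", "a", None, None, "b", None, "b"]
labeled, unlabeled = interval_class_counts(values, labels, cut_points=[1.0])
```

With the single cut point at 1.0, each interval contains labeled examples of only one class, so the discretized variable keeps the information needed to predict the class. The unlabeled counts show how many additional examples fall in each interval without contributing class information.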