Correlation maximisation-based discretisation for supervised classification

  • Authors:
  • Qiusha Zhu; Lin Lin; Mei-Ling Shyu

  • Affiliations:
  • Department of Electrical and Computer Engineering, University of Miami, Coral Gables, FL 33124, USA (all authors)

  • Venue:
  • International Journal of Business Intelligence and Data Mining
  • Year:
  • 2012

Abstract

This paper proposes a novel supervised discretisation algorithm based on Correlation Maximisation (CM) using Multiple Correspondence Analysis (MCA), an effective technique for capturing the correlation between multiple variables. For each numeric feature, the proposed algorithm uses MCA to measure the correlations between feature intervals (items) and classes, and selects the set of cut-points yielding the maximum correlation as the discretisation scheme for that feature. The discretised feature therefore not only gives a concise summary of the original numeric feature but also retains the maximum correlation information for predicting class labels. The algorithm is compared with seven state-of-the-art supervised discretisation algorithms, using six well-known classifiers on 19 UCI data sets. The experimental results show that it automatically generates a set of feature intervals that produces the best classification results on average.
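
The idea of choosing cut-points that maximise the correlation between intervals and classes can be illustrated with a minimal sketch. The abstract does not give the paper's MCA-based measure or its search procedure, so the code below is not the authors' algorithm: it assumes a stand-in score, the total inertia (chi-square divided by sample size) of the interval-by-class contingency table, which is the quantity that correspondence analysis decomposes, and it searches for a single cut-point only. The names `total_inertia` and `best_cut_point` are hypothetical.

```python
import numpy as np


def total_inertia(bins, labels):
    """Stand-in correlation score (an assumption, not the paper's measure):
    chi-square / n of the interval-by-class contingency table."""
    bins = np.asarray(bins).ravel()
    labels = np.asarray(labels).ravel()
    # Map interval ids and class labels to 0-based integer codes.
    _, b_idx = np.unique(bins, return_inverse=True)
    _, c_idx = np.unique(labels, return_inverse=True)
    # Build the contingency table of interval-vs-class counts.
    table = np.zeros((b_idx.max() + 1, c_idx.max() + 1))
    np.add.at(table, (b_idx, c_idx), 1)
    n = table.sum()
    # Expected counts under independence of interval and class.
    expected = np.outer(table.sum(axis=1), table.sum(axis=0)) / n
    return float(((table - expected) ** 2 / expected).sum() / n)


def best_cut_point(values, labels):
    """Try every midpoint between consecutive distinct values and keep
    the single cut-point whose two intervals score highest."""
    values = np.asarray(values, dtype=float)
    uniq = np.unique(values)
    candidates = (uniq[:-1] + uniq[1:]) / 2.0
    best_cut, best_score = None, -1.0
    for cut in candidates:
        score = total_inertia(values > cut, labels)
        if score > best_score:
            best_cut, best_score = float(cut), score
    return best_cut, best_score


# Toy feature whose low values belong to class 0 and high values to class 1.
cut, score = best_cut_point([1, 2, 3, 10, 11, 12], [0, 0, 0, 1, 1, 1])
print(cut, score)
```

On this toy feature the selected cut-point falls between 3 and 10, where the two intervals separate the classes perfectly. Extending the sketch to sets of several cut-points would need a criterion that does not simply reward more intervals, which the abstract does not specify.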