An effective discretization based on Class-Attribute Coherence Maximization

  • Authors:
  • Min Li; ShaoBo Deng; Shengzhong Feng; Jianping Fan

  • Affiliations:
  • Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong 518055, PR China and Graduate School of Chinese Academy of Sciences, Beijing 100080, PR China and Nancha ...; Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100080, PR China and Graduate School of Chinese Academy of Sciences, B ...; Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong 518055, PR China; Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong 518055, PR China

  • Venue:
  • Pattern Recognition Letters
  • Year:
  • 2011

Abstract

Discretization of continuous data is an important pre-processing task in data mining and knowledge discovery. In general, discretization can improve the predictive accuracy of induction algorithms, and the resulting rules are usually shorter and easier to understand. In this paper, we present the Class-Attribute Coherence Maximization (CACM) algorithm and the Efficient-CACM algorithm. We compare our algorithms with the most relevant existing method, the Fast Class-Attribute Interdependence Maximization (Fast-CAIM) discretization algorithm (Kurgan and Cios, 2003). An empirical evaluation of our algorithms and Fast-CAIM on 12 well-known datasets shows that our algorithms generate superior discretization schemes, which significantly improve the classification performance of the C4.5 and RBF-SVM classifiers. Our algorithms also discretize faster than Fast-CAIM, with Efficient-CACM having the shortest execution time.
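
As a purely illustrative aside, the sketch below shows what applying a discretization scheme means in practice: a sorted list of cut points splits a continuous attribute into intervals, and each value is replaced by the index of its interval. It is not the CACM, Efficient-CACM, or Fast-CAIM algorithm. The function name `discretize`, the sample values, and the cut points are all hypothetical, and how the cut points are chosen (the subject of the paper) is not shown.

```python
from bisect import bisect_right


def discretize(values, cut_points):
    """Map each continuous value to the index of the interval it falls in.

    `cut_points` must be sorted in ascending order; k cut points define
    k + 1 intervals. A value equal to a cut point is assigned to the
    interval on its right (an arbitrary convention chosen for this sketch).
    """
    return [bisect_right(cut_points, v) for v in values]


# Hypothetical attribute values and hypothetical cut points.
values = [4.9, 5.4, 6.1, 7.0]
cut_points = [5.5, 6.5]

bins = discretize(values, cut_points)
print(bins)  # interval index for each value
```

A supervised method would use the class labels of the training data to decide where the cut points go; this sketch assumes they are already given.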