C4.5: programs for machine learning
Multivariate discretization of continuous variables for set mining
Proceedings of the sixth ACM SIGKDD international conference on Knowledge discovery and data mining
An Evolutionary Algorithm Using Multivariate Discretization for Decision Rule Induction
PKDD '99 Proceedings of the Third European Conference on Principles of Data Mining and Knowledge Discovery
ChiMerge: discretization of numeric attributes
AAAI'92 Proceedings of the tenth national conference on Artificial intelligence
Wrapper discretization by means of estimation of distribution algorithms
Intelligent Data Analysis
An ICA-Based multivariate discretization algorithm
KSEM'06 Proceedings of the First international conference on Knowledge Science, Engineering and Management
In supervised learning, discretization of the continuous explanatory attributes enhances the accuracy of decision tree induction algorithms and of the naive Bayes classifier. Many discretization methods have been developed, leading to precise and comprehensible evaluations of the amount of information that a single attribute carries about the target attribute. In this paper, we discuss the multivariate notion of neighborhood, which extends the univariate notion of interval. We propose an evaluation criterion for bipartitions, based on the Minimum Description Length (MDL) principle [1], and apply it recursively. The resulting discretization method is thus able to exploit correlations between continuous attributes. Its accuracy and robustness are evaluated on real and synthetic data sets.
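The abstract does not state the multivariate criterion itself, so the sketch below only illustrates the general scheme of "evaluate a bipartition with an MDL-style test, then recurse". As a stand-in it uses the well-known univariate MDL stopping rule of Fayyad and Irani on a single attribute; it is not the neighborhood-based criterion proposed in the paper. All function names (`entropy`, `best_bipartition`, `mdl_accepts`, `discretize`) are hypothetical and chosen for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def best_bipartition(segment):
    """Find the cut of a value-sorted list of (value, label) pairs with the
    highest information gain. Returns (gain, index) or None if no cut exists."""
    labels = [y for _, y in segment]
    n = len(segment)
    base = entropy(labels)
    best = None
    for i in range(1, n):
        if segment[i - 1][0] == segment[i][0]:
            continue  # no boundary between identical values
        gain = (base
                - (i / n) * entropy(labels[:i])
                - ((n - i) / n) * entropy(labels[i:]))
        if best is None or gain > best[0]:
            best = (gain, i)
    return best


def mdl_accepts(labels, left, right, gain):
    """Fayyad-Irani MDL test (a stand-in, NOT the paper's criterion):
    accept the split only if its gain exceeds the cost of encoding it."""
    n = len(labels)
    k, k1, k2 = len(set(labels)), len(set(left)), len(set(right))
    delta = math.log2(3 ** k - 2) - (
        k * entropy(labels) - k1 * entropy(left) - k2 * entropy(right))
    return gain > (math.log2(n - 1) + delta) / n


def discretize(values, labels):
    """Recursively bipartition one continuous attribute; return sorted cut points."""
    pairs = sorted(zip(values, labels), key=lambda p: p[0])
    cuts = []

    def recurse(segment):
        if len(segment) < 2:
            return
        found = best_bipartition(segment)
        if found is None:
            return
        gain, i = found
        seg_labels = [y for _, y in segment]
        if not mdl_accepts(seg_labels, seg_labels[:i], seg_labels[i:], gain):
            return  # the split does not pay for its own description length
        cuts.append((segment[i - 1][0] + segment[i][0]) / 2)
        recurse(segment[:i])
        recurse(segment[i:])

    recurse(pairs)
    return sorted(cuts)


if __name__ == "__main__":
    xs = list(range(1, 21))
    ys = ["a"] * 10 + ["b"] * 10
    print(discretize(xs, ys))
```

Because this stand-in looks at one attribute at a time, it cannot exploit correlations between attributes; a multivariate method along the lines the abstract describes would presumably replace the interval cut by a split of a neighborhood in the joint attribute space.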