Discretization refers to splitting the range of continuous values into intervals so as to provide useful information about classes. This is usually done by minimizing a goodness measure, subject to constraints such as the maximal number of intervals, the minimal number of examples per interval, or some stopping criterion for splitting. We take a different approach by searching for minimum splits that minimize the number of intervals with respect to a threshold of impurity (i.e., badness). We propose a "total entropy" motivated selection of the "best" split from minimum splits, without requiring additional constraints. Experiments show that the proposed method produces better decision trees.
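One way to read the idea above is as a two-level optimization: first find the smallest number of intervals whose impurity all stays at or below the threshold, then, among the splits that achieve that minimum, keep the one with the lowest total entropy. The following is a minimal sketch of that reading and not the authors' algorithm. The function name `min_split_discretize`, the parameter `max_impurity`, the use of Shannon entropy as the impurity measure, and the definition of "total entropy" as the size-weighted sum of interval entropies are all assumptions made for illustration.

```python
from collections import Counter
from math import log2


def entropy(counts):
    """Shannon entropy (in bits) of a class-count table."""
    n = sum(counts.values())
    return sum(c / n * log2(n / c) for c in counts.values() if c)


def min_split_discretize(values, labels, max_impurity):
    """Return cut points for one continuous attribute.

    Assumed reading: use the fewest intervals such that each interval has
    entropy <= max_impurity; break ties by the lowest size-weighted sum of
    interval entropies.
    """
    pairs = sorted(zip(values, labels), key=lambda p: p[0])
    n = len(pairs)
    # A boundary at index j separates pairs[j-1] from pairs[j]; it is only
    # legal between distinct attribute values (or at either end).
    is_cut = [j == 0 or j == n or pairs[j - 1][0] != pairs[j][0]
              for j in range(n + 1)]

    inf = (float("inf"), float("inf"))
    best = [inf] * (n + 1)   # best[j] = (interval count, total entropy) for pairs[:j]
    best[0] = (0, 0.0)
    back = [None] * (n + 1)  # back[j] = start index of the last interval

    for end in range(1, n + 1):
        if not is_cut[end]:
            continue
        counts = Counter()
        for start in range(end - 1, -1, -1):
            counts[pairs[start][1]] += 1
            if not is_cut[start] or best[start] == inf:
                continue
            h = entropy(counts)
            if h > max_impurity:
                continue
            cand = (best[start][0] + 1, best[start][1] + (end - start) * h)
            if cand < best[end]:  # lexicographic: fewer intervals, then lower entropy
                best[end], back[end] = cand, start

    if best[n] == inf:
        raise ValueError("no split satisfies the impurity threshold")

    # Walk back through the chosen intervals and emit midpoints as cut points.
    cuts, j = [], n
    while j > 0:
        s = back[j]
        if s > 0:
            cuts.append((pairs[s - 1][0] + pairs[s][0]) / 2)
        j = s
    return cuts[::-1]
```

For example, `min_split_discretize([1, 2, 3, 4, 5], list("aabbb"), 0.95)` needs two intervals, because the single interval has entropy of about 0.97. Three two-interval splits satisfy the threshold, and the sketch selects the cut at 2.5 because both resulting intervals are pure. The dynamic program is quadratic in the number of examples, which is adequate for illustration but not tuned for large data sets.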