Relative Unsupervised Discretization for Regression Problems

Authors:
Marcus-Christopher Ludl;Gerhard Widmer
Affiliations:
-;-
Venue:
ECML '00 Proceedings of the 11th European Conference on Machine Learning
Year:
2000

Citing 5

C4.5: programs for machine learning

C4.5: programs for machine learning
A New Criterion in Selection and Discretization of Attributes for the Generation of Decision Trees

IEEE Transactions on Pattern Analysis and Machine Intelligence
Class-Driven Statistical Discretization of Continuous Attributes (Extended Abstract)

ECML '95 Proceedings of the 8th European Conference on Machine Learning
Bayesian Network Classification with Continuous Attributes: Getting the Best of Both Discretization and Parametric Fitting

ICML '98 Proceedings of the Fifteenth International Conference on Machine Learning
Minimum splits based discretization for continuous features

IJCAI'97 Proceedings of the Fifteenth international joint conference on Artificial intelligence - Volume 2

Cited 2

Relative Unsupervised Discretization for Association Rule Mining

PKDD '00 Proceedings of the 4th European Conference on Principles of Data Mining and Knowledge Discovery
Generating Linear Regression Rules from Neural Networks Using Local Least Squares Approximation

IWANN '01 Proceedings of the 6th International Work-Conference on Artificial and Natural Neural Networks: Connectionist Models of Neurons, Learning Processes and Artificial Intelligence-Part I

Quantified Score

Hi-index	0.00

Abstract

The paper describes a new, context-sensitive discretization algorithm that combines aspects of unsupervised (class-blind) and supervised methods. The algorithm is applicable to a wide range of machine learning and data mining problems where continuous attributes need to be discretized. In this paper, we evaluate its utility in a regression-by-classification setting. Preliminary experimental results indicate that the decision trees induced using this discretization strategy are significantly smaller and thus more comprehensible than those learned with standard discretization methods, while losing only minimally in numerical prediction accuracy. This may be a considerable advantage in machine learning and data mining applications where comprehensibility is an issue.