An introduction to the approaches used to discretise continuous database features is given, together with a discussion of the potential benefits of such techniques. These benefits are investigated by applying discretisation algorithms to two large commercial databases; the discretisations yielded are then evaluated using a simulated annealing based data mining algorithm. The results produced suggest that dramatic reductions in problem size may be achieved, yielding improvements in the speed of the data mining algorithm. However, it is also demonstrated under certain circumstances that the discretisation produced may give an increase in problem size or allow overfitting by the data mining algorithm. Such cases, within which often only a small proportion of the database belongs to the class of interest, highlight the need both for caution when producing discretisations and for the development of more robust discretisation algorithms.