In this study, the C4.5 decision tree algorithm was applied to sow herd datasets to investigate its robustness and the limitations of this analysis technique. First, simulated sow herd datasets that included inconsistent farmer culling policies, as they appear in real datasets, were classified. These results were compared with those for datasets generated with very uniform and simple replacement rules. Furthermore, the different pruning methods that can be selected in the decision tree tool were optimised. The evaluation parameters of all classifications were calculated with stratified cross-validation; varying the number of folds showed that 10 folds was an appropriate number for subdividing the datasets. Simplifying the sow selection in the simulation improved the sensitivity and error rate of the classifications. In particular, datasets with randomly selected and inconsistent culling rules showed lower sensitivities, between 20 and 53%. A comparison of classifications with and without pruning showed that pruning resulted in smaller trees. In the example with the most branching, the pruned tree had 23 fewer leaves and 46 fewer nodes than the unpruned tree. The two pruning methods differed in the classification parameters and in tree size, depending on the sow herd performance level and the size of the datasets.
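The evaluation scheme described above can be illustrated with a minimal Python sketch. It is not the decision tree tool used in the study: the function names, the round-robin fold assignment, and the labels "cull" and "keep" are assumptions made for illustration only. The sketch splits sample indices into stratified folds and computes sensitivity and error rate from true and predicted labels.

```python
from collections import defaultdict


def stratified_folds(labels, k=10):
    """Assign sample indices to k folds so each fold has similar class proportions.

    Assumption: indices of each class are dealt round-robin across the folds,
    without shuffling, so the split is deterministic.
    """
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    offset = 0  # continue dealing where the previous class stopped
    for label in sorted(by_class):
        for i, idx in enumerate(by_class[label]):
            folds[(offset + i) % k].append(idx)
        offset += len(by_class[label])
    return folds


def sensitivity_and_error_rate(y_true, y_pred, positive):
    """Return (sensitivity for the positive class, overall error rate)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    errors = sum(1 for t, p in pairs if t != p)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    return sensitivity, errors / len(pairs)


# Hypothetical herd: 20 culled sows and 80 retained sows.
labels = ["cull"] * 20 + ["keep"] * 80
folds = stratified_folds(labels, k=10)

# Hypothetical predictions for four sows.
sens, err = sensitivity_and_error_rate(
    ["cull", "cull", "keep", "keep"],
    ["cull", "keep", "keep", "cull"],
    positive="cull",
)
```

In a full cross-validation run, each fold would serve once as the test set while a tree is trained on the remaining folds, and the two measures would be aggregated over all folds.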