Optimisation of the decision tree technique applied to simulated sow herd datasets

Authors:
K. Kirchner; K.-H. Tölle; J. Krieter
Affiliations:
Institute of Animal Breeding and Husbandry, Christian-Albrechts-University, Hermann-Rodewald-Straße 6, 24118 Kiel, Germany; Institute of Animal Breeding and Husbandry, Christian-Albrechts-University, Hermann-Rodewald-Straße 6, 24118 Kiel, Germany; Institute of Animal Breeding and Husbandry, Christian-Albrechts-University, Hermann-Rodewald-Straße 6, 24118 Kiel, Germany
Venue:
Computers and Electronics in Agriculture
Year:
2006

Abstract

In this study, the robustness of the C4.5 decision tree algorithm was investigated by applying it to sow herd datasets in order to identify the limitations of this analysis technique. First, simulated sow herd datasets that included inconsistent farmer culling policies, as they appear in real datasets, were classified. These results were compared with those obtained for very uniform and simple replacement rules. Furthermore, the different pruning methods that can be selected in the decision tree tool were optimised. The evaluation parameters of all classifications were calculated with stratified cross-validation; varying the number of folds showed that 10 folds were an appropriate number for subdividing the datasets. Simplifying the sow selection in the simulation improved the sensitivity and error rate of the classifications. In particular, datasets with randomly selected and inconsistent culling rules showed lower sensitivities, between 20 and 53%. A comparison of classifications with and without pruning showed that pruning resulted in smaller trees: in the most highly branched example, the pruned tree had 23 fewer leaves and 46 fewer nodes than the unpruned tree. The two pruning methods differed in the classification parameters and in tree size, depending on the sow herd performance level and the size of the datasets.
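
The evaluation scheme named in the abstract (stratified cross-validation with 10 folds, reporting sensitivity and error rate) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The class labels "cull" and "keep", the function names, and the use of the standard definition of sensitivity (true positives divided by all actual positives) are assumptions made for illustration only.

```python
from collections import defaultdict


def stratified_folds(labels, k):
    """Distribute record indices over k folds so that each fold has
    roughly the same class proportions as the whole dataset."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(k)]
    offset = 0
    # Deal the records of each class round-robin across the folds.
    for label in sorted(by_class):
        for idx in by_class[label]:
            folds[offset % k].append(idx)
            offset += 1
    return folds


def sensitivity_and_error_rate(actual, predicted, positive="cull"):
    """Return (sensitivity, error rate) for one set of predictions.

    Sensitivity is assumed to be TP / (TP + FN) for the positive class;
    error rate is the share of misclassified records.
    """
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    wrong = sum(1 for a, p in pairs if a != p)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    error_rate = wrong / len(pairs)
    return sensitivity, error_rate


# Hypothetical herd: 20 culled sows and 80 retained sows, split into 10 folds.
labels = ["cull"] * 20 + ["keep"] * 80
folds = stratified_folds(labels, 10)
culls_per_fold = [sum(1 for i in fold if labels[i] == "cull") for fold in folds]

# Hypothetical predictions for four sows, used to show the two measures.
sens, err = sensitivity_and_error_rate(
    ["cull", "cull", "keep", "keep"],
    ["cull", "keep", "keep", "cull"],
)
```

In a full cross-validation run, each fold would serve once as the test set while a tree is trained on the remaining nine, and the two measures would be averaged over the folds.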