Data mining is the analysis of experimental datasets to extract trends and relationships that are meaningful to the user. In genetic studies these techniques have revealed interesting findings, especially concerning heritable predisposition to specific diseases. One such disease still under extensive analysis is pre-eclampsia, a progressive disorder that occurs during pregnancy and soon after birth, affecting both mothers and their babies. Many choices must be made when applying the various data mining techniques that can be used to study general genotype-phenotype associations. The aim of this paper is to describe the general framework that we adopted in applying decision tree algorithms to the analysis of SNP data from cases of pre-eclampsia. The results show that this methodology can detect a subset of attributes associated with the predicted variable, thereby reducing the size of the dataset. Moreover, from the clinical point of view, it confirmed the medical interpretation of a 'corrected birth-weight centile' (CBC) value of 10 as a meaningful cut-off, and confirmed an association between an infant's CBC and the 'week of delivery' parameter. We hope that the generic framework described here will be of use to other researchers analysing such data.
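The abstract does not say which decision tree algorithm or software was used, so the following is only a minimal sketch of the general idea, assuming an entropy-based split criterion of the kind used by ID3/C4.5-style trees. It shows how information gain can rank genotype attributes (hinting at which subset matters) and how a tree can find a cut-off in a numeric attribute. All names (`snps`, `outcome`, `centiles`, `small`) and all values are hypothetical toy data invented for illustration; they are not taken from the study.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(values, labels):
    """Entropy reduction obtained by partitioning `labels` on a categorical attribute."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def best_threshold(values, labels):
    """Best binary cut-off (value <= t versus value > t) for a numeric attribute."""
    best_t, best_gain = None, 0.0
    distinct = sorted(set(values))
    for low, high in zip(distinct, distinct[1:]):
        t = (low + high) / 2  # candidate cut-off midway between neighbouring values
        gain = information_gain([v <= t for v in values], labels)
        if gain > best_gain:
            best_t, best_gain = t, gain
    return best_t, best_gain


# Hypothetical genotypes coded 0/1/2 (copies of the minor allele); toy data only.
snps = {
    "snp_a": [0, 0, 1, 1, 2, 2, 0, 1],
    "snp_b": [0, 1, 0, 1, 0, 1, 0, 1],
}
outcome = ["case", "case", "control", "control",
           "control", "control", "case", "control"]

# Rank attributes by information gain; low-gain attributes are candidates to drop.
ranked = sorted(snps, key=lambda name: information_gain(snps[name], outcome),
                reverse=True)

# Hypothetical numeric attribute and binary label, used to search for a cut-off.
centiles = [2, 5, 8, 9, 12, 30, 55, 80]
small = ["yes", "yes", "yes", "yes", "no", "no", "no", "no"]
cutoff, gain = best_threshold(centiles, small)

print(ranked)
print(cutoff, gain)
```

The cut-off found here is an artefact of the toy values and says nothing about the clinical threshold reported in the paper. A full decision tree applies the same two steps recursively to each partition, which is why the attributes it ends up using form a reduced subset of the original dataset.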