In medicine, data mining methods such as Decision Tree Induction (DTI) can be trained to extract rules that predict the outcomes of new patients. However, the incompleteness and high dimensionality of stored data pose a problem. Canonical Correlation Analysis (CCA) can be applied before DTI as a dimension reduction technique that preserves the character of the original data while omitting non-essential data. In this study, data from 3949 breast cancer patients were analysed. Raw data were cleaned by applying a set of logical rules, and missing values were replaced using the Expectation Maximization algorithm. After dimension reduction with CCA, DTI was used to analyse the resulting dataset. The predictive model was validated by ten-fold cross validation, and the effect of pre-processing was assessed by also applying DTI to data without pre-processing. Replacing missing values and using CCA for data reduction dramatically reduced the size of the resulting tree and increased the accuracy of predicting breast cancer recurrence.
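The ten-fold cross validation step mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the data are synthetic, the function names (`k_fold_indices`, `fit_stump`, `predict_stump`, `cross_val_accuracy`) are hypothetical, and a depth-1 decision tree (a "stump") stands in for full decision tree induction to keep the example short. The cleaning, Expectation Maximization and CCA steps are not shown.

```python
import random

def k_fold_indices(n, k=10, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def fit_stump(X, y):
    """Fit a depth-1 decision tree (assumed stand-in for DTI).

    Tries every feature, every observed threshold and both label
    orientations, and keeps the split with the fewest training errors.
    """
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            for left_label in (0, 1):
                errors = sum(
                    (left_label if row[j] <= t else 1 - left_label) != label
                    for row, label in zip(X, y)
                )
                if best is None or errors < best[0]:
                    best = (errors, j, t, left_label)
    _, j, t, left_label = best
    return j, t, left_label

def predict_stump(stump, row):
    """Predict a binary label for one row from a fitted stump."""
    j, t, left_label = stump
    return left_label if row[j] <= t else 1 - left_label

def cross_val_accuracy(X, y, k=10, seed=0):
    """Mean held-out accuracy over k folds."""
    folds = k_fold_indices(len(X), k, seed)
    scores = []
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        stump = fit_stump([X[i] for i in train_idx], [y[i] for i in train_idx])
        correct = sum(predict_stump(stump, X[i]) == y[i] for i in test_idx)
        scores.append(correct / len(test_idx))
    return sum(scores) / len(scores)

# Synthetic two-feature data (an assumption for illustration only):
# the label depends on the first feature alone.
rng = random.Random(42)
X = [[rng.random(), rng.random()] for _ in range(200)]
y = [1 if row[0] > 0.5 else 0 for row in X]

accuracy = cross_val_accuracy(X, y, k=10)
print(f"Mean ten-fold accuracy: {accuracy:.3f}")
```

Each record is held out exactly once, so the reported figure is an average over predictions made on data the model did not see during fitting. Running the same routine on two versions of a dataset (for example, with and without pre-processing) gives a like-for-like comparison of the kind the abstract describes.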