Objectives: Despite medical advances, infectious diseases remain a major cause of mortality, morbidity, disability, and socio-economic upheaval worldwide. Early diagnosis, an appropriate choice of antibiotic, and immediate initiation of therapy can greatly affect the outcome of any kind of infection. Phagocytes play a central role in the innate immune response to infection. They form the body's first line of defense against infectious agents and can produce large quantities of reactive oxygen species, which can be detected by means of chemiluminescence (CL). The data preparation approach implemented in this work corresponds to a dynamic assessment of phagocytic respiratory burst localization in a luminol-enhanced whole blood CL system. We have previously applied this approach to the problem of identifying various intra-abdominal pathological processes afflicting peritoneal dialysis patients in the Nephrology department and demonstrated 84.6% predictive accuracy with the C4.5 decision-tree algorithm. In this study, we apply the CL-based approach to a larger sample of patients from two departments (Nephrology and Internal Medicine), with the aim of finding the most effective and interpretable feature sets and classification models for fast and accurate identification of several infectious diseases.
Materials and methods: Whole blood samples were collected from 78 patients (comprising 115 instances) with respiratory infections, with infections associated with renal replacement therapy, or without infections. CL kinetic parameters were calculated for each case, and each case was assigned to a specific clinical group according to the available clinical diagnostics. Wrapper and filter feature selection methods were applied to remove irrelevant and redundant features and to improve the predictive performance of the disease classification algorithms. Three data mining algorithms (the C4.5 (J48) decision tree, support vector machines, and the naive Bayes classifier) were applied to induce disease classification models, and their performance in classifying the three clinical groups was evaluated by 10 runs of stratified 10-fold cross-validation.
Results and conclusions: After feature selection, the predictive accuracy of the best models obtained with the three evaluated algorithms ranged from 63.38±2.18% to 70.68±1.43%. The highest disease classification accuracy was reached by C4.5, which also provides the most informative model in the form of a decision tree; the lowest accuracy was obtained with naive Bayes. The feature selection method attaining the best classification performance was the wrapper method in the forward direction. Moreover, the classification models exposed biological patterns specific to the clinical states, and the selected predictive features were found to be characteristic of a specific disorder. Based on these encouraging results, we believe that the CL-based data pre-processing approach, combined with the forward wrapper feature selection procedure and the C4.5 decision-tree algorithm, has clear potential to become a fast, informative, and sensitive tool for predictive diagnostics of infectious diseases in clinics.
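The evaluation protocol described above (forward wrapper feature selection scored by repeated stratified 10-fold cross-validation) can be illustrated with a minimal, standard-library-only Python sketch. It is not the study's implementation: a simple nearest-centroid classifier stands in for C4.5, support vector machines, and naive Bayes; the data are synthetic; all function names are hypothetical; and reporting the mean and standard deviation of per-run accuracy is only an assumption about how a "mean ± spread" figure might be computed.

```python
import random
import statistics
from collections import defaultdict


def stratified_folds(labels, k, rng):
    """Split sample indices into k folds with similar class proportions."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    offset = 0
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Deal each class round-robin so every fold gets a similar share.
        for pos, idx in enumerate(indices):
            folds[(offset + pos) % k].append(idx)
        offset += len(indices)
    return folds


def nearest_centroid_accuracy(X, y, train, test, features):
    """Fit a nearest-centroid stand-in classifier; return test-fold accuracy."""
    sums, counts = {}, defaultdict(int)
    for i in train:
        acc = sums.setdefault(y[i], [0.0] * len(features))
        for j, f in enumerate(features):
            acc[j] += X[i][f]
        counts[y[i]] += 1
    centroids = {c: [v / counts[c] for v in s] for c, s in sums.items()}

    def squared_distance(i, c):
        return sum((X[i][f] - m) ** 2 for f, m in zip(features, centroids[c]))

    correct = sum(
        min(centroids, key=lambda c: squared_distance(i, c)) == y[i] for i in test
    )
    return correct / len(test)


def repeated_cv(X, y, features, runs=10, k=10, seed=0):
    """Return (mean, std) of per-run accuracy over `runs` stratified k-fold CVs."""
    rng = random.Random(seed)
    run_scores = []
    for _ in range(runs):
        fold_scores = []
        for test in stratified_folds(y, k, rng):
            held_out = set(test)
            train = [i for i in range(len(y)) if i not in held_out]
            fold_scores.append(nearest_centroid_accuracy(X, y, train, test, features))
        run_scores.append(statistics.mean(fold_scores))
    return statistics.mean(run_scores), statistics.stdev(run_scores)


def forward_wrapper_selection(X, y, n_features, **cv_args):
    """Greedy forward search: keep adding the feature that most improves CV accuracy."""
    selected, best = [], 0.0
    while len(selected) < n_features:
        candidates = [f for f in range(n_features) if f not in selected]
        scored = [(repeated_cv(X, y, selected + [f], **cv_args)[0], f) for f in candidates]
        score, feature = max(scored)
        if score <= best:  # stop when no candidate improves the estimate
            break
        selected.append(feature)
        best = score
    return selected, best


def make_synthetic(n_per_class=30, seed=1):
    """Synthetic 3-class data: feature 0 is informative, features 1 and 2 are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for label in range(3):
        for _ in range(n_per_class):
            X.append([label * 3.0 + rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)])
            y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_synthetic()
    selected, best = forward_wrapper_selection(X, y, n_features=3, runs=3, k=10)
    mean_acc, std_acc = repeated_cv(X, y, selected, runs=10, k=10)
    print(f"selected features: {selected}")
    print(f"accuracy: {100 * mean_acc:.2f} ± {100 * std_acc:.2f}%")
```

The sketch assumes every fold is non-empty and every class appears in each training split, which holds for the balanced synthetic data used here. Because the same data drive both feature selection and the final estimate, the reported accuracy in this toy setup is likely optimistic.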