A wide range of computational methods and tools for data analysis is available. In this study we take advantage of these technological advances to develop models for predicting Type 2 diabetes in patients. We aim to investigate how the incidence of diabetes is affected by patients' characteristics and measurements. Medical researchers and practitioners require efficient predictive modeling. This study proposes a Hybrid Prediction Model (HPM) that first uses the Simple K-means clustering algorithm to validate the chosen class labels of the given data (incorrectly classified instances are removed, i.e. a pattern is extracted from the original data) and then applies a classification algorithm to the resulting set. The C4.5 algorithm is used to build the final classifier model using the k-fold cross-validation method. The Pima Indians diabetes data were obtained from the University of California at Irvine (UCI) machine learning repository. Various researchers have previously applied a wide range of classification methods to this dataset in order to find the best-performing algorithm, achieving accuracies in the range of 59.4% to 84.05%. The proposed HPM, however, obtained a classification accuracy of 92.38%. To evaluate the performance of the proposed method, we used sensitivity and specificity, performance measures that are commonly used in medical classification studies.
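The cluster-based filtering step and the two evaluation measures named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it rests on several assumptions: the function names (`kmeans`, `filter_by_cluster`, `sensitivity_specificity`) are hypothetical, the k-means routine uses a deterministic farthest-first initialisation rather than whatever Simple K-means does, each cluster is mapped to the majority class label of its members, and the toy data are invented for illustration. The C4.5 classifier and the k-fold cross-validation are omitted; any decision-tree learner would be trained on the instances that survive the filter.

```python
from collections import Counter
from math import dist


def kmeans(points, k, iters=100):
    """Plain Lloyd's k-means; returns one cluster index per point.

    Assumption: deterministic farthest-first seeding, chosen only so that
    this example is reproducible.
    """
    centroids = [points[0]]
    while len(centroids) < k:
        # Next seed: the point farthest from its nearest existing centroid.
        centroids.append(
            max(points, key=lambda p: min(dist(p, c) for c in centroids))
        )
    assignment = None
    for _ in range(iters):
        new_assignment = [
            min(range(k), key=lambda j: dist(p, centroids[j])) for p in points
        ]
        if new_assignment == assignment:  # converged
            break
        assignment = new_assignment
        for j in range(k):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:  # keep the old centroid if a cluster empties
                centroids[j] = tuple(
                    sum(col) / len(members) for col in zip(*members)
                )
    return assignment


def filter_by_cluster(points, labels, k=2):
    """Return indices of instances whose label matches their cluster.

    Assumption: a cluster "stands for" the majority label of its members;
    instances that disagree with it are treated as incorrectly classified.
    """
    assignment = kmeans(points, k)
    majority = {}
    for j in set(assignment):
        votes = Counter(l for l, a in zip(labels, assignment) if a == j)
        majority[j] = votes.most_common(1)[0][0]
    return [
        i for i, (l, a) in enumerate(zip(labels, assignment))
        if l == majority[a]
    ]


def sensitivity_specificity(y_true, y_pred, positive=1):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP).

    Assumes both classes occur in y_true, so neither denominator is zero.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    return tp / (tp + fn), tn / (tn + fp)


# Invented toy data: two well-separated groups, one mislabeled point in each.
points = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (0.3, 0.2),
          (5.0, 5.1), (5.2, 4.9), (4.9, 5.3), (5.1, 5.0)]
labels = [0, 0, 0, 1, 1, 1, 1, 0]

kept = filter_by_cluster(points, labels)
print("instances kept for the classifier:", kept)
```

On this toy data the two mislabeled points (indices 3 and 7) are dropped, and the remaining instances would be passed on to the classifier. Majority voting is only one plausible way to match clusters to classes; the abstract does not specify the rule used.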