Outliers and skewed class distributions make the prediction of breast cancer survivability a challenging problem in data mining and pattern recognition, particularly in medical research. To address these problems, we propose a hybrid approach that generates higher-quality data sets for building improved breast cancer survival prediction models. The approach comprises two main steps: (1) an outlier filtering step based on C-Support Vector Classification (C-SVC) that identifies and eliminates outlier instances; and (2) an over-sampling step that uses over-sampling with replacement to increase the number of instances in the minority class. To assess the capability and effectiveness of the proposed approach, we used several measures: basic performance measures (accuracy, sensitivity, and specificity), the Area Under the receiver operating characteristic Curve (AUC), and the F-measure. In addition, 10-fold cross-validation was used to reduce the bias and variance of the results of the breast cancer survivability prediction models. The results indicate that the proposed approach improves the performance of these models by up to 28.34%, which we attribute to the improved training data space.
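The two steps and the basic performance measures can be illustrated with a minimal sketch, written here in plain Python. It is not the authors' implementation, and several details are assumptions the abstract does not state: an instance is treated as an outlier when its label disagrees with the prediction of a filter classifier (in practice these predictions might come from a C-SVC implementation such as scikit-learn's `SVC`; here they are passed in as a plain list), the minority class is over-sampled until it matches the majority class, and all function names and the toy data are hypothetical.

```python
import random
from collections import Counter


def filter_outliers(rows, labels, filter_predictions):
    """Step 1 (assumed rule): drop instances whose label disagrees with
    the prediction of a filter classifier such as a C-SVC."""
    kept_rows, kept_labels = [], []
    for row, label, guess in zip(rows, labels, filter_predictions):
        if label == guess:
            kept_rows.append(row)
            kept_labels.append(label)
    return kept_rows, kept_labels


def oversample_with_replacement(rows, labels, seed=0):
    """Step 2: draw minority-class instances at random, with replacement,
    until every class is as large as the biggest one (assumed target)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


def binary_metrics(actual, predicted, positive=1):
    """Accuracy, sensitivity, specificity and F-measure for two classes."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    tn = sum(1 for a, p in pairs if a != positive and p != positive)
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)

    def ratio(num, den):
        # Return 0.0 instead of raising when a denominator is empty.
        return num / den if den else 0.0

    sensitivity = ratio(tp, tp + fn)
    precision = ratio(tp, tp + fp)
    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": sensitivity,
        "specificity": ratio(tn, tn + fp),
        "f_measure": ratio(2 * precision * sensitivity, precision + sensitivity),
    }


# Toy data (hypothetical): 7 majority-class and 3 minority-class instances.
rows = list(range(10))
labels = [1] * 7 + [0] * 3
# Hypothetical filter output: it disagrees with the labels of rows 6 and 9.
filter_predictions = [1, 1, 1, 1, 1, 1, 0, 0, 0, 1]

clean_rows, clean_labels = filter_outliers(rows, labels, filter_predictions)
balanced_rows, balanced_labels = oversample_with_replacement(clean_rows, clean_labels)

# Metrics on a separate, made-up set of test predictions.
scores = binary_metrics([1, 1, 1, 0, 0], [1, 1, 0, 0, 1])
```

In a 10-fold cross-validation setting, filtering and over-sampling of this kind would normally be applied to the training folds only, so that duplicated minority instances do not leak into the held-out fold; whether the original study did so is not stated in the abstract.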