In this paper we discuss the effects of using pre-clustered data on the identification of estimation models for cancer diagnoses. Based on patients' data records, which include standard blood parameters, tumor markers, and information about tumor diagnoses, the goal is to identify mathematical models for estimating cancer diagnoses. We apply a hybrid clustering and classification approach that first identifies data clusters (using standard patient data and tumor markers) and then learns prediction models on the basis of these clusters. In the empirical section we analyze the clusters of patient data samples formed using k-means clustering: the optimal number of clusters is identified, and the homogeneity of these clusters is investigated. Several evolutionary modeling approaches implemented in HeuristicLab are then applied to identify estimators for selected cancer diagnoses: linear regression, k-nearest neighbor learning, artificial neural networks, and support vector machines (all optimized using evolutionary algorithms), as well as genetic programming. As we show in the results section, the investigated diagnoses of breast cancer, melanoma, and respiratory system cancer are estimated correctly in up to 84.2%, 80.3%, and 94.1% of the analyzed test cases, respectively; without tumor markers, up to 78.2%, 78%, and 93.3% of the test samples are estimated correctly.
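The two-stage idea described in the abstract (cluster first, then learn a predictor per cluster) can be illustrated with a minimal sketch. This is not the authors' HeuristicLab implementation: the function names (`kmeans`, `fit_hybrid`, `predict_hybrid`), the parameters, and the use of a plain, unoptimized k-nearest-neighbor majority vote as the per-cluster model are all assumptions made for illustration. The paper's models are tuned with evolutionary algorithms, which this sketch omits.

```python
import random
from collections import Counter


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centers):
    """Index of the center closest to point (ties go to the lowest index)."""
    return min(range(len(centers)), key=lambda i: sq_dist(point, centers[i]))


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd-style k-means; returns the list of k cluster centers."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centers)].append(p)
        # Recompute each center as the mean of its group; keep the old
        # center if a group happens to be empty.
        new_centers = [
            [sum(col) / len(g) for col in zip(*g)] if g else centers[i]
            for i, g in enumerate(groups)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers


def fit_hybrid(X, y, k, seed=0):
    """Stage 1: cluster the samples. Stage 2: store labeled members per cluster.

    The per-cluster 'model' here is just the stored training data, because
    the illustrative predictor below is a k-nearest-neighbor vote.
    """
    centers = kmeans(X, k, seed=seed)
    members = [[] for _ in range(k)]
    for features, label in zip(X, y):
        members[nearest(features, centers)].append((features, label))
    return centers, members


def predict_hybrid(model, x, n_neighbors=3):
    """Route x to its nearest cluster, then vote among neighbors in that cluster."""
    centers, members = model
    local = members[nearest(x, centers)]
    if not local:
        # Fallback (an assumption of this sketch): use all training samples.
        local = [m for group in members for m in group]
    closest = sorted(local, key=lambda m: sq_dist(x, m[0]))[:n_neighbors]
    votes = Counter(label for _, label in closest)
    return votes.most_common(1)[0][0]
```

To use the sketch, call `fit_hybrid(X, y, k)` on a list of numeric feature vectors `X` with class labels `y`, then pass the returned model and a new feature vector to `predict_hybrid`. Any classifier could replace the neighbor vote in the second stage. The sketch only shows that each prediction is made by a model trained on the samples of a single cluster.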