Performance evaluation of evolutionary algorithms in classification of biomedical datasets

Authors:
Ajay Kumar Tanwani; Muddassar Farooq
Affiliations:
National University of Computer and Emerging Sciences (FAST-NUCES), Islamabad, Pakistan; National University of Computer and Emerging Sciences (FAST-NUCES), Islamabad, Pakistan
Venue:
Proceedings of the 11th Annual Conference Companion on Genetic and Evolutionary Computation Conference: Late Breaking Papers
Year:
2009

Abstract

Biomedical datasets pose a unique challenge for machine learning and data mining techniques that aim to extract accurate, comprehensible and hidden knowledge from them. In this paper, we comprehensively investigate how the characteristics of a biomedical dataset affect the classification accuracy of an algorithm. To this end, we quantify the complexity of a biomedical dataset in terms of its missing values, imbalance ratio, noise and information gain. We perform our experiments using six well-known evolutionary rule learning algorithms (XCS, UCS, GAssist, cAnt-Miner, SLAVE and Ishibuchi) on 31 publicly available biomedical datasets. The results show that GAssist gives better classification accuracy than the other compared schemes. However, the nature of a biomedical dataset, not the choice of evolutionary algorithm, plays a major role in determining the classification accuracy achieved on that dataset. We further show that noise is a dominating factor in determining the complexity of a dataset and that it is inversely proportional to the classification accuracy of all the algorithms. The complexity of a biomedical dataset, quantified in this way, will prove useful to researchers in evaluating the classification potential of their dataset for automatic knowledge extraction.
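
The abstract names four complexity factors but does not give their formulas. As a rough illustration only, the sketch below computes three of them using common textbook definitions, which are assumptions here and not necessarily the definitions used in the paper: the fraction of missing cells, the ratio of majority to minority class size, and the entropy-based information gain of a nominal attribute. Noise is left out because estimating it requires a method the abstract does not describe. All function names are hypothetical, and missing values are assumed to be encoded as None.

```python
import math
from collections import Counter


def missing_fraction(rows):
    """Fraction of attribute cells that are missing (encoded as None)."""
    cells = [value for row in rows for value in row]
    return sum(value is None for value in cells) / len(cells)


def imbalance_ratio(labels):
    """Majority class size divided by minority class size (assumed definition)."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def entropy(labels):
    """Shannon entropy of a list of class labels, in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(values, labels):
    """Entropy reduction from splitting on one nominal attribute.

    Instances whose attribute value is missing (None) are skipped.
    """
    pairs = [(v, y) for v, y in zip(values, labels) if v is not None]
    kept = [y for _, y in pairs]
    groups = {}
    for v, y in pairs:
        groups.setdefault(v, []).append(y)
    remainder = sum(len(g) / len(kept) * entropy(g) for g in groups.values())
    return entropy(kept) - remainder


if __name__ == "__main__":
    # Tiny made-up dataset: two nominal attributes, four instances.
    rows = [("high", "yes"), ("high", None), ("low", "no"), ("low", "no")]
    labels = ["sick", "healthy", "healthy", "healthy"]
    first_attribute = [row[0] for row in rows]

    print("missing fraction:", missing_fraction(rows))
    print("imbalance ratio:", imbalance_ratio(labels))
    print("information gain:", information_gain(first_attribute, labels))
```

Per-attribute information gain would still need to be aggregated into a single dataset-level figure (for example by averaging); the abstract does not say how the authors do this.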