We address the problem of evolving binary classification models under increasingly unbalanced data sets with a strategy consisting of two components: sub-sampling and 'robust' fitness function design. In particular, recent work in the wider machine learning literature has recognized that maintaining the original distribution of exemplars during training is often inappropriate when designing classifiers that are robust against degenerate behavior. To this end we propose a 'Simple Active Learning Heuristic' (SALH), in which the subset of exemplars used for fitness evaluation is sampled with uniform probability under a rule that enforces class balance. In addition, an efficient estimator for the Area Under the Curve (AUC) performance metric is assumed in the form of a modified Wilcoxon-Mann-Whitney (WMW) statistic. Performance is evaluated on six representative UCI data sets, comparing canonical GP, SALH-based GP, GP combining SALH with the modified WMW statistic, and deterministic classifiers (Naive Bayes and C4.5). The resulting SALH-WMW model is demonstrated to be both efficient and effective at providing solutions that maximize performance as assessed by AUC.
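The two components can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it rests on several assumptions. The abstract does not spell out the class balance enforcing rule, so the sketch assumes an equal number of exemplars per class, drawn uniformly within each class. It also assumes sampling with replacement when a class has too few exemplars. Finally, the abstract does not describe the modification to the WMW statistic, so the sketch shows the standard (unmodified) statistic. The function names `balanced_subset` and `wmw_auc` are hypothetical.

```python
import random


def balanced_subset(labels, subset_size, rng):
    """Return indices of a class-balanced subset for fitness evaluation.

    Assumption: 'class balance' means subset_size // 2 exemplars per class,
    each drawn uniformly from that class. Labels are expected to be 0 or 1.
    """
    by_class = {0: [], 1: []}
    for index, label in enumerate(labels):
        by_class[label].append(index)

    per_class = subset_size // 2
    subset = []
    for indices in by_class.values():
        if len(indices) >= per_class:
            # Enough exemplars: uniform sampling without replacement.
            subset.extend(rng.sample(indices, per_class))
        else:
            # Assumption: fall back to sampling with replacement for a
            # class that is smaller than its share of the subset.
            subset.extend(rng.choices(indices, k=per_class))
    return subset


def wmw_auc(scores, labels):
    """Standard Wilcoxon-Mann-Whitney estimate of AUC.

    Counts the fraction of (positive, negative) pairs in which the positive
    exemplar receives the higher score; ties count as half a win.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))
```

In a GP loop of this kind, one would typically redraw the subset at intervals and score each individual's outputs on it with `wmw_auc`. The pairwise loop costs time proportional to the product of the two class sizes, which is one reason a small balanced subset keeps fitness evaluation cheap.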