Symbiotic coevolutionary genetic programming: a benchmarking study under large attribute spaces

  • Authors:
  • John A. Doucette; Andrew R. McIntyre; Peter Lichodzijewski; Malcolm I. Heywood

  • Affiliations:
  • David R. Cheriton School of Computer Science, University of Waterloo, Waterloo, Canada; Faculty of Computer Science, Dalhousie University, Halifax, Canada; Faculty of Computer Science, Dalhousie University, Halifax, Canada; Faculty of Computer Science, Dalhousie University, Halifax, Canada

  • Venue:
  • Genetic Programming and Evolvable Machines
  • Year:
  • 2012

Abstract

Classification under large attribute spaces represents a dual learning problem in which attribute subspaces need to be identified at the same time as the classifier design is established. Embedded methodologies, as opposed to filter or wrapper methodologies, address both tasks simultaneously. The motivation for this work stems from the observation that team-based approaches to Genetic Programming (GP) have the potential to design multiple classifiers per class, each with a potentially unique attribute subspace, without recourse to filter or wrapper style preprocessing steps. Specifically, competitive coevolution provides the basis for scaling the algorithm to data sets with large instance counts, whereas cooperative coevolution provides a framework for problem decomposition under a bid-based model for establishing program context. Symbiosis is used to separate the task of team/ensemble composition from the design of specific team members. Team composition is specified in terms of a combinatorial search performed by a Genetic Algorithm (GA), whereas the properties of individual team members, and therefore subspace identification, are established under an independent GP population. Teaming implies that the members of the resulting ensemble of classifiers should have explicitly non-overlapping behaviour. Performance evaluation is conducted over data sets taken from the UCI repository with 649 to 102,660 attributes and 2 to 10 classes. The resulting teams identify attribute spaces 1 to 4 orders of magnitude smaller than those of the original data set. Moreover, team members generally consist of fewer than 10 instructions; thus, small attribute subspaces are not being traded for opaque models.
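
The bid-based team model described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the names `Learner`, `Team`, `bid`, and `attribute_subspace` are hypothetical, and the hand-written bid functions merely stand in for short evolved GP programs. The sketch assumes each team member pairs a bidding program with a fixed class label, that the highest bidder on an instance supplies the team's prediction, and that the team's attribute subspace is the union of the attributes its members read.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, List, Sequence


@dataclass
class Learner:
    """Hypothetical team member: a bid function plus a fixed class label."""
    bid: Callable[[Sequence[float]], float]  # stand-in for an evolved GP program
    action: int                              # class label suggested when this member wins
    attributes: FrozenSet[int]               # attribute indices the bid function reads


@dataclass
class Team:
    """Hypothetical team: a set of learners selected by a separate search."""
    members: List[Learner]

    def classify(self, x: Sequence[float]) -> int:
        # The member with the highest bid on this instance supplies the label.
        winner = max(self.members, key=lambda m: m.bid(x))
        return winner.action

    def attribute_subspace(self) -> FrozenSet[int]:
        # Union of the attributes indexed by any member of the team.
        return frozenset().union(*(m.attributes for m in self.members))


# Two illustrative members over a 649-attribute instance (values are made up).
team = Team([
    Learner(bid=lambda x: x[3] - x[7], action=0, attributes=frozenset({3, 7})),
    Learner(bid=lambda x: 0.5 * x[7], action=1, attributes=frozenset({7})),
])

x = [0.0] * 649
x[3], x[7] = 2.0, 1.0
label_a = team.classify(x)   # member 0 bids 1.0, member 1 bids 0.5

x[7] = 4.0
label_b = team.classify(x)   # member 0 bids -2.0, member 1 bids 2.0

subspace = team.attribute_subspace()
```

Here the team reads only two of the 649 attributes, which mirrors, at toy scale, the idea that subspace identification falls out of the members' own programs rather than a separate filter or wrapper step.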