Discovering a domain alphabet

  • Authors:
  • Michael D. Schmidt; Hod Lipson

  • Affiliations:
  • Cornell University, Ithaca, NY, USA; Cornell University, Ithaca, NY, USA

  • Venue:
  • Proceedings of the 11th Annual Conference on Genetic and Evolutionary Computation
  • Year:
  • 2009

Abstract

A key to the success of any genetic programming process is the use of a good alphabet of atomic building blocks from which solutions can be evolved efficiently. An alphabet that is too granular may generate an unnecessarily large search space; an inappropriately coarse-grained alphabet may bias the search or prevent it from finding optimal solutions. Here we introduce a method that automatically identifies a small alphabet for a problem domain. We process solutions on the complexity-optimality Pareto front of a number of sample systems and identify terms that appear significantly more frequently than merited by their size. These terms are then used as basic building blocks to solve new problems in the same domain. We demonstrate this process on symbolic regression for a variety of physics problems. The method discovers key terms relating to concepts such as energy and momentum. A significant performance enhancement is demonstrated when these terms are used as basic building blocks on new physics problems. We suggest that identifying a problem-specific alphabet is key to scaling evolutionary methods to higher-complexity systems.
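The term-identification step described in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify the statistical test, so everything here is an assumption made for illustration: expressions are encoded as nested tuples, a term's frequency is the number of solutions containing it, and a term is kept when that frequency is at least `ratio` times the mean frequency of all distinct terms of the same size. The function names (`subtrees`, `size`, `candidate_terms`), the parameters `min_size` and `ratio`, and the sample `solutions` are all hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict

# Assumed encoding: an expression is a leaf string or a tuple (op, arg1, ...).


def subtrees(expr):
    """Yield every subtree of expr, including expr itself."""
    yield expr
    if isinstance(expr, tuple):
        for child in expr[1:]:
            yield from subtrees(child)


def size(expr):
    """Count the nodes (operators and leaves) in an expression tree."""
    if isinstance(expr, tuple):
        return 1 + sum(size(child) for child in expr[1:])
    return 1


def candidate_terms(solutions, min_size=2, ratio=1.5):
    """Return terms that occur more often than other terms of the same size.

    This baseline is a stand-in for the paper's significance criterion,
    which the abstract does not describe.
    """
    # Count the number of solutions in which each distinct subtree occurs.
    counts = Counter()
    for solution in solutions:
        for term in set(subtrees(solution)):
            if size(term) >= min_size:
                counts[term] += 1

    # Baseline per size: mean count over the distinct subtrees of that size.
    counts_by_size = defaultdict(list)
    for term, count in counts.items():
        counts_by_size[size(term)].append(count)
    baseline = {s: sum(cs) / len(cs) for s, cs in counts_by_size.items()}

    selected = [t for t, c in counts.items() if c >= ratio * baseline[size(t)]]
    # Most frequent first; smaller terms first among equally frequent ones.
    return sorted(selected, key=lambda t: (-counts[t], size(t)))


# Hypothetical Pareto-front solutions from several sample systems.
kinetic = ("mul", "m", ("sq", "v"))
solutions = [
    kinetic,
    ("add", kinetic, ("mul", "k", ("sq", "x"))),
    ("sub", kinetic, ("mul", "g", "h")),
    ("mul", "m", "v"),
]
terms = candidate_terms(solutions)
print(terms)
```

With these made-up inputs, the terms that survive are the ones shared across solutions (`v` squared and `m` times `v` squared), which could then be added to the function set as single building blocks for new problems. A size-matched baseline is only one way to read "more frequently than merited by their size"; a formal significance test would be a natural replacement.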