Evolutionary induction of stochastic context free grammars

  • Authors:
  • Bill Keller; Rudi Lutz

  • Affiliations:
  • Department of Informatics, The University of Sussex, Falmer, Brighton BN1 9QH, UK (both authors)

  • Venue:
  • Pattern Recognition
  • Year:
  • 2005

Abstract

This paper describes an evolutionary approach to the problem of inferring stochastic context-free grammars from finite language samples. The approach employs a distributed, steady-state genetic algorithm (GA) with a fitness function that incorporates a prior over the space of possible grammars. The prior is chosen to bias learning towards structurally simpler grammars. Solutions to the inference problem are evolved by optimizing the parameters of a covering grammar for a given language sample. Full details are given of the GA and of the fitness function for grammars. We present the results of a number of experiments in learning grammars for a range of formal languages. Finally, we compare the grammars induced using the GA-based approach with those found using the inside-outside algorithm. We find that our approach learns grammars that are compact and fit the corpus data well.
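
The general idea of a steady-state GA whose fitness combines a prior with a data likelihood can be illustrated with a minimal Python sketch. This is not the authors' implementation, and everything in it is an assumption made for illustration: each individual is a binary mask saying which rules of a covering grammar are kept (the paper optimizes grammar parameters, whose encoding the abstract does not specify), the prior is a simple per-rule penalty, the population is a single pool rather than a distributed one, and `toy_log_likelihood` is a hypothetical stand-in for a real corpus likelihood such as one computed with the inside algorithm.

```python
import random

def log_prior(mask, penalty=1.0):
    # Assumed prior: every retained rule costs `penalty` nats,
    # so structurally smaller grammars receive a higher prior.
    return -penalty * sum(mask)

def fitness(mask, log_likelihood, penalty=1.0):
    # Log posterior up to a constant: log P(G) + log P(D | G).
    return log_prior(mask, penalty) + log_likelihood(mask)

def tournament(pop, scores, rng, k=2):
    # Index of the fittest of k randomly sampled individuals.
    return max(rng.sample(range(len(pop)), k), key=lambda i: scores[i])

def steady_state_step(pop, scores, log_likelihood, rng, p_mut=0.05):
    # Breed ONE child per step; it replaces the current worst
    # individual only if it is strictly fitter (steady-state update).
    a = pop[tournament(pop, scores, rng)]
    b = pop[tournament(pop, scores, rng)]
    cut = rng.randrange(1, len(a))                 # one-point crossover
    child = a[:cut] + b[cut:]
    child = [bit ^ (rng.random() < p_mut) for bit in child]  # bit-flip mutation
    child_score = fitness(child, log_likelihood)
    worst = min(range(len(pop)), key=lambda i: scores[i])
    if child_score > scores[worst]:
        pop[worst], scores[worst] = child, child_score

def evolve(n_rules, log_likelihood, pop_size=30, steps=2000, seed=0):
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_rules)] for _ in range(pop_size)]
    scores = [fitness(m, log_likelihood) for m in pop]
    for _ in range(steps):
        steady_state_step(pop, scores, log_likelihood, rng)
    best = max(range(pop_size), key=lambda i: scores[i])
    return pop[best], scores[best]

# Hypothetical stand-in likelihood: rules 0, 3 and 5 are "needed" to
# explain the sample, and each missing one costs 10 nats.
NEEDED = {0, 3, 5}

def toy_log_likelihood(mask):
    return -10.0 * sum(1 for i in NEEDED if not mask[i])

best_mask, best_score = evolve(8, toy_log_likelihood)
print(best_mask, best_score)
```

Because a child only ever displaces the worst member, the best fitness in the population never decreases. The per-rule penalty plays the role of the simplicity bias: a rule is kept only when the likelihood gain it brings outweighs its prior cost.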