Unsupervised Learning of Probabilistic Context-Free Grammar using Iterative Biclustering

  • Authors:
  • Kewei Tu; Vasant Honavar

  • Affiliations:
  • Department of Computer Science, Iowa State University, Ames, IA 50011, USA; Department of Computer Science, Iowa State University, Ames, IA 50011, USA

  • Venue:
  • ICGI '08: Proceedings of the 9th International Colloquium on Grammatical Inference: Algorithms and Applications
  • Year:
  • 2008

Abstract

This paper presents PCFG-BCL, an unsupervised algorithm that learns a probabilistic context-free grammar (PCFG) from positive samples. The algorithm acquires rules of an unknown PCFG through iterative biclustering of bigrams in the training corpus. Our analysis shows that this procedure uses a greedy approach to adding rules such that each set of rules that is added to the grammar results in the largest increase in the posterior of the grammar given the training corpus. Results of our experiments on several benchmark datasets show that PCFG-BCL is competitive with existing methods for unsupervised CFG learning.
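To make the phrase "biclustering of bigrams" concrete, the following is a minimal sketch of the kind of data such a procedure could start from: a table of adjacent symbol pairs counted over the corpus, in which a bicluster is a set of rows and a set of columns whose cells form a coherent block. This is an illustration only, not the PCFG-BCL algorithm itself. The function names `bigram_table` and `submatrix` and the toy corpus are hypothetical, and the sketch performs no rule scoring or posterior computation.

```python
from collections import Counter


def bigram_table(corpus):
    """Count adjacent symbol pairs (bigrams) over tokenized sentences.

    Hypothetical helper for illustration; keys are (left, right) pairs.
    """
    counts = Counter()
    for sentence in corpus:
        for left, right in zip(sentence, sentence[1:]):
            counts[(left, right)] += 1
    return counts


def submatrix(counts, rows, cols):
    """Extract the block of counts for the chosen row and column symbols.

    A candidate bicluster corresponds to such a block; pairs never seen
    in the corpus count as 0.
    """
    return [[counts[(r, c)] for c in cols] for r in rows]


# Toy corpus (an assumption for this example, not data from the paper).
corpus = [
    ["the", "dog", "runs"],
    ["a", "dog", "runs"],
    ["the", "cat", "sleeps"],
    ["a", "cat", "sleeps"],
]

table = bigram_table(corpus)
block = submatrix(table, ["the", "a"], ["dog", "cat"])
print(block)
```

In this toy corpus the rows {the, a} and columns {dog, cat} form a fully populated block, which is the sort of regularity a biclustering step could group under new nonterminals before rewriting the corpus and repeating.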