Unsupervised part of speech inference with particle filters

  • Authors:
  • Gregory Dubbin;Phil Blunsom

  • Affiliations:
  • University of Oxford, Oxford, United Kingdom;University of Oxford, Oxford, United Kingdom

  • Venue:
  • WILS '12 Proceedings of the NAACL-HLT Workshop on the Induction of Linguistic Structure
  • Year:
  • 2012

Quantified Score

Hi-index 0.00

Visualization

Abstract

As linguistic models incorporate more subtle nuances of language and its structure, standard inference techniques can fall behind. Often, such models are tightly coupled such that they defy clever dynamic programming tricks. However, Sequential Monte Carlo (SMC) approaches, i.e. particle filters, are well suited to approximating such models, resolving their multi-modal nature at the cost of generating additional samples. We implement two particle filters, which jointly sample either sentences or word types, and incorporate them into a Gibbs sampler for part-of-speech (PoS) inference. We analyze the behavior of the particle filters, and compare them to a block sentence sampler, a local token sampler, and a heuristic sampler, which constrains inference to a single PoS per word type. Our findings show that particle filters can closely approximate a difficult or even intractable sampler quickly. However, we found that high posterior likelihood do not necessarily correspond to better Many-to-One accuracy. The results suggest that the approach has potential and more advanced particle filters are likely to lead to stronger performance.