Part-of-speech tagging using virtual evidence and negative training

  • Authors:
  • Sheila M. Reynolds;Jeff A. Bilmes

  • Affiliations:
  • University of Washington, Seattle, WA;University of Washington, Seattle, WA

  • Venue:
  • HLT '05 Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing
  • Year:
  • 2005

Quantified Score

Hi-index 0.00

Visualization

Abstract

We present a part-of-speech tagger which introduces two new concepts: virtual evidence in the form of an "observed child" node, and negative training data to learn the conditional probabilities for the observed child. Associated with each word is a flexible feature-set which can include binary flags, neighboring words, etc. The conditional probability of Tag given Word + Features is implemented using a factored language-model with back-off to avoid data sparsity problems. This model remains within the framework of Dynamic Bayesian Networks (DBNs) and is conditionally-structured, but resolves the label bias problem inherent in the conditional Markov model (CMM).