Domain adaptation of a dependency parser with a class-class selectional preference model

  • Authors:
  • Raphael Cohen;Yoav Goldberg;Michael Elhadad

  • Affiliations:
  • Ben Gurion University of the Negev, Israel (all authors)

  • Venue:
  • Proceedings of the ACL 2012 Student Research Workshop (ACL '12)
  • Year:
  • 2012


Abstract

When porting parsers to a new domain, many of the errors involve incorrect attachment of out-of-vocabulary words. Since no annotated data is available from which to learn the attachment preferences of target-domain words, we attack this problem with a model of selectional preferences based on domain-specific word classes. Our method uses Latent Dirichlet Allocation (LDA) to learn a domain-specific selectional preference model from unannotated target-domain data. The model provides features that capture the affinities between pairs of words in the domain. To incorporate these new features into the parsing model, we adopt the co-training approach and retrain the parser with the selectional preference features. We apply this method to adapt Easy First, a fast non-directional parser trained on the WSJ, to the biomedical domain (GENIA Treebank). The selectional preference features reduce error by 4.5% over the co-training baseline.
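The core idea in the abstract can be illustrated with a small sketch: treat each head word's observed dependents (collected from unannotated text) as a "document", fit LDA to induce latent word classes, and score a candidate (head, dependent) pair by marginalizing over those classes. This is not the authors' implementation; the toy corpus, class count, and scoring function are illustrative assumptions using scikit-learn's LDA.

```python
# Sketch of LDA-based class-class selectional preferences (illustrative only).
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

# Toy data: each head word maps to dependents observed in unannotated
# target-domain text. In the paper's setting these would come from a
# large parsed biomedical corpus.
head_to_deps = {
    "activate": ["gene", "protein", "expression", "pathway"],
    "inhibit":  ["protein", "enzyme", "kinase", "expression"],
    "eat":      ["food", "bread", "apple"],
    "drink":    ["water", "coffee", "tea"],
}

heads = list(head_to_deps)
docs = [" ".join(deps) for deps in head_to_deps.values()]

# Bag-of-dependents representation: one row per head word.
vec = CountVectorizer()
X = vec.fit_transform(docs)
vocab = vec.vocabulary_

# Induce K latent word classes over dependents (K=2 is arbitrary here).
lda = LatentDirichletAllocation(n_components=2, random_state=0)
theta = lda.fit_transform(X)  # head -> class distribution (rows sum to 1)
# class -> dependent distribution, normalized from LDA pseudo-counts
phi = lda.components_ / lda.components_.sum(axis=1, keepdims=True)

def sp_score(head, dep):
    """Affinity of a (head, dependent) pair, marginalized over latent
    classes: sum_k P(class k | head) * P(dep | class k). A stand-in for
    the class-class selectional-preference features described above."""
    h = heads.index(head)
    d = vocab[dep]
    return float(theta[h] @ phi[:, d])

score = sp_score("inhibit", "gene")
```

Scores form a proper distribution over dependents for each head, so they can be binned or thresholded into discrete parser features; the paper's actual feature design and parser integration differ from this sketch.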