A log-linear model with an n-gram reference distribution for accurate HPSG parsing

  • Authors:
  • Takashi Ninomiya; Takuya Matsuzaki; Yusuke Miyao; Jun'ichi Tsujii

  • Affiliations:
  • University of Tokyo; University of Tokyo; University of Tokyo; University of Tokyo, University of Manchester, NaCTeM (National Center for Text Mining), Bunkyo-ku, Tokyo, Japan

  • Venue:
  • IWPT '07 Proceedings of the 10th International Conference on Parsing Technologies
  • Year:
  • 2007

Abstract

This paper describes a log-linear model with an n-gram reference distribution for accurate probabilistic HPSG parsing. In the model, the n-gram reference distribution is defined simply as the product of the probabilities of selecting lexical entries; these probabilities are given by a discriminative model whose machine learning features are word and POS n-grams, as in CCG/HPSG/CDG supertagging. Supertagging has recently become well known for drastically improving parsing accuracy and speed, but supertagging techniques were introduced heuristically, and hence the probabilistic models for parse trees were not well defined. We introduce the supertagging probabilities as a reference distribution for the log-linear model of probabilistic HPSG. This is the first model that properly incorporates supertagging probabilities into a probabilistic model of parse trees.
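
The role of the reference distribution can be illustrated with a minimal sketch. It assumes the common form of a log-linear model with a reference distribution, in which the probability of a parse T is proportional to p0(T) · exp(Σ λ_i f_i(T)), with p0(T) the product of the supertagging probabilities of the lexical entries used in T. The function names, feature names, weights, and toy candidates below are hypothetical and are not taken from the paper. Enumerating candidate parses explicitly stands in for the packed-forest computation a real parser would likely need.

```python
import math


def reference_log_prob(supertag_probs):
    """Log of the reference distribution p0(T): the sum of the
    log-probabilities of the lexical entries selected for each word."""
    return sum(math.log(p) for p in supertag_probs)


def parse_distribution(candidates, weights):
    """Return normalized probabilities for a list of candidate parses.

    Each candidate is a dict (hypothetical layout) with:
      'supertag_probs': probability of the lexical entry chosen per word
      'features':       mapping from feature name to feature value
    `weights` maps feature names to log-linear weights (lambda_i).
    """
    scores = []
    for cand in candidates:
        feature_score = sum(
            weights.get(name, 0.0) * value
            for name, value in cand["features"].items()
        )
        # Unnormalized log score: log p0(T) + sum_i lambda_i * f_i(T)
        scores.append(reference_log_prob(cand["supertag_probs"]) + feature_score)

    # Normalize with log-sum-exp for numerical stability.
    top = max(scores)
    log_z = top + math.log(sum(math.exp(s - top) for s in scores))
    return [math.exp(s - log_z) for s in scores]


if __name__ == "__main__":
    # Two toy parses of a two-word sentence (made-up numbers).
    toy_candidates = [
        {"supertag_probs": [0.5, 0.4], "features": {"head_comp": 1}},
        {"supertag_probs": [0.5, 0.1], "features": {"head_comp": 0}},
    ]
    print(parse_distribution(toy_candidates, {"head_comp": 0.3}))
```

With all weights set to zero, the sketch reduces to the normalized reference distribution alone, so the supertagging probabilities act as the starting point that the log-linear features then adjust.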