A log-linear model with an n-gram reference distribution for accurate HPSG parsing

Authors:
Takashi Ninomiya;Takuya Matsuzaki;Yusuke Miyao;Jun'ichi Tsujii
Affiliations:
University of Tokyo;University of Tokyo;University of Tokyo;University of Tokyo, University of Manchester, NaCTeM (National Center for Text Mining), Bunkyo-ku, Tokyo, Japan
Venue:
IWPT '07 Proceedings of the 10th International Conference on Parsing Technologies
Year:
2007

Abstract

This paper describes a log-linear model with an n-gram reference distribution for accurate probabilistic HPSG parsing. In the model, the n-gram reference distribution is defined simply as the product of the probabilities of selecting lexical entries; these probabilities are given by a discriminative model whose machine learning features are word and POS n-grams, as defined in CCG/HPSG/CDG supertagging. Supertagging has recently become well known for drastically improving parsing accuracy and speed, but supertagging techniques were introduced heuristically, and hence the probabilistic models for parse trees were not well defined. We introduce the supertagging probabilities as a reference distribution for the log-linear model of probabilistic HPSG. This is the first model that properly incorporates supertagging probabilities into the probabilistic model of parse trees.
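
The model described above can be illustrated with a minimal sketch. It assumes the usual form of a log-linear model with a reference distribution, p(T|w) proportional to p0(T|w) * exp(sum_k lambda_k * f_k(T)), where p0 is the product of the per-word lexical-entry (supertag) probabilities. The function names, the feature names, and the representation of a candidate parse as a pair of supertag probabilities and feature counts are assumptions made for illustration; they are not taken from the paper.

```python
import math


def reference_log_prob(supertag_probs):
    """Log of the reference distribution p0: the product of the
    probabilities of the lexical entries selected for each word."""
    return sum(math.log(p) for p in supertag_probs)


def parse_distribution(candidates, weights):
    """Normalize scores over a (hypothetical) explicit list of candidate parses.

    candidates: list of (supertag_probs, feature_counts) pairs, one per parse.
    weights:    dict mapping feature name to its log-linear weight (lambda).
    Returns p(T|w) for each candidate, in the same order.
    """
    scores = []
    for supertag_probs, features in candidates:
        log_p0 = reference_log_prob(supertag_probs)
        log_linear = sum(weights.get(name, 0.0) * value
                         for name, value in features.items())
        scores.append(log_p0 + log_linear)

    # Subtract the maximum score before exponentiating for numerical stability.
    max_score = max(scores)
    exp_scores = [math.exp(s - max_score) for s in scores]
    z = sum(exp_scores)
    return [e / z for e in exp_scores]


# Two hypothetical parses of a two-word sentence.
candidates = [
    ([0.9, 0.8], {"head_comp": 1.0}),
    ([0.1, 0.2], {"head_subj": 1.0}),
]
# With all weights at zero, the model reduces to the normalized reference distribution.
probs = parse_distribution(candidates, weights={})
```

With every weight set to zero, the sketch returns the reference distribution renormalized over the candidates, so the log-linear features act only as corrections on top of the supertagging probabilities. A real parser would compute the normalization over a packed forest rather than an explicit list of parses.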