K-best combination of syntactic parsers

Authors:
Hui Zhang;Min Zhang;Chew Lim Tan;Haizhou Li
Affiliations:
Institute for Infocomm Research and National University of Singapore;Institute for Infocomm Research;National University of Singapore;Institute for Infocomm Research
Venue:
EMNLP '09 Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 3 - Volume 3
Year:
2009

Citing 19
Cited 13

Discriminative Reranking for Natural Language Parsing

ICML '00 Proceedings of the Seventeenth International Conference on Machine Learning
Head-driven statistical models for natural language parsing

Head-driven statistical models for natural language parsing
Building a large annotated corpus of English: the penn treebank

Computational Linguistics - Special issue on using large corpora: II
A maximum-entropy-inspired parser

NAACL 2000 Proceedings of the 1st North American chapter of the Association for Computational Linguistics conference
Three generative, lexicalised models for statistical parsing

ACL '98 Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics and Eighth Conference of the European Chapter of the Association for Computational Linguistics
An efficient implementation of a new DOP model

EACL '03 Proceedings of the tenth conference on European chapter of the Association for Computational Linguistics - Volume 1
Parsing the wall street journal using a Lexical-Functional Grammar and discriminative estimation techniques

ACL '02 Proceedings of the 40th Annual Meeting on Association for Computational Linguistics
Accurate unlexicalized parsing

ACL '03 Proceedings of the 41st Annual Meeting on Association for Computational Linguistics - Volume 1
On the parameter space of generative lexicalized statistical parsing models

On the parameter space of generative lexicalized statistical parsing models
Discriminative training methods for hidden Markov models: theory and experiments with perceptron algorithms

EMNLP '02 Proceedings of the ACL-02 conference on Empirical methods in natural language processing - Volume 10
Probabilistic CFG with latent annotations

ACL '05 Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics
Coarse-to-fine n-best parsing and MaxEnt discriminative reranking

ACL '05 Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics
Learning accurate, compact, and interpretable tree annotation

ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
Effective self-training for parsing

HLT-NAACL '06 Proceedings of the main conference on Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics
Two languages are better than one (for syntactic parsing)

EMNLP '08 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Parser combination by reparsing

NAACL-Short '06 Proceedings of the Human Language Technology Conference of the NAACL, Companion Volume: Short Papers
Better k-best parsing

Parsing '05 Proceedings of the Ninth International Workshop on Parsing Technology
Improving parsing accuracy by combining diverse dependency parsers

Parsing '05 Proceedings of the Ninth International Workshop on Parsing Technology
Statistical parsing with a context-free grammar and word statistics

AAAI'97/IAAI'97 Proceedings of the fourteenth national conference on artificial intelligence and ninth conference on Innovative applications of artificial intelligence

Products of random latent variable grammars

HLT '10 Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Reranking the Berkeley and brown parsers

HLT '10 Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Self-training with products of latent variable grammars

EMNLP '10 Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing
Heterogeneous parsing via collaborative decoding

COLING '10 Proceedings of the 23rd International Conference on Computational Linguistics
Automatic treebank conversion via informed decoding

COLING '10 Proceedings of the 23rd International Conference on Computational Linguistics: Posters
Event extraction as dependency parsing

HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1
Automatic Treebank Conversion via Informed Decoding - A Case Study on Chinese Treebanks

ACM Transactions on Asian Language Information Processing (TALIP)
Bayesian symbol-refined tree substitution grammars for syntactic parsing

ACL '12 Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Long Papers - Volume 1
Higher-order constituent parsing and parser combination

ACL '12 Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Short Papers - Volume 2
Finite-state chart constraints for reduced complexity context-free parsing pipelines

Computational Linguistics
Semi-supervised constituent grammar induction based on text chunking information

CICLing'13 Proceedings of the 14th international conference on Computational Linguistics and Intelligent Text Processing - Volume Part I
Combine constituent and dependency parsing via reranking

IJCAI'13 Proceedings of the Twenty-Third international joint conference on Artificial Intelligence
Statistical parsing with probabilistic symbol-refined tree substitution grammars

IJCAI'13 Proceedings of the Twenty-Third international joint conference on Artificial Intelligence

Quantified Score

Hi-index	0.00

Visualization

Abstract

In this paper, we propose a linear model-based general framework to combine k-best parse outputs from multiple parsers. The proposed framework leverages on the strengths of previous system combination and re-ranking techniques in parsing by integrating them into a linear model. As a result, it is able to fully utilize both the logarithm of the probability of each k-best parse tree from each individual parser and any additional useful features. For feature weight tuning, we compare the simulated-annealing algorithm and the perceptron algorithm. Our experiments are carried out on both the Chinese and English Penn Treebank syntactic parsing task by combining two state-of-the-art parsing models, a head-driven lexicalized model and a latent-annotation-based un-lexicalized model. Experimental results show that our F-Scores of 85.45 on Chinese and 92.62 on English outperform the previously best-reported systems by 1.21 and 0.52, respectively.