Automatic selection of high quality parses created by a fully unsupervised parser

Authors:
Roi Reichart; Ari Rappoport
Affiliations:
The Hebrew University; The Hebrew University
Venue:
CoNLL '09 Proceedings of the Thirteenth Conference on Computational Natural Language Learning
Year:
2009

Abstract

The average results obtained by unsupervised statistical parsers have greatly improved in the last few years, but on many specific sentences they are of rather low quality. The output of such parsers is becoming valuable for various applications, and it is radically less expensive to create than manually annotated training data. Hence, automatic selection of high quality parses created by unsupervised parsers is an important problem. In this paper we present PUPA, a POS-based Unsupervised Parse Assessment algorithm. The algorithm assesses the quality of a parse tree using POS sequence statistics collected from a batch of parsed sentences. We evaluate the algorithm by using an unsupervised POS tagger and an unsupervised parser, selecting high quality parsed sentences from English (WSJ) and German (NEGRA) corpora. We show that PUPA outperforms the leading previous parse assessment algorithm for supervised parsers, as well as a strong unsupervised baseline. Consequently, PUPA allows obtaining high quality parses without any human involvement.
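
The idea of scoring a parse from POS sequence statistics gathered over a batch can be illustrated with a small sketch. This is not the PUPA algorithm itself, whose details the abstract does not give. It assumes a deliberately simplified score: for each constituent in a parse, the fraction of times its POS sequence was bracketed as a constituent out of all the times that sequence occurred contiguously in the batch, averaged over the parse. The names `collect_stats`, `score_parse`, and `select_top` and the span-based parse representation are hypothetical.

```python
from collections import Counter
from typing import List, Tuple

Span = Tuple[int, int]                # half-open [start, end) over the POS tags
Parse = Tuple[List[str], List[Span]]  # (POS tags of a sentence, constituent spans)


def collect_stats(batch: List[Parse]) -> Tuple[Counter, Counter]:
    """Count, over the whole batch, how often each POS sequence occurs
    as a constituent and how often it occurs as any contiguous substring."""
    as_constituent: Counter = Counter()
    as_substring: Counter = Counter()
    for tags, spans in batch:
        n = len(tags)
        for i in range(n):
            for j in range(i + 1, n + 1):
                as_substring[tuple(tags[i:j])] += 1
        for start, end in set(spans):
            as_constituent[tuple(tags[start:end])] += 1
    return as_constituent, as_substring


def score_parse(parse: Parse, as_constituent: Counter, as_substring: Counter) -> float:
    """Assumed score (not the paper's): mean constituent/substring ratio
    over the constituents of the parse."""
    tags, spans = parse
    if not spans:
        return 0.0
    ratios = []
    for start, end in spans:
        seq = tuple(tags[start:end])
        total = as_substring[seq]
        ratios.append(as_constituent[seq] / total if total else 0.0)
    return sum(ratios) / len(ratios)


def select_top(batch: List[Parse], k: int) -> List[Parse]:
    """Return the k parses with the highest assumed quality score."""
    as_constituent, as_substring = collect_stats(batch)
    ranked = sorted(
        batch,
        key=lambda p: score_parse(p, as_constituent, as_substring),
        reverse=True,
    )
    return ranked[:k]


# Toy batch: three parses of the same tag sequence, two of which agree.
batch = [
    (["DT", "NN", "VB"], [(0, 2), (0, 3)]),
    (["DT", "NN", "VB"], [(1, 3), (0, 3)]),
    (["DT", "NN", "VB"], [(0, 2), (0, 3)]),
]
best = select_top(batch, 1)[0]
```

In this toy batch the bracketing that groups `DT NN` is used by two of the three parses, so it scores higher than the one that groups `NN VB`. A selection step like `select_top` then keeps only the highest-scoring parses, which mirrors the selection setting the abstract describes without requiring gold-standard trees.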