This paper presents the results of the PASCAL Challenge on Grammar Induction, a competition in which participants sought to predict part-of-speech tags and dependency syntax from raw text. Although many previous competitions have featured dependency parsing or part-of-speech tagging, these were invariably framed as supervised learning and/or domain adaptation tasks. This is the first challenge to evaluate unsupervised induction systems, a rapidly growing sub-field of syntactic research. Our challenge made use of 10 different treebanks annotated in a range of linguistic formalisms and covering 9 languages. We provide an overview of the approaches taken by the participants and evaluate their results on each dataset using a range of evaluation metrics.