This paper presents the results of the PASCAL Challenge on Grammar Induction, a competition in which participants sought to predict part-of-speech tags and dependency syntax from raw text. Although many previous competitions have featured dependency parsing or part-of-speech tagging, these were invariably framed as supervised learning and/or domain adaptation tasks. This is the first challenge to evaluate unsupervised induction systems, a rapidly growing sub-field of syntactic research. Our challenge made use of 10 different treebanks annotated in a range of linguistic formalisms and covering 9 languages. We provide an overview of the approaches taken by the participants and evaluate their results on each dataset using a range of evaluation metrics.