Inducing a grammar directly from text is one of the oldest and most challenging tasks in Computational Linguistics. Significant progress has been made in inducing dependency grammars; however, the models employed are overly simplistic, particularly in comparison to supervised parsing models. In this paper we present an approach to dependency grammar induction using tree substitution grammars, which are capable of learning large dependency fragments and thereby better modelling the text. We define a hierarchical non-parametric Pitman-Yor process prior that biases the model towards a small grammar with simple productions. This approach significantly improves on the state of the art, as measured by head attachment accuracy.
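
As an illustration of the evaluation metric named above, head attachment accuracy is commonly computed as the fraction of tokens whose predicted head matches the gold-standard head. The sketch below is a minimal version under stated assumptions: heads are given as 1-based token indices with 0 marking the root, every token is scored, and the function name `head_attachment_accuracy` and the toy data are hypothetical. The exact protocol used in the paper (for example punctuation handling or sentence-length cut-offs) is not given in the abstract and is not reproduced here.

```python
from typing import Sequence


def head_attachment_accuracy(
    gold: Sequence[Sequence[int]],
    pred: Sequence[Sequence[int]],
) -> float:
    """Fraction of tokens whose predicted head index equals the gold head index.

    Each sentence is a sequence of head indices, one per token
    (assumed convention: 1-based token positions, 0 for the root).
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must contain the same number of sentences")

    correct = 0
    total = 0
    for gold_heads, pred_heads in zip(gold, pred):
        if len(gold_heads) != len(pred_heads):
            raise ValueError("sentence length mismatch between gold and pred")
        total += len(gold_heads)
        # Count tokens attached to the same head in both analyses.
        correct += sum(g == p for g, p in zip(gold_heads, pred_heads))

    if total == 0:
        raise ValueError("no tokens to evaluate")
    return correct / total


# Toy data (hypothetical): two sentences with three and two tokens.
gold = [[2, 0, 2], [0, 1]]
pred = [[2, 0, 1], [0, 1]]
accuracy = head_attachment_accuracy(gold, pred)
print(f"head attachment accuracy: {accuracy:.2f}")
```

In the toy data one of the five tokens is attached to the wrong head, so four of five attachments are counted as correct.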