Building a Large Annotated Corpus of English: The Penn Treebank. Computational Linguistics, Special Issue on Using Large Corpora: II.
Corpus-Based Induction of Syntactic Structure: Models of Dependency and Constituency. ACL '04: Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics.
Learning Accurate, Compact, and Interpretable Tree Annotation. ACL-44: Proceedings of the 21st International Conference on Computational Linguistics and the 44th Annual Meeting of the Association for Computational Linguistics.
High-Accuracy Annotation and Parsing of CHILDES Transcripts. CACLA '07: Proceedings of the Workshop on Cognitive Aspects of Computational Language Acquisition.
Sparsity in Dependency Grammar Induction. ACLShort '10: Proceedings of the ACL 2010 Conference Short Papers.
Posterior Regularization for Structured Latent Variable Models. The Journal of Machine Learning Research.
The PASCAL Challenge on Grammar Induction. WILS '12: Proceedings of the NAACL-HLT Workshop on the Induction of Linguistic Structure.
In this paper we describe our participating system for the dependency induction track of the PASCAL Challenge on Grammar Induction. The system incorporates two types of inductive bias: a sparsity bias and an unambiguity bias. The sparsity bias favors grammars with fewer rules. The unambiguity bias favors grammars that produce unambiguous parses; it is motivated by the observation that natural language is remarkably unambiguous, in the sense that a sentence typically has very few plausible parses. We present our approach to combining the two biases and discuss the system implementation. Our experiments show that both types of inductive bias are beneficial to grammar induction.
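To make the two biases concrete, below is a minimal, self-contained Python sketch of how they could enter an EM-style grammar learner. This is an illustration under stated assumptions, not the system described in the abstract: the toy grammar, the function names, and the parameters sigma and alpha are all hypothetical. Here the unambiguity bias is realized by sharpening the E-step posterior over parses, q(z) proportional to p(z|x)^(1/(1-sigma)) for 0 <= sigma < 1 (sigma = 0 recovers standard EM; sigma -> 1 approaches hard, i.e. Viterbi, EM), and the sparsity bias by a MAP M-step under a Dirichlet prior with alpha < 1, a simple stand-in for sparsity-inducing regularization that drives the probabilities of rarely used rules toward zero.

# Toy sketch of the two inductive biases (illustrative only; not the
# authors' implementation).
import numpy as np

def e_step(parse_logprobs, sigma=0.5):
    """Unambiguity bias: sharpened posterior q(z) ~ p(z)^(1/(1-sigma))
    over the candidate parses of one sentence."""
    scaled = np.asarray(parse_logprobs) / (1.0 - sigma)
    scaled -= scaled.max()                      # for numerical stability
    q = np.exp(scaled)
    return q / q.sum()

def m_step(expected_counts, alpha=0.5):
    """Sparsity bias: MAP estimate of rule probabilities under a
    Dirichlet prior with alpha < 1 (truncated at zero)."""
    counts = np.maximum(np.asarray(expected_counts) + alpha - 1.0, 0.0)
    total = counts.sum()
    if total == 0.0:
        return np.full_like(counts, 1.0 / len(counts))
    return counts / total

# Tiny example: one sentence with three candidate parses, each using a
# subset of four hypothetical rules (rows = parses, columns = rules).
rule_use = np.array([[1, 1, 0, 0],
                     [1, 0, 1, 0],
                     [0, 0, 1, 1]], dtype=float)
theta = np.full(4, 0.25)                        # uniform initial rule probs

for _ in range(20):
    # Toy model: log p(parse) = sum of log-probs of the rules it uses.
    logp = rule_use @ np.log(theta + 1e-12)
    q = e_step(logp, sigma=0.6)                 # unambiguity bias
    theta = m_step(q @ rule_use, alpha=0.8)     # sparsity bias

print("posterior over parses:", np.round(q, 3))
print("rule probabilities:   ", np.round(theta, 3))

In this toy run the two penalties exhibit exactly the joint effect the abstract describes: the posterior collapses onto a single parse (an unambiguous analysis) while the probabilities of the two unused rules are driven to zero (a sparser grammar).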