Simple unsupervised grammar induction from raw text with cascaded finite state models
HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1
Unsupervised syntactic chunking with acoustic cues: computational models for prosodic bootstrapping
CMCL '11 Proceedings of the 2nd Workshop on Cognitive Modeling and Computational Linguistics
Capitalization cues improve dependency grammar induction
WILS '12 Proceedings of the NAACL-HLT Workshop on the Induction of Linguistic Structure
Three dependency-and-boundary models for grammar induction
EMNLP-CoNLL '12 Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning
We present an approach to unsupervised partial parsing: the identification of low-level constituents (which we dub clumps) in unannotated text. We begin by showing that CCLParser (Seginer 2007), an unsupervised parsing model, is particularly adept at identifying clumps, and that, surprisingly, building a simple right-branching structure above its clumps outperforms the full parser itself, indicating that much of CCLParser's performance comes from good local predictions. Based on this observation, we define a simple bigram model that is competitive with CCLParser for clumping, which further illustrates how important this level of representation is for unsupervised parsing.
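To make the bigram idea concrete, here is a minimal sketch, not the paper's actual model: it scores adjacent word pairs with a pointwise-mutual-information-style cohesion ratio (an illustrative choice of statistic) and greedily merges high-cohesion pairs into clumps. All function names, the scoring formula, and the threshold are assumptions for illustration only.

```python
# Hedged sketch of a bigram-style clumper. The paper's actual model is more
# involved; this only illustrates how local bigram statistics can pick out
# low-level constituents ("clumps") in unannotated text.
from collections import Counter

def train_bigram_scores(corpus):
    """Learn a cohesion score from tokenized sentences (list of word lists)."""
    uni, bi = Counter(), Counter()
    for sent in corpus:
        uni.update(sent)
        bi.update(zip(sent, sent[1:]))
    total = sum(uni.values())

    def score(a, b):
        # PMI-style ratio: >1 means the pair co-occurs more than chance.
        # This statistic is an assumption, not the paper's objective.
        if bi[(a, b)] == 0:
            return float("-inf")
        return (bi[(a, b)] * total) / (uni[a] * uni[b])

    return score

def clump(sent, score, threshold=1.0):
    """Greedily merge adjacent words whose cohesion exceeds the threshold."""
    if not sent:
        return []
    clumps, current = [], [sent[0]]
    for prev, word in zip(sent, sent[1:]):
        if score(prev, word) > threshold:
            current.append(word)      # extend the current clump
        else:
            clumps.append(current)    # close it and start a new one
            current = [word]
    clumps.append(current)
    return clumps
```

On a toy corpus where "the dog" recurs, `clump(["the", "dog", "barks"], score, threshold=3.5)` groups the determiner with its noun while leaving the verb as its own clump, mirroring the kind of low-level bracketing the abstract describes.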