Foundations of statistical natural language processing
Two Experiments on Learning Probabilistic Dependency Grammars from Corpora
Building a large annotated corpus of English: the Penn Treebank
Computational Linguistics - Special issue on using large corpora: II
Inside-outside reestimation from partially bracketed corpora
ACL '92 Proceedings of the 30th annual meeting on Association for Computational Linguistics
A generative constituent-context model for improved grammar induction
ACL '02 Proceedings of the 40th Annual Meeting on Association for Computational Linguistics
The unsupervised learning of natural language structure
Inducing syntactic categories by context distribution clustering
CoNLL '00 Proceedings of the 2nd workshop on Learning language in logic and the 4th conference on Computational natural language learning - Volume 7
Unsupervised induction of stochastic context-free grammars using distributional clustering
CoNLL '01 Proceedings of the 2001 workshop on Computational Natural Language Learning - Volume 7
Corpus-based induction of syntactic structure: models of dependency and constituency
ACL '04 Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics
Automatic selection of high quality parses created by a fully unsupervised parser
CoNLL '09 Proceedings of the Thirteenth Conference on Computational Natural Language Learning
Unsupervised induction of labeled parse trees by clustering with syntactic features
COLING '08 Proceedings of the 22nd International Conference on Computational Linguistics - Volume 1
Language ID in the context of harvesting language data off the web
EACL '09 Proceedings of the 12th Conference of the European Chapter of the Association for Computational Linguistics
Semi-supervised learning of dependency parsers using generalized expectation criteria
ACL '09 Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 1
Generalized Expectation Criteria for Semi-Supervised Learning with Weakly Labeled Data
The Journal of Machine Learning Research
Wordica: Emergence of linguistic representations for words by independent component analysis
Natural Language Engineering
What's with the attitude?: identifying sentences with attitude in online discussions
EMNLP '10 Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing
Parser evaluation over local and non-local deep dependencies in a large corpus
EMNLP '11 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Bootstrapping via graph propagation
ACL '12 Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Long Papers - Volume 1
Syntactic transfer using a bilingual lexicon
EMNLP-CoNLL '12 Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning
Semi-supervised constituent grammar induction based on text chunking information
CICLing'13 Proceedings of the 14th international conference on Computational Linguistics and Intelligent Text Processing - Volume Part I
Smoothing for bracketing induction
IJCAI'13 Proceedings of the Twenty-Third international joint conference on Artificial Intelligence
Bayesian Constituent Context Model for Grammar Induction
IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP)
We investigate prototype-driven learning for primarily unsupervised grammar induction. Prior knowledge is specified declaratively, by providing a few canonical examples of each target phrase type. This sparse prototype information is then propagated across a corpus using distributional similarity features, which augment an otherwise standard PCFG model. We show that distributional features are effective at distinguishing bracket labels, but not at determining bracket locations. To improve the quality of the induced trees, we combine our PCFG induction with the CCM model of Klein and Manning (2002), which has complementary strengths: it identifies brackets but does not label them. Using only a handful of prototypes, we show substantial improvements over naive PCFG induction for English and Chinese grammar induction.
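The propagation step described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the toy corpus, the prototype lists, the function names, the use of (left tag, right tag) context counts, the cosine measure, and the 0.5 threshold are all assumptions made for illustration. The idea it shows is only that a tag span with no prototype of its own can inherit a label from the prototype whose contexts it most resembles.

```python
from collections import Counter
from math import sqrt

# Toy corpus of POS-tag sequences (hypothetical data, for illustration only).
corpus = [
    ["DT", "NN", "VBD", "DT", "NN"],
    ["DT", "JJ", "NN", "VBD", "IN", "DT", "NN"],
    ["NNP", "VBD", "DT", "JJ", "NN"],
    ["DT", "NN", "VBD", "IN", "DT", "JJ", "NN"],
]

# A few canonical yields per phrase type (assumed, not taken from the paper).
prototypes = {
    "NP": [("DT", "NN")],
    "PP": [("IN", "DT", "NN")],
}


def context_counts(sentences, max_len=3):
    """Map each tag span of up to max_len tags to counts of its (left, right) contexts."""
    contexts = {}
    for sent in sentences:
        padded = ["<s>"] + sent + ["</s>"]
        last = len(padded) - 1  # index of the "</s>" marker
        for i in range(1, last):
            for j in range(i + 1, min(i + max_len, last) + 1):
                span = tuple(padded[i:j])
                ctx = (padded[i - 1], padded[j])
                contexts.setdefault(span, Counter())[ctx] += 1
    return contexts


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def nearest_prototype(span, contexts, protos, threshold=0.5):
    """Return the label of the most similar prototype, or None if none exceeds threshold."""
    if span not in contexts:
        return None
    best_label, best_sim = None, threshold
    for label, yields in protos.items():
        for proto in yields:
            if proto not in contexts:
                continue
            sim = cosine(contexts[span], contexts[proto])
            if sim > best_sim:
                best_label, best_sim = label, sim
    return best_label


contexts = context_counts(corpus)
label = nearest_prototype(("DT", "JJ", "NN"), contexts, prototypes)
print(label)
```

In this toy setup the span `DT JJ NN` occurs in the same contexts as the `DT NN` prototype, so it is linked to NP. In the approach the abstract describes, such a link would enter the PCFG as a feature rather than as a hard label, and bracket locations would still come from a separate model such as CCM.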