In this paper, we present evidence that the phrase structure of a natural language can be acquired without supervision and with a very small initial grammar. We describe a language learner that extracts distributional information from a corpus annotated with parts of speech and uses that information to parse short sentences accurately. The phrase structure learner is part of an ongoing project to determine how much knowledge of language can be learned solely through distributional analysis.
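
As a rough illustration of what "distributional information" over a part-of-speech-annotated corpus can look like, the Python sketch below collects, for each tag, counts of its immediate left and right neighbor tags, then compares two tags by the Jensen-Shannon divergence of those context distributions. The function names, the boundary markers `<s>` and `</s>`, the toy corpus, and the choice of Jensen-Shannon divergence are all assumptions made for illustration; the abstract does not specify the learner's actual statistics or algorithm.

```python
from collections import Counter, defaultdict
import math


def context_distributions(tagged_sentences):
    """Map each tag to a Counter of (left_tag, right_tag) contexts.

    Each sentence is a sequence of part-of-speech tags. The boundary
    markers "<s>" and "</s>" are hypothetical padding symbols.
    """
    contexts = defaultdict(Counter)
    for tags in tagged_sentences:
        padded = ["<s>"] + list(tags) + ["</s>"]
        for i in range(1, len(padded) - 1):
            contexts[padded[i]][(padded[i - 1], padded[i + 1])] += 1
    return contexts


def js_divergence(p_counts, q_counts):
    """Jensen-Shannon divergence (base 2) between two context Counters.

    Returns 0.0 for identical distributions and 1.0 for distributions
    with no shared context. This measure is an illustrative choice.
    """
    p_total = sum(p_counts.values())
    q_total = sum(q_counts.values())
    score = 0.0
    for ctx in set(p_counts) | set(q_counts):
        p = p_counts[ctx] / p_total  # Counter yields 0 for unseen contexts
        q = q_counts[ctx] / q_total
        m = (p + q) / 2
        if p > 0:
            score += 0.5 * p * math.log2(p / m)
        if q > 0:
            score += 0.5 * q * math.log2(q / m)
    return score


# Toy corpus of tag sequences (invented for this example).
corpus = [
    ["DT", "NN", "VB"],
    ["DT", "JJ", "NN", "VB"],
    ["PRP", "VB"],
    ["DT", "NN", "VB", "DT", "NN"],
]

contexts = context_distributions(corpus)
for a, b in [("NN", "PRP"), ("NN", "DT")]:
    print(a, b, round(js_divergence(contexts[a], contexts[b]), 3))
```

Under this sketch, a low divergence means two tags tend to occur in similar surroundings, which is the kind of evidence a distributional learner could use when deciding that one tag or tag sequence can stand in for another.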