Accurate methods for the statistics of surprise and coincidence
Computational Linguistics - Special issue on using large corpora: I
Building a large annotated corpus of English: the penn treebank
Computational Linguistics - Special issue on using large corpora: II
A generative constituent-context model for improved grammar induction
ACL '02 Proceedings of the 40th Annual Meeting on Association for Computational Linguistics
The unsupervised learning of natural language structure
The unsupervised learning of natural language structure
Corpus-based induction of syntactic structure: models of dependency and constituency
ACL '04 Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics
An all-subtrees approach to unsupervised parsing
ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
Unsupervised part-of-speech tagging employing efficient graph clustering
COLING ACL '06 Proceedings of the 21st International Conference on computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop
Simple unsupervised grammar induction from raw text with cascaded finite state models
HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1
Reducing the size of the representation for the uDOP-estimate
EMNLP '11 Proceedings of the First Workshop on Unsupervised Learning in NLP
Three dependency-and-boundary models for grammar induction
EMNLP-CoNLL '12 Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning
Smoothing for bracketing induction
IJCAI'13 Proceedings of the Twenty-Third international joint conference on Artificial Intelligence
Bayesian Constituent Context Model for Grammar Induction
IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP)
Hi-index | 0.00 |
This paper presents an algorithm for unsupervised co-occurrence based parsing that improves and extends existing approaches. The proposed algorithm induces a context-free grammar of the language in question in an iterative manner. The resulting structure of a sentence will be given as a hierarchical arrangement of constituents. Although this algorithm does not use any a priori knowledge about the language, it is able to detect heads, modifiers and a phrase type's different compound composition possibilities. For evaluation purposes, the algorithm is applied to manually annotated part-of-speech tags (POS tags) as well as to word classes induced by an unsupervised part-of-speech tagger.