Grammar induction is an attractive research area in natural language processing. Supervised and, to some extent, semi-supervised grammar induction methods require large treebanks, which do not currently exist for many languages, so we focus on unsupervised approaches. The Constituent-Context Model (CCM) is regarded as the state of the art in unsupervised grammar induction. In this paper, we show that CCM performs considerably worse on free word order languages (FWOLs) such as Persian than on fixed word order languages such as English. We also introduce a novel approach, the parent-based constituent-context model (PCCM), and show that incorporating a notion of history, namely the context and constituent information of each span's parent, significantly improves the performance of CCM, especially on FWOLs.
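To make the parent-based extension concrete, the following is a minimal sketch of CCM-style span scoring and a parent-conditioned variant. The function names, probability tables, smoothing constant, and the exact form of the parent feature are illustrative assumptions based on the description above, not the authors' implementation.

```python
# A minimal, illustrative sketch of CCM-style span scoring and a
# parent-conditioned (PCCM-style) variant. All probability tables,
# names, and the backoff constant are hypothetical assumptions.
from math import log

SMOOTH = 1e-6  # assumed floor probability for unseen events


def ccm_log_score(spans, p_yield, p_context):
    """Score a bracketing under a CCM-like model.

    spans     : iterable of (span_yield, context, is_constituent), where
                span_yield is the POS-tag tuple inside the span and
                context is the (left_tag, right_tag) pair around it.
    p_yield   : dict mapping (span_yield, is_constituent) -> probability.
    p_context : dict mapping (context, is_constituent) -> probability.
    """
    total = 0.0
    for span_yield, context, is_const in spans:
        total += log(p_yield.get((span_yield, is_const), SMOOTH))
        total += log(p_context.get((context, is_const), SMOOTH))
    return total


def pccm_log_score(spans, p_yield, p_context):
    """Parent-conditioned variant: each span's yield and context are
    additionally conditioned on a feature of its parent span (here an
    assumed parent class), giving a one-step notion of derivation
    history as described in the abstract."""
    total = 0.0
    for span_yield, context, is_const, parent in spans:
        total += log(p_yield.get((span_yield, is_const, parent), SMOOTH))
        total += log(p_context.get((context, is_const, parent), SMOOTH))
    return total


if __name__ == "__main__":
    # Toy example: score the span "DT NN" in context (BOS, VBZ), treated
    # as a constituent whose parent is assumed to be of class "S".
    p_y = {(("DT", "NN"), True): 0.3,
           (("DT", "NN"), True, "S"): 0.4}
    p_c = {(("BOS", "VBZ"), True): 0.2,
           (("BOS", "VBZ"), True, "S"): 0.25}
    print(ccm_log_score([(("DT", "NN"), ("BOS", "VBZ"), True)], p_y, p_c))
    print(pccm_log_score([(("DT", "NN"), ("BOS", "VBZ"), True, "S")],
                         p_y, p_c))
```

In this sketch the parent information simply enters as an extra key in the yield and context tables; in an actual system the parameterization and the estimation of these tables (e.g., via EM over all bracketings) would follow the CCM training procedure.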