This paper presents PCFG-BCL, an unsupervised algorithm that learns a probabilistic context-free grammar (PCFG) from positive samples. The algorithm acquires the rules of an unknown PCFG through iterative biclustering of bigrams in the training corpus. Our analysis shows that the procedure is greedy: each set of rules added to the grammar yields the largest increase in the posterior of the grammar given the training corpus. Experiments on several benchmark datasets show that PCFG-BCL is competitive with existing methods for unsupervised CFG learning.
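To make the idea of biclustering bigrams more concrete, the following is a minimal Python sketch of a single step, not the authors' implementation. It builds a bigram count table from a toy corpus and then rewrites every bigram covered by a given bicluster (a set of left symbols paired with a set of right symbols) as a new nonterminal. The function names `bigram_table` and `reduce_corpus`, the toy corpus, and the symbol `N1` are all hypothetical. The bicluster here is chosen by hand, whereas the abstract states that PCFG-BCL selects rules by the gain in the posterior of the grammar, which this sketch does not compute.

```python
from collections import Counter


def bigram_table(corpus):
    """Count adjacent symbol pairs (bigrams) over all sentences."""
    counts = Counter()
    for sentence in corpus:
        for left, right in zip(sentence, sentence[1:]):
            counts[(left, right)] += 1
    return counts


def reduce_corpus(corpus, rows, cols, new_symbol):
    """Rewrite each bigram (a, b) with a in rows and b in cols as new_symbol.

    Sentences are scanned left to right, so matches never overlap.
    """
    reduced = []
    for sentence in corpus:
        out, i = [], 0
        while i < len(sentence):
            is_match = (
                i + 1 < len(sentence)
                and sentence[i] in rows
                and sentence[i + 1] in cols
            )
            if is_match:
                out.append(new_symbol)
                i += 2
            else:
                out.append(sentence[i])
                i += 1
        reduced.append(out)
    return reduced


# Toy corpus of positive samples (an assumption made for illustration).
corpus = [
    ["the", "dog", "runs"],
    ["a", "cat", "runs"],
    ["the", "cat", "sleeps"],
]

table = bigram_table(corpus)

# Hand-picked bicluster: rows {the, a}, columns {dog, cat}.
# It corresponds to rules of the form N1 -> A B, A -> the | a, B -> dog | cat.
reduced = reduce_corpus(corpus, {"the", "a"}, {"dog", "cat"}, "N1")
```

Recomputing `bigram_table(reduced)` gives the table for the next iteration, in which `N1` can itself take part in new bigrams. That is how repeated steps of this kind could build up nested rules.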