We describe a corpus-based induction algorithm for probabilistic context-free grammars. The algorithm employs a greedy heuristic search within a Bayesian framework, followed by a post-pass using the Inside-Outside algorithm. We compare the performance of our algorithm to that of n-gram models and the Inside-Outside algorithm on three language modeling tasks. In two of the tasks, the training data is generated by a probabilistic context-free grammar, and in both our algorithm outperforms the other techniques. The third task involves naturally occurring data; here our algorithm does not perform as well as n-gram models but vastly outperforms the Inside-Outside algorithm.
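The abstract does not spell out the search moves, the prior, or the scoring details, so the following is only a minimal sketch of what a greedy search within a Bayesian framework can look like in general: hill-climbing on the log posterior, log P(G) + log P(data | G). The function `greedy_map_search` and the toy hypothesis space (an integer "grammar size" with made-up prior and likelihood functions) are assumptions introduced for illustration, not the paper's method.

```python
from typing import Callable, Iterable, Tuple, TypeVar

H = TypeVar("H")  # a hypothesis, e.g. a grammar


def greedy_map_search(
    initial: H,
    neighbors: Callable[[H], Iterable[H]],
    log_prior: Callable[[H], float],
    log_likelihood: Callable[[H], float],
    max_steps: int = 1000,
) -> Tuple[H, float]:
    """Greedy hill-climbing on log P(h) + log P(data | h).

    At each step, move to the best-scoring neighbor; stop when no
    neighbor improves the score (a local optimum) or after max_steps.
    """
    current = initial
    current_score = log_prior(current) + log_likelihood(current)
    for _ in range(max_steps):
        best, best_score = None, current_score
        for cand in neighbors(current):
            score = log_prior(cand) + log_likelihood(cand)
            if score > best_score:
                best, best_score = cand, score
        if best is None:  # no improving move
            break
        current, current_score = best, best_score
    return current, current_score


# Toy hypothesis space (hypothetical): h is an integer "grammar size".
def toy_neighbors(k: int) -> Iterable[int]:
    # Hypothetical moves: grow or shrink the hypothesis by one unit.
    return [n for n in (k - 1, k + 1) if n >= 1]


def toy_log_prior(k: int) -> float:
    # Made-up prior that penalizes larger hypotheses.
    return -1.0 * k


def toy_log_likelihood(k: int) -> float:
    # Made-up fit term with diminishing returns as k grows.
    return -10.0 / k


best_k, best_score = greedy_map_search(
    1, toy_neighbors, toy_log_prior, toy_log_likelihood
)
print(best_k, best_score)
```

The prior term is what keeps the search from simply growing the hypothesis to fit the data ever more closely. Because the search is greedy, it can stop at a local optimum; a reestimation pass such as Inside-Outside, as mentioned in the abstract, would then adjust rule probabilities of the resulting grammar.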