An algorithm is presented for learning a phrase-structure grammar from tagged text. It clusters sequences of tags based on local distributional information and selects the clusters that satisfy a novel mutual information criterion. This criterion is shown to be related to the entropy of a random variable associated with the tree structures, and is demonstrated to select linguistically plausible constituents. The criterion is incorporated into a Minimum Description Length algorithm. The evaluation of unsupervised models is discussed, and results are presented for the algorithm trained on 12 million words of the British National Corpus.
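
The abstract does not spell out the mutual information criterion, so the following is only a minimal sketch of one plausible reading: it assumes the criterion measures the mutual information between the tag immediately before and the tag immediately after each occurrence of a candidate tag sequence. The function name `context_mutual_information`, the boundary markers `<s>` and `</s>`, and the toy tag sequences are all hypothetical and are not taken from the paper.

```python
from collections import Counter
from math import log2


def context_mutual_information(tagged_sentences, candidate):
    """Estimate, in bits, the mutual information between the tag that
    precedes and the tag that follows each occurrence of `candidate`.

    Assumption (not from the source): this left/right context MI is used
    as a stand-in for the paper's criterion.
    """
    candidate = tuple(candidate)
    n = len(candidate)
    pairs = Counter()
    for sentence in tagged_sentences:
        # Pad with hypothetical boundary markers so every occurrence
        # has both a left and a right neighbour.
        padded = ["<s>"] + list(sentence) + ["</s>"]
        for i in range(1, len(padded) - n):
            if tuple(padded[i:i + n]) == candidate:
                pairs[(padded[i - 1], padded[i + n])] += 1

    total = sum(pairs.values())
    if total == 0:
        return 0.0

    # Marginal counts of left and right context tags.
    left, right = Counter(), Counter()
    for (l, r), count in pairs.items():
        left[l] += count
        right[r] += count

    # I(L; R) = sum over (l, r) of p(l, r) * log2(p(l, r) / (p(l) * p(r)))
    mi = 0.0
    for (l, r), count in pairs.items():
        p_lr = count / total
        mi += p_lr * log2(p_lr * total * total / (left[l] * right[r]))
    return mi


# Toy data: the left and right contexts of "DT NN" determine each other.
correlated = [["V", "DT", "NN", "P"], ["P", "DT", "NN", "V"]]
# Toy data: every left tag occurs equally often with every right tag.
independent = [
    ["V", "DT", "NN", "P"],
    ["V", "DT", "NN", "V"],
    ["P", "DT", "NN", "P"],
    ["P", "DT", "NN", "V"],
]
mi_correlated = context_mutual_information(correlated, ("DT", "NN"))
mi_independent = context_mutual_information(independent, ("DT", "NN"))
```

Under this assumed reading, a candidate sequence whose left and right contexts depend on each other scores higher than one whose contexts are statistically independent. How such a score would feed into cluster selection or the Minimum Description Length step is not specified in the abstract and is left out here.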