A generative constituent-context model for improved grammar induction
ACL '02 Proceedings of the 40th Annual Meeting on Association for Computational Linguistics
We present a generative probabilistic model for the unsupervised learning of hierarchical natural language syntactic structure. Unlike most previous work, we do not learn a context-free grammar, but rather induce a distributional model of constituents which explicitly relates constituent yields and their linear contexts. Parameter search with EM produces higher quality analyses for human language data than those previously exhibited by unsupervised systems, giving the best published unsupervised parsing results on the ATIS corpus. Experiments on Penn Treebank sentences of comparable length show an even higher constituent F1 of 71% on non-trivial brackets. We compare distributionally induced and actual part-of-speech tags as input data, and examine extensions to the basic model. We discuss errors made by the system, compare the system to previous models, and discuss upper bounds, lower bounds, and stability for this task.
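For readers unfamiliar with the metric, the following is a minimal sketch of how an unlabeled constituent F1 over non-trivial brackets might be computed for one sentence. It is an illustration, not the evaluation code behind the reported figures. Assumptions: a bracket is a (start, end) pair of token offsets with an exclusive end; "non-trivial" is taken to mean every span except single words and the whole sentence; the names `nontrivial` and `bracket_f1` are hypothetical. Corpus-level scores are normally aggregated over all sentences, which this per-sentence sketch does not do.

```python
from typing import Iterable, Set, Tuple

Span = Tuple[int, int]  # (start, end) token offsets, end exclusive


def nontrivial(spans: Iterable[Span], sentence_length: int) -> Set[Span]:
    """Keep only spans longer than one word that do not cover the whole sentence.

    Assumption: this is what "non-trivial brackets" means here.
    """
    return {
        (start, end)
        for (start, end) in spans
        if end - start > 1 and (start, end) != (0, sentence_length)
    }


def bracket_f1(proposed: Iterable[Span], gold: Iterable[Span],
               sentence_length: int) -> float:
    """Unlabeled bracket F1 for a single sentence (hypothetical helper)."""
    p = nontrivial(proposed, sentence_length)
    g = nontrivial(gold, sentence_length)
    matched = len(p & g)
    if matched == 0:
        # Covers empty bracket sets and the no-overlap case.
        return 0.0
    precision = matched / len(p)
    recall = matched / len(g)
    return 2 * precision * recall / (precision + recall)


# Five-token sentence; the two analyses differ in one internal bracket.
gold_spans = {(0, 5), (0, 2), (2, 5), (3, 5)}
proposed_spans = {(0, 5), (0, 2), (2, 5), (2, 4)}
score = bracket_f1(proposed_spans, gold_spans, sentence_length=5)
```

In this example two of the three non-trivial proposed brackets match the gold brackets, so precision and recall are both 2/3 and the F1 is 2/3.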