An all-subtrees approach to unsupervised parsing
ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
We propose a generalization of the supervised DOP model to unsupervised learning. The new model, which we call U-DOP, first assigns all possible unlabeled binary trees to a set of sentences and then uses all subtrees from (a large subset of) these binary trees to compute the most probable parse trees. We show how U-DOP can be implemented by a PCFG-reduction technique and report competitive results on English (WSJ), German (NEGRA) and Chinese (CTB) data. To the best of our knowledge, this is the first paper that accurately bootstraps structure for Wall Street Journal sentences of up to 40 words, obtaining roughly the same accuracy as a binarized supervised PCFG. We show that previous approaches to unsupervised parsing have shortcomings in that they constrain the lexical context, the structural context, or both.
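The first step described above, assigning every unlabeled binary tree to a sentence, can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function names `binary_trees` and `count_binary_trees` and the nested-tuple tree representation are assumptions made for illustration only. The sketch also shows why working with a subset of trees becomes attractive: a sentence of n words has Catalan(n-1) binary bracketings, which grows very quickly with n.

```python
from functools import lru_cache
from math import comb


def binary_trees(words):
    """Return every unlabeled binary bracketing of `words`.

    Trees are nested 2-tuples; a leaf is the word string itself.
    (Hypothetical helper for illustration, not the paper's code.)
    """
    words = tuple(words)

    @lru_cache(maxsize=None)
    def build(i, j):
        # All trees spanning words[i:j]; a one-word span is a single leaf.
        if j - i == 1:
            return (words[i],)
        trees = []
        # Try every split point and combine all left and right subtrees.
        for k in range(i + 1, j):
            for left in build(i, k):
                for right in build(k, j):
                    trees.append((left, right))
        return tuple(trees)

    return list(build(0, len(words)))


def count_binary_trees(n):
    """Number of binary bracketings of an n-word sentence: Catalan(n - 1)."""
    m = n - 1
    return comb(2 * m, m) // (m + 1)


if __name__ == "__main__":
    for tree in binary_trees(["the", "dog", "barks"]):
        print(tree)
    print(count_binary_trees(10))
```

Enumeration like this is only feasible for short sentences; for longer ones the count from `count_binary_trees` makes exhaustive listing impractical, which is consistent with the abstract's mention of using a large subset of the trees.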