The ATIS spoken language systems pilot corpus
HLT '90 Proceedings of the workshop on Speech and Natural Language
Inside-outside reestimation from partially bracketed corpora
ACL '92 Proceedings of the 30th annual meeting on Association for Computational Linguistics
Probabilistic tree-adjoining grammar as a framework for statistical natural language processing
COLING '92 Proceedings of the 14th conference on Computational linguistics - Volume 2
Stochastic lexicalized tree-adjoining grammars
COLING '92 Proceedings of the 14th conference on Computational linguistics - Volume 2
A computational model of language performance: Data Oriented Parsing
COLING '92 Proceedings of the 14th conference on Computational linguistics - Volume 3
Natural Language Engineering
Context-sensitive spoken dialogue processing with the DOP model
Natural Language Engineering
Evaluating two methods for Treebank grammar compaction
Natural Language Engineering
Experiments with corpus-based LFG specialization
ANLC '00 Proceedings of the sixth conference on Applied natural language processing
Review of "Statistical language learning" by Eugene Charniak. The MIT Press 1993.
Computational Linguistics
EACL '95 Proceedings of the seventh conference on European chapter of the Association for Computational Linguistics
A DOP model for semantic interpretation
ACL '98 Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics and Eighth Conference of the European Chapter of the Association for Computational Linguistics
A probabilistic corpus-driven model for lexical-functional analysis
COLING '98 Proceedings of the 17th international conference on Computational linguistics - Volume 1
Parsing algorithms and metrics
ACL '96 Proceedings of the 34th annual meeting on Association for Computational Linguistics
Parsing with the shortest derivation
COLING '00 Proceedings of the 18th conference on Computational linguistics - Volume 1
Towards a more careful evaluation of broad coverage parsing systems
COLING '96 Proceedings of the 16th conference on Computational linguistics - Volume 1
CoNLL-X '06 Proceedings of the Tenth Conference on Computational Natural Language Learning
Theoretical evaluation of estimation methods for data-oriented parsing
EACL '06 Proceedings of the Eleventh Conference of the European Chapter of the Association for Computational Linguistics: Posters & Demonstrations
Bayesian learning of a tree substitution grammar
ACLShort '09 Proceedings of the ACL-IJCNLP 2009 Conference Short Papers
Simple, accurate parsing with an all-fragments grammar
ACL '10 Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics
Judging grammaticality with tree substitution grammar derivations
HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies: short papers - Volume 2
The surprising variance in shortest-derivation parsing
HLT '11 Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies: short papers - Volume 2
Judging grammaticality with count-induced tree substitution grammars
Proceedings of the Seventh Workshop on Building Educational Applications Using NLP
Toward Tree Substitution Grammars with latent annotations
WILS '12 Proceedings of the NAACL-HLT Workshop on the Induction of Linguistic Structure
In Data Oriented Parsing (DOP), an annotated corpus is used as a stochastic grammar. An input string is parsed by combining subtrees from the corpus. As a consequence, one parse tree can usually be generated by several derivations that involve different subtrees. This leads to a statistical model in which the probability of a parse is equal to the sum of the probabilities of all its derivations. An informal introduction to DOP is given in (Scha, 1990), while (Bod, 1992a) provides a formalization of the theory. In this paper we compare DOP with other stochastic grammars in the context of Formal Language Theory. We prove that not every DOP model has a strongly equivalent stochastic CFG that also assigns the same probabilities to the parses. We show that the maximum probability parse can be estimated in polynomial time by applying Monte Carlo techniques. The model was tested on a set of hand-parsed strings from the Air Travel Information System (ATIS) spoken language corpus. Preliminary experiments yield 96% test set parsing accuracy.
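The two central ideas of the abstract, that a parse's probability is the sum over its derivations and that the most probable parse can be estimated by sampling, can be illustrated with a minimal Python sketch. This is not the paper's algorithm: the derivation list, the parse labels, the probabilities, and both function names are hypothetical, and the derivations are enumerated explicitly only because the toy example is tiny. A real DOP parser would sample derivations without enumerating them.

```python
import random
from collections import Counter, defaultdict

# Hypothetical toy data (not from the paper): each derivation of one input
# string is a (parse_label, derivation_probability) pair. Several derivations
# may yield the same parse tree.
DERIVATIONS = [
    ("parse_A", 0.10),
    ("parse_A", 0.05),
    ("parse_A", 0.05),
    ("parse_B", 0.12),
    ("parse_B", 0.03),
]


def parse_probabilities(derivations):
    """Exact parse probability: sum the probabilities of all derivations
    that yield the same parse."""
    totals = defaultdict(float)
    for parse, prob in derivations:
        totals[parse] += prob
    return dict(totals)


def monte_carlo_best_parse(derivations, n_samples=10000, seed=0):
    """Estimate the most probable parse by sampling derivations in
    proportion to their probabilities and counting the parses they yield."""
    rng = random.Random(seed)
    parses = [parse for parse, _ in derivations]
    weights = [prob for _, prob in derivations]
    sampled = rng.choices(parses, weights=weights, k=n_samples)
    counts = Counter(sampled)
    best_parse = counts.most_common(1)[0][0]
    return best_parse, counts


if __name__ == "__main__":
    print(parse_probabilities(DERIVATIONS))
    best, counts = monte_carlo_best_parse(DERIVATIONS)
    print(best, dict(counts))
```

In this toy data the single most probable derivation (0.12) belongs to `parse_B`, yet `parse_A` has the larger summed probability (0.20 against 0.15). That gap is why the most probable derivation is not a reliable stand-in for the most probable parse, and why the sampled parse frequencies are counted instead.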