Parallel distributed processing: explorations in the microstructure of cognition, vol. 1: foundations
The ATIS spoken language systems pilot corpus
HLT '90 Proceedings of the workshop on Speech and Natural Language
Data-Oriented Parsing
Generalized probabilistic LR parsing of natural language (Corpora) with unification-based grammars
Computational Linguistics - Special issue on using large corpora: I
Building a large annotated corpus of English: the penn treebank
Computational Linguistics - Special issue on using large corpora: II
Using an annotated corpus as a stochastic grammar
EACL '93 Proceedings of the sixth conference on European chapter of the Association for Computational Linguistics
Efficiency, robustness and accuracy in Picky chart parsing
ACL '92 Proceedings of the 30th annual meeting on Association for Computational Linguistics
Stochastic lexicalized tree-adjoining grammars
COLING '92 Proceedings of the 14th conference on Computational linguistics - Volume 2
A computational model of language performance: Data Oriented Parsing
COLING '92 Proceedings of the 14th conference on Computational linguistics - Volume 3
Computational complexity of probabilistic disambiguation by means of tree-grammars
COLING '96 Proceedings of the 16th conference on Computational linguistics - Volume 2
EMNLP '06 Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing
Inducing Tree-Substitution Grammars
The Journal of Machine Learning Research
We address the question of whether there exists a polynomial-time algorithm for computing the most probable parse tree of a sentence generated by a data-oriented parsing (DOP) model (Scha, 1990; Bod, 1992, 1993a). To this end, we describe DOP as a stochastic tree-substitution grammar (STSG). In STSG, a tree can be generated by exponentially many derivations involving different elementary trees. The probability of a tree is equal to the sum of the probabilities of all its derivations.

We show that in STSG, in contrast with stochastic context-free grammar, the Viterbi algorithm cannot be used for computing the most probable tree of a string. We propose a simple modification of Viterbi that, by means of a "select-random" search, estimates the most probable tree of a string in polynomial time.

Experiments with DOP on ATIS show that in only 68% of the cases does the most probable derivation of a string generate the most probable tree of that string. Consequently, the parse accuracy obtained by the most probable trees (96%) is dramatically higher than the parse accuracy obtained by the most probable derivations (65%).

It is still an open question whether the most probable tree of a string can be deterministically computed in polynomial time.
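The gap between the most probable derivation and the most probable tree can be illustrated with a small sketch. It is not the algorithm of the paper: the derivation list, its probabilities, and all function names below are hypothetical, and the derivations are enumerated explicitly, which is only feasible for a toy case (in a real STSG there may be exponentially many). The sampling function only mimics the general idea of estimating the most probable tree from randomly drawn derivations.

```python
import random
from collections import Counter, defaultdict

# Hypothetical toy data: each entry is one derivation of the same string,
# given as (tree it generates, probability of that derivation).
DERIVATIONS = [
    ("tree_A", 0.30),  # the single most probable derivation
    ("tree_B", 0.25),
    ("tree_B", 0.25),
    ("tree_B", 0.20),
]

def tree_of_best_derivation(derivations):
    """Tree produced by the most probable derivation (Viterbi-style choice)."""
    return max(derivations, key=lambda d: d[1])[0]

def tree_probabilities(derivations):
    """Probability of each tree: the sum over all derivations generating it."""
    totals = defaultdict(float)
    for tree, prob in derivations:
        totals[tree] += prob
    return dict(totals)

def most_probable_tree_exact(derivations):
    """Exact most probable tree, by summing over every derivation."""
    totals = tree_probabilities(derivations)
    return max(totals, key=totals.get)

def most_probable_tree_sampled(derivations, n_samples, rng):
    """Estimate the most probable tree from randomly sampled derivations.

    Derivations are drawn in proportion to their probability; the tree
    that is generated most often is returned as the estimate.
    """
    trees = [tree for tree, _ in derivations]
    weights = [prob for _, prob in derivations]
    draws = rng.choices(trees, weights=weights, k=n_samples)
    return Counter(draws).most_common(1)[0][0]

if __name__ == "__main__":
    rng = random.Random(0)
    print(tree_of_best_derivation(DERIVATIONS))
    print(most_probable_tree_exact(DERIVATIONS))
    print(most_probable_tree_sampled(DERIVATIONS, 2000, rng))
```

In this toy case the best single derivation yields `tree_A`, while `tree_B` collects more total probability across its three derivations, which is why picking the best derivation does not in general recover the best tree.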