We present a novel approach to Data-Oriented Parsing (DOP). Like other DOP models, our parser uses syntactic fragments of arbitrary size from a treebank to analyze new sentences, but, crucially, it uses only those fragments that occur at least twice. This criterion lets us work with a relatively small but representative set of fragments, which can serve as the symbolic backbone of several probabilistic generative models. For parsing we define a transform-backtransform approach that allows us to use standard PCFG technology, making our results easily replicable. By standard Parseval metrics, our best model is on par with many state-of-the-art parsers, while offering some complementary benefits: a simple generative probability model and an explicit representation of the larger units of grammar.
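The transform-backtransform idea can be illustrated with a minimal sketch: each recurring fragment is flattened into a single CFG rule (fragment root to fragment frontier) carrying a unique identifier, so a derivation produced by standard PCFG machinery can be mapped back to the original fragments. The fragment encoding and names below are illustrative assumptions, not the authors' actual implementation.

```python
# Hedged sketch of the transform step: a fragment is a (label, children)
# pair, where children are sub-fragments or lexical strings; a node with
# no children is an open substitution site. These conventions are
# assumptions for illustration only.

fragments = [
    # toy fragment: S -> NP (VP (V "saw") NP), with two open NP sites
    ("S", [("NP", []), ("VP", [("V", ["saw"]), ("NP", [])])]),
    ("NP", [("DT", ["the"]), ("NN", [])]),
]

def frontier(node):
    """Collect a fragment's frontier: substitution sites and lexical leaves."""
    label, children = node
    if not children:
        return [label]            # open substitution site
    out = []
    for child in children:
        if isinstance(child, str):
            out.append(child)     # lexical leaf
        else:
            out.extend(frontier(child))
    return out

def transform(frags):
    """Map each fragment to a uniquely-labelled flat CFG rule.

    The unique rule id makes the backtransform trivial: after parsing
    with a standard PCFG parser, look each id up in `backmap` to restore
    the internal structure of the fragment.
    """
    rules, backmap = [], {}
    for i, frag in enumerate(frags):
        root, _ = frag
        rule_id = f"{root}@{i}"   # hypothetical id scheme
        rules.append((rule_id, root, frontier(frag)))
        backmap[rule_id] = frag
    return rules, backmap
```

Probabilities attached to the flat rules would then make this a standard PCFG over the fragment set.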