Data-Oriented Parsing (DOP) ranks among the best parsing schemes, pairing state-of-the-art parsing accuracy with the psycholinguistic insight that larger chunks of syntactic structure are relevant grammatical and probabilistic units. Parsing with the DOP model, however, involves many CPU cycles and a considerable amount of duplicated work, brought on by the concept of multiple derivations, which is necessary for probabilistic processing but not convincingly related to a proper linguistic backbone. It is possible, however, to reinterpret the DOP model as a pattern-matching model that tries to maximize the size of the substructures constructing the parse, rather than the probability of the parse. By emphasizing this memory-based aspect of the DOP model, multiple derivations can be done away with, opening up possibilities for efficient Viterbi-style optimizations, while still retaining acceptable parsing accuracy through enhanced context-sensitivity.
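The core idea of the reinterpretation above can be sketched with a toy dynamic program. This is only an illustration, not the paper's actual algorithm: the fragment inventory, the tokens-only representation (real DOP fragments are subtrees, not flat token spans), and the squared-length scoring that favors few large chunks over many small ones are all assumptions made for the sketch. What it does show is the Viterbi-style step: a single best cover is computed directly, with no summation over multiple derivations.

```python
def best_cover(tokens, fragments):
    """Cover `tokens` with stored fragments, preferring the largest chunks.

    Viterbi-style dynamic program: best[i] holds the best-scoring cover of
    tokens[:i]. The score sums squared fragment lengths, a stand-in choice
    that rewards few large fragments over many small ones.
    """
    n = len(tokens)
    best = [None] * (n + 1)          # best[i] = (score, list of fragments)
    best[0] = (0, [])
    for i in range(1, n + 1):
        for j in range(i):
            frag = tuple(tokens[j:i])
            if frag in fragments and best[j] is not None:
                score = best[j][0] + (i - j) ** 2
                if best[i] is None or score > best[i][0]:
                    best[i] = (score, best[j][1] + [frag])
    return best[n][1] if best[n] is not None else None

# Hypothetical toy fragment memory, purely for demonstration.
fragments = {("the", "cat"), ("sat",), ("on", "the", "mat"),
             ("the",), ("cat",), ("on",), ("mat",)}
print(best_cover("the cat sat on the mat".split(), fragments))
# → [('the', 'cat'), ('sat',), ('on', 'the', 'mat')]
```

Because each span keeps only its single best cover, the table is filled in O(n²) span checks, in contrast to probabilistic DOP, where the same parse must be credited with the summed mass of all its derivations.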