The correct attachment of prepositional phrases (PPs) is a central disambiguation problem in parsing natural languages. This paper compares the baseline situation in English, German and Swedish, based on manual PP attachments in various treebanks for these languages. We argue that cross-language comparisons of the disambiguation results in previous research are impossible because the training and test sets were built with different selection procedures. We perform uniform treebank queries and show that English has the highest noun attachment rate, followed by Swedish and German. We also show that the high rate in English is dominated by the preposition "of". From our study we derive a list of criteria for profiling data sets for PP attachment experiments.