Computational Linguistics
In this paper, we investigated how extracting different types of multiword expressions (MWEs) affects the accuracy of a data-driven dependency parser for a morphologically rich language, Turkish. We showed that unifying MWEs of one type, namely compound verb and noun formations, in the training stage lowers parsing accuracy by increasing lexical sparsity. Training on a variant of the treebank that excludes this MWE type gave a statistically significant improvement. Our extrinsic evaluation of an ideal MWE recognizer, restricted to named entities, duplications, numbers, dates and a predefined list of compound prepositions, showed that preprocessing the test data in this way would improve labeled parsing accuracy by 1.5%.
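The preprocessing step described above, in which only selected MWE types are unified into single tokens before parsing, can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `unify_mwes`, the span format `(start, end, type)` with an exclusive end index, the type labels, the underscore joiner and the English toy sentence are all assumptions made for illustration.

```python
from typing import List, Set, Tuple

# Hypothetical type labels; the abstract names the categories but not a tag set.
UNIFIED_TYPES: Set[str] = {
    "named_entity", "duplication", "number", "date", "compound_preposition",
}


def unify_mwes(
    tokens: List[str],
    spans: List[Tuple[int, int, str]],
    allowed_types: Set[str],
) -> List[str]:
    """Merge each allowed MWE span into one token joined by underscores.

    Spans are (start, end, type) with an exclusive end index and are
    assumed not to overlap. Spans of other types are left as separate tokens.
    """
    # Index the spans we are going to merge by their start position.
    starts = {s: e for s, e, mwe_type in spans if mwe_type in allowed_types}
    merged: List[str] = []
    i = 0
    while i < len(tokens):
        if i in starts:
            end = starts[i]
            merged.append("_".join(tokens[i:end]))
            i = end
        else:
            merged.append(tokens[i])
            i += 1
    return merged


# Toy example (English, for readability only).
tokens = ["They", "took", "place", "in", "New", "York", "on", "3", "May", "2011"]
spans = [
    (1, 3, "compound_verb"),   # not unified: this type is excluded
    (4, 6, "named_entity"),
    (7, 10, "date"),
]
result = unify_mwes(tokens, spans, UNIFIED_TYPES)
print(result)
```

In this sketch the compound verb span stays split into its component tokens, mirroring the finding that unifying that type increases lexical sparsity, while the named entity and the date each become a single token.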