Large Margin Classification Using the Perceptron Algorithm
Machine Learning - The Eleventh Annual Conference on computational Learning Theory
Head-driven statistical models for natural language parsing
Head-driven statistical models for natural language parsing
Building a large annotated corpus of English: the penn treebank
Computational Linguistics - Special issue on using large corpora: II
A test of the leaf-ancestor metric for parse accuracy
Natural Language Engineering
A maximum-entropy-inspired parser
NAACL 2000 Proceedings of the 1st North American chapter of the Association for Computational Linguistics conference
XTAG system: a wide coverage grammar for English
COLING '94 Proceedings of the 15th conference on Computational linguistics - Volume 2
ACL '04 Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics
Learning accurate, compact, and interpretable tree annotation
ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
EACL '09 Proceedings of the 12th Conference of the European Chapter of the Association for Computational Linguistics
Scalable discriminative parsing for German
IWPT '09 Proceedings of the 11th International Conference on Parsing Technologies
Improving generative statistical parsing with semi-supervised word clustering
IWPT '09 Proceedings of the 11th International Conference on Parsing Technologies
Cross parser evaluation and tagset variation: a French treebank study
IWPT '09 Proceedings of the 11th International Conference on Parsing Technologies
On statistical parsing of French with supervised and semi-supervised strategies
CLAGI '09 Proceedings of the EACL 2009 Workshop on Computational Linguistic Aspects of Grammatical Inference
Statistical parsing of morphologically rich languages (SPMRL): what, how and whither
SPMRL '10 Proceedings of the NAACL HLT 2010 First Workshop on Statistical Parsing of Morphologically-Rich Languages
SPMRL '10 Proceedings of the NAACL HLT 2010 First Workshop on Statistical Parsing of Morphologically-Rich Languages
Benchmarking of statistical dependency parsers for French
COLING '10 Proceedings of the 23rd International Conference on Computational Linguistics: Posters
French parsing enhanced with a word clustering method based on a syntactic lexicon
SPMRL '11 Proceedings of the Second Workshop on Statistical Parsing of Morphologically Rich Languages
Parsing models for identifying multiword expressions
Computational Linguistics
Hi-index | 0.00 |
This paper shows that training a lexicalized parser on a lemmatized morphologically-rich treebank such as the French Treebank slightly improves parsing results. We also show that lemmatizing a similar in size subset of the English Penn Treebank has almost no effect on parsing performance with gold lemmas and leads to a small drop of performance when automatically assigned lemmas and POS tags are used. This highlights two facts: (i) lemmatization helps to reduce lexicon data-sparseness issues for French, (ii) it also makes the parsing process sensitive to correct assignment of POS tags to unknown words.