Correlating natural language parser performance with statistical measures of the text

Authors:
Yi Zhang; Rui Wang
Affiliations:
LT-Lab, German Research Center for Artificial Intelligence, Computational Linguistics, Saarland University; Computational Linguistics, Saarland University
Venue:
KI'09 Proceedings of the 32nd annual German conference on Advances in artificial intelligence
Year:
2009

Abstract

Natural language parsing is one of the central tasks in natural language processing and is widely used in many AI fields. In this paper, we address the issue of parser performance evaluation, in particular how performance varies across datasets. We propose three simple statistical measures to characterize datasets and evaluate their correlation with parser performance. The results show that parsers differ both in how much their performance varies and in how sensitive they are to these measures. The method can be used to guide the choice of natural language parsers for new-domain applications, as well as the systematic combination of parsers for better parsing accuracy.
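
The general idea of correlating a dataset-level statistic with parser accuracy can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not name the three measures, so mean sentence length is used here as a hypothetical stand-in, and the sentences and accuracy scores are invented placeholders, not results from the paper. The function names are likewise illustrative.

```python
import math


def mean_sentence_length(sentences):
    """Average number of whitespace-separated tokens per sentence.

    Hypothetical stand-in measure; the abstract does not say which
    statistics the authors actually use.
    """
    return sum(len(s.split()) for s in sentences) / len(sentences)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (spread_x * spread_y)


# Placeholder datasets and parser scores, invented purely for illustration.
datasets = {
    "news": ["The market fell sharply on Monday .", "Analysts expect a recovery ."],
    "bio": ["The protein binds to the receptor in the presence of calcium ions ."],
    "chat": ["ok thanks", "see you later"],
}
placeholder_scores = {"news": 0.90, "bio": 0.82, "chat": 0.75}

names = sorted(datasets)
measure = [mean_sentence_length(datasets[name]) for name in names]
scores = [placeholder_scores[name] for name in names]
print(pearson(measure, scores))
```

Repeating this computation for each parser and each measure would give one coefficient per pair, which is the kind of comparison the abstract describes when it says parsers differ in their sensitivity to the measures.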