Using the distribution of performance for studying statistical NLP systems and corpora

Authors:
Yuval Krymolowski
Affiliations:
Bar-Ilan University, Ramat Gan, Israel
Venue:
ELDS '01 Proceedings of the workshop on Evaluation for Language and Dialogue Systems - Volume 9
Year:
2001

Citing 7
Cited 0

Learning to resolve natural language ambiguities: a unified approach

AAAI '98/IAAI '98 Proceedings of the fifteenth national/tenth conference on Artificial intelligence/Innovative applications of artificial intelligence
Approximate statistical tests for comparing supervised classification learning algorithms

Neural Computation
Learning Quickly When Irrelevant Attributes Abound: A New Linear-Threshold Algorithm

Machine Learning
Building a large annotated corpus of English: the penn treebank

Computational Linguistics - Special issue on using large corpora: II
More accurate tests for the statistical significance of result differences

COLING '00 Proceedings of the 18th conference on Computational linguistics - Volume 2
A study of cross-validation and bootstrap for accuracy estimation and model selection

IJCAI'95 Proceedings of the 14th international joint conference on Artificial intelligence - Volume 2
A bootstrap evaluation of the effect of data splitting on financial time series

IEEE Transactions on Neural Networks

Quantified Score

Hi-index	0.00

Visualization

Abstract

Statistical NLP systems are frequently evaluated and compared on the basis of their performances on a single split of training and test data. Results obtained using a single split are, however, subject to sampling noise. In this paper we argue in favour of reporting a distribution of performance figures, obtained by resampling the training data, rather than a single number. The additional information from distributions can be used to make statistically quantified statements about differences across parameter settings, systems, and corpora.