Error analysis is a key step in developing statistical parsers: typical failure cases are usually discovered by manually examining parser output. In this paper we argue that this process can be sped up by considering the output of an ensemble of parsers. We build the ensemble by resampling small proportions (from 10% upwards) of the training data and exploiting the high diversity of the resulting parsers, which stems from the sparseness of natural-language data. By varying the sample size, we can trace the gradual learning of each instance and classify instances into a few types. This division helps distinguish instances that are hard for the system from instances that may be learned in principle. We suggest that such analysis can yield a qualitative approach to the evaluation of statistical parsers.
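The trace-and-classify idea can be sketched in miniature. The snippet below is an illustrative assumption, not the paper's actual setup: in place of statistical parsers trained on resampled treebank fractions, it trains simple 1-nearest-neighbour classifiers on increasing fractions of a toy dataset, records for each test instance whether it is classified correctly at each sample size, and then sorts instances into a crude typology (the function names, fractions, and labels "easy"/"hard"/"learned"/"unstable" are all invented for this sketch).

```python
import random

def knn_predict(train, x):
    """1-nearest-neighbour prediction on 1-D points (toy stand-in for a parser)."""
    nearest = min(train, key=lambda t: abs(t[0] - x))
    return nearest[1]

def trace_learning(train, test, fractions, seed=0):
    """For each test instance, record correctness at each training-sample size,
    mimicking the paper's idea of tracing gradual learning under resampling."""
    rng = random.Random(seed)
    traces = {i: [] for i in range(len(test))}
    for frac in fractions:
        k = max(1, int(frac * len(train)))
        sample = rng.sample(train, k)  # resample a small proportion of the data
        for i, (x, y) in enumerate(test):
            traces[i].append(knn_predict(sample, x) == y)
    return traces

def classify_instance(trace):
    """Crude typology over a correctness trace: 'easy' if always right,
    'hard' if never right, 'learned' if it becomes right as the sample grows."""
    if all(trace):
        return "easy"
    if not any(trace):
        return "hard"
    return "learned" if trace[-1] else "unstable"
```

A short usage example: with `train = [(float(x), int(x > 0)) for x in range(-5, 6) if x != 0]` and test instances such as `(4.0, 1)`, `trace_learning(train, test, [0.3, 0.6, 1.0])` yields one boolean trace per instance, and `classify_instance` separates instances the toy learner always gets right from those it only masters once the sample is large enough.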