Evaluating information extraction

Authors:
Andrea Esuli;Fabrizio Sebastiani
Affiliations:
Istituto di Scienza e Tecnologie dell'Informazione, Consiglio Nazionale delle Ricerche, Pisa, Italy;Istituto di Scienza e Tecnologie dell'Informazione, Consiglio Nazionale delle Ricerche, Pisa, Italy
Venue:
CLEF'10 Proceedings of the 2010 international conference on Multilingual and multimodal information access evaluation: cross-language evaluation forum
Year:
2010

Citing 11
Cited 1

Evaluating and optimizing autonomous text classification systems

SIGIR '95 Proceedings of the 18th annual international ACM SIGIR conference on Research and development in information retrieval
Machine Learning for Information Extraction in Informal Domains

Machine Learning - Special issue on information retrieval
Machine learning in automated text categorization

ACM Computing Surveys (CSUR)
Boosted Wrapper Induction

Proceedings of the Seventeenth National Conference on Artificial Intelligence and Twelfth Conference on Innovative Applications of Artificial Intelligence
A support vector method for multivariate performance measures

ICML '05 Proceedings of the 22nd international conference on Machine learning
Introduction to the CoNLL-2002 shared task: language-independent named entity recognition

COLING-02 proceedings of the 6th conference on Natural language learning - Volume 20
Introduction to the CoNLL-2003 shared task: language-independent named entity recognition

CONLL '03 Proceedings of the seventh conference on Natural language learning at HLT-NAACL 2003 - Volume 4
An introduction to ROC analysis

Pattern Recognition Letters - Special issue: ROC analysis in pattern recognition
Training conditional random fields with multivariate evaluation measures

ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
An effective two-stage model for exploiting non-local dependencies in named entity recognition

ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
Information Extraction

Foundations and Trends in Databases

Enhancing opinion extraction by automatically annotated lexical resources

LTC'09 Proceedings of the 4th conference on Human language technology: challenges for computer science and linguistics

Quantified Score

Hi-index	0.00

Visualization

Abstract

The issue of how to experimentally evaluate information extraction (IE) systems has received hardly any satisfactory solution in the literature. In this paper we propose a novel evaluation model for IE and argue that, among others, it allows (i) a correct appreciation of the degree of overlap between predicted and true segments, and (ii) a fair evaluation of the ability of a system to correctly identify segment boundaries. We describe the properties of this models, also by presenting the result of a re-evaluation of the results of the CoNLL'03 and CoNLL'02 Shared Tasks on Named Entity Extraction.