The F-measure, computed from precision and recall, has been used for 25 years to evaluate classification algorithms in text mining. For classification and information retrieval, some authors prefer the break-even point. Both measures have drawbacks, however: they rely on binary logic and cannot incorporate the assessment of a user (judge). This paper proposes a new approach to evaluation. First, we distinguish classification from categorization from a semantic point of view. Then, we introduce a new measure, the K-measure, which generalizes both the F-measure and the break-even point and accommodates user requirements. Finally, we propose a methodology for evaluation.
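For readers unfamiliar with the two baseline measures named above, the following minimal sketch shows their standard textbook definitions. It is an illustration only, not the authors' code: the function names are hypothetical, the break-even point is computed under the assumption that every relevant item appears somewhere in the ranked list, and the K-measure itself is not sketched because its definition is not given in this abstract.

```python
def precision_recall(tp, fp, fn):
    """Precision and recall from binary counts (true/false positives, false negatives)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def f_measure(precision, recall, beta=1.0):
    """Standard F-beta: weighted harmonic mean of precision and recall.

    beta > 1 weights recall more heavily, beta < 1 weights precision more heavily.
    """
    b2 = beta * beta
    denominator = b2 * precision + recall
    if denominator == 0.0:
        return 0.0
    return (1 + b2) * precision * recall / denominator


def break_even_point(ranked_relevance):
    """Value at which precision equals recall for a ranked result list.

    ranked_relevance: list of booleans ordered by descending score.
    Assumption: all relevant items are present in the list, so cutting the
    ranking at k = (number of relevant items) makes precision equal recall.
    """
    total_relevant = sum(ranked_relevance)
    if total_relevant == 0:
        return 0.0
    hits = sum(ranked_relevance[:total_relevant])
    return hits / total_relevant


if __name__ == "__main__":
    p, r = precision_recall(tp=8, fp=2, fn=4)
    print(p, r, f_measure(p, r))
    print(break_even_point([True, True, False, True, False]))
```

Both functions reduce every judgment to relevant or not relevant, which is the binary logic the abstract identifies as a limitation.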