We address two problems: (1) assessing the confidence of the standard point estimates of precision, recall and F-score, and (2) comparing the precision, recall and F-score obtained by two different methods. To do so, we use a probabilistic setting that yields posterior distributions over these performance indicators rather than point estimates. The framework is applied both to the case where different methods are run on different datasets from the same source and to the standard situation where competing results are obtained on the same data.
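As an illustration of what a posterior distribution over precision, recall and F-score can look like, here is a minimal sketch. It is not necessarily the formulation used in the paper. It assumes that the counts of true positives, false positives and false negatives follow a multinomial model with a symmetric Dirichlet prior; the prior strength `prior=1.0` and the function names `sample_prf`, `credible_interval` and `prob_greater` are hypothetical choices made for this example. The comparison helper treats the two sets of samples as independent, which corresponds to methods evaluated on different datasets; results obtained on the same data would need a paired treatment that this sketch does not cover.

```python
import random


def sample_prf(tp, fp, fn, n_samples=10000, prior=1.0, seed=0):
    """Draw posterior samples of (precision, recall, F1).

    Assumption: multinomial counts with a symmetric Dirichlet(prior) prior.
    Dirichlet draws are built from independent Gamma variates; the ratios
    below are scale-invariant, so no explicit normalisation is needed.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n_samples):
        a = rng.gammavariate(tp + prior, 1.0)  # weight of true positives
        b = rng.gammavariate(fp + prior, 1.0)  # weight of false positives
        c = rng.gammavariate(fn + prior, 1.0)  # weight of false negatives
        precision = a / (a + b)
        recall = a / (a + c)
        f1 = 2 * a / (2 * a + b + c)  # equals 2PR / (P + R)
        samples.append((precision, recall, f1))
    return samples


def credible_interval(values, mass=0.95):
    """Central credible interval read off the sorted samples."""
    ordered = sorted(values)
    lo = int((1 - mass) / 2 * len(ordered))
    hi = int((1 + mass) / 2 * len(ordered)) - 1
    return ordered[lo], ordered[hi]


def prob_greater(xs, ys):
    """Monte Carlo estimate of P(x > y) for independent sample sets."""
    return sum(x > y for x, y in zip(xs, ys)) / len(xs)


if __name__ == "__main__":
    # Illustrative counts only; they are not taken from the paper.
    system_a = sample_prf(tp=80, fp=20, fn=20, seed=1)
    system_b = sample_prf(tp=60, fp=40, fn=40, seed=2)
    f1_a = [s[2] for s in system_a]
    f1_b = [s[2] for s in system_b]
    print("95% interval for F1 of A:", credible_interval(f1_a))
    print("P(F1 of A > F1 of B):", prob_greater(f1_a, f1_b))
```

The credible interval addresses the first problem (how much confidence a point estimate deserves), and the estimated probability that one system's F1 exceeds the other's addresses the second.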