Reassessing and extending the precision and recall concepts

  • Authors: M. H. Heine
  • Affiliations: Department of Information and Library Management, University of Northumbria, Newcastle upon Tyne, England
  • Venue: MIRA'99 Proceedings of the 1999 international conference on Final Mira
  • Year: 1999

Abstract

The contrivances of 'Recall' and 'Precision' are customarily used to assess the effectiveness of document retrieval systems. Despite their extensive use in experiments, including the recent TREC experiments, and their dominance in mathematical discussions of system performance, their validity has been questioned continually since their introduction with the Cranfield experiments. This questioning has largely centred on critical analysis of the (pre-mathematical) concept of 'relevance'. Those analyses have now led to the near-consensual view amongst relevance theoreticians that two types of relevance need to be distinguished, namely (1) document 'topicality', where cognition acts as a largely passive receiving agent of knowledge that has an objective, a priori or public character, and (2) 'psychological relevance', where cognition is more actively and creatively involved in the knowledge encoded in the document, i.e. where relevance is non-public (subjective) and conditioned by the user's context, experience, etc. at a particular time. (Various synonyms for these terms have been suggested.) The continued use of Precision and Recall in their initial, largely Cranfield, form in document retrieval experiments, especially the continued, uncritical use of 'Recall', suggests that experimentalists have largely failed to internalise this distinction, since (1) Recall is meaningless under one of these viewpoints, and (2) Precision is ambiguous when the separate validities of each are recognised. Several ways in which this distinction should influence the design of future experiments on document-retrieval/cognition interactions are suggested, involving the choice of, and amendments to the definitions of, these basic measures. Lastly, a generic 3-valued vectorial approach is suggested as a means of integrating both perspectives on 'relevance' within a common evaluative framework.
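
For readers unfamiliar with the two measures, the conventional set-based definitions can be sketched as follows. This is a minimal illustration of the standard textbook form only, not the amended definitions or the 3-valued vectorial approach proposed in the paper. The function name precision_recall and the document identifiers are hypothetical, chosen purely for the example.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query, using set-based definitions.

    precision = |retrieved AND relevant| / |retrieved|
    recall    = |retrieved AND relevant| / |relevant|
    """
    retrieved = set(retrieved)
    relevant = set(relevant)
    hits = retrieved & relevant  # documents that are both retrieved and judged relevant

    # Guard against empty sets; returning 0.0 here is a convention assumed for this sketch.
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical example: four documents retrieved, five judged relevant in the collection.
retrieved_docs = ["d1", "d2", "d3", "d4"]
relevant_docs = ["d2", "d4", "d7", "d8", "d9"]

p, r = precision_recall(retrieved_docs, relevant_docs)
print(f"precision={p}, recall={r}")  # precision=0.5, recall=0.4
```

Note that the recall denominator requires the complete set of relevant documents in the collection to be known in advance, whereas precision depends only on judgements of the documents actually retrieved.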