IR system evaluation using nugget-based test collections

  • Authors: Virgil Pavlu, Shahzad Rajput, Peter B. Golbus, Javed A. Aslam
  • Affiliation: Northeastern University, Boston, MA, USA (all authors)

  • Venue: Proceedings of the Fifth ACM International Conference on Web Search and Data Mining (WSDM '12)
  • Year: 2012

Abstract

The development of information retrieval systems such as search engines relies on good test collections, including assessments of retrieved content. The widely employed Cranfield paradigm dictates that the information relevant to a topic be encoded at the level of documents, therefore requiring effectively complete document relevance assessments. As this is no longer practical for modern corpora, numerous problems arise, including scalability, reusability, and applicability. We propose a new method for relevance assessment based on relevant information rather than relevant documents. Once the relevant 'nuggets' are collected, our matching method can assess any document for relevance with high accuracy, and thus any retrieved list of documents can be scored for performance. In this paper we analyze the performance of the matching function by examining specific cases and by comparing it with other methods. We then show how these inferred relevance assessments can be used to perform IR system evaluation, discussing reusability and scalability in particular. Our main contribution is a methodology for producing test collections that are highly accurate, more complete, scalable, and reusable, that can be generated with effort comparable to existing methods, and that hold great potential for future applications.
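The central idea of the abstract, inferring document relevance by matching collected nuggets against document text and then scoring ranked lists from those inferred judgments, can be illustrated with a small sketch. The Python snippet below is a hypothetical simplification, not the authors' actual matching function: it assumes a plain term-overlap score, an arbitrary match threshold of 0.6, and it approximates average precision by normalizing over the relevant documents found in the retrieved list. The names nugget_match, infer_relevance, and average_precision are invented for illustration.

```python
# Hypothetical sketch of nugget-based assessment (not the paper's actual
# matching function). A document is scored per nugget by simple term
# overlap; a document is inferred relevant if any nugget matches it
# strongly enough; a ranked run is then evaluated from these inferred
# judgments alone, with no pooled document assessments required.

def nugget_match(nugget: str, document: str) -> float:
    """Fraction of the nugget's distinct terms that appear in the document."""
    nug_terms = set(nugget.lower().split())
    doc_terms = set(document.lower().split())
    return len(nug_terms & doc_terms) / len(nug_terms) if nug_terms else 0.0

def infer_relevance(nuggets: list[str], document: str,
                    threshold: float = 0.6) -> bool:
    """Infer binary relevance: does any nugget match the document well enough?"""
    return any(nugget_match(n, document) >= threshold for n in nuggets)

def average_precision(ranked_docs: list[str], nuggets: list[str],
                      threshold: float = 0.6) -> float:
    """Score a ranked list using only inferred relevance judgments.

    Simplification: normalizes by the relevant documents found in the
    list; true AP divides by the total number of relevant documents.
    """
    hits, precision_sum = 0, 0.0
    for rank, doc in enumerate(ranked_docs, start=1):
        if infer_relevance(nuggets, doc, threshold):
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0
```

A minimal usage example under the same assumptions:

```python
nuggets = ["complete document relevance assessments are impractical"]
run = [
    "for modern corpora, complete document relevance assessments are impractical",
    "an unrelated page about cooking",
]
print(average_precision(run, nuggets))  # 1.0: the single relevant doc is ranked first
```

Because relevance is computed on demand from the nuggets rather than read from a fixed judgment pool, a new system's run can be scored without additional assessor effort, which is the reusability and scalability argument the abstract makes.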