A nugget-based test collection construction paradigm

  • Authors: Shahzad Rajput, Virgil Pavlu, Peter B. Golbus, Javed A. Aslam

  • Affiliations: Northeastern University, Boston, MA, USA (all authors)

  • Venue: Proceedings of the 20th ACM international conference on Information and knowledge management

  • Year: 2011

Abstract

The problem of building test collections is central to the development of information retrieval systems such as search engines. Starting with a few relevant "nuggets" of information manually extracted from existing TREC corpora, we implement and test a methodology that finds and correctly assesses the vast majority of the relevant documents found by TREC assessors, as well as up to four times more relevant documents beyond those. Our methodology produces highly accurate test collections that hold the promise of addressing the issues of scalability, reusability, and applicability.