Learning-Based Pseudo-Relevance Feedback for Patent Retrieval

  • Authors: Parvaz Mahdabi; Fabio Crestani
  • Affiliations: Faculty of Informatics, University of Lugano, Switzerland (both authors)
  • Venue: IRFC'12 Proceedings of the 5th conference on Multidisciplinary Information Retrieval
  • Year: 2012

Abstract

Pseudo-relevance feedback (PRF) is an effective approach in Information Retrieval, but many experiments have shown that it is ineffective in patent retrieval. The reason is that the initial results in patent retrieval are of poor quality, so a relevance model estimated via PRF often hurts retrieval performance by introducing off-topic terms. We propose a learning-to-rank framework for estimating how effective a patent document is when used as a feedback document in PRF. Specifically, knowledge of which feedback documents were effective on past queries is used to estimate effective feedback documents for new queries. To this end, we introduce features correlated with feedback document effectiveness, defined using patent-specific content, and apply regression to predict document effectiveness from these features. We evaluated the proposed method on the CLEF-IP 2010 patent prior art search collection. Our experimental results show significantly improved retrieval accuracy over a PRF baseline that expands the query using all top-ranked documents.
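
The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not list the features or the regression model, so the two features (`ipc_overlap`, `initial_score`), the ordinary least-squares fit, the zero threshold, and all numbers below are assumptions chosen only to show the idea of learning feedback-document effectiveness from past queries and filtering the top-ranked documents of a new query.

```python
from typing import List, Sequence, Tuple


def fit_linear_regression(X: Sequence[Sequence[float]], y: Sequence[float]) -> List[float]:
    """Ordinary least squares with a bias term, solved via the normal equations."""
    rows = [[1.0] + list(x) for x in X]  # prepend 1.0 for the bias weight
    n = len(rows[0])
    # Augmented matrix [X^T X | X^T y].
    A = [
        [sum(r[i] * r[j] for r in rows) for j in range(n)]
        + [sum(r[i] * t for r, t in zip(rows, y))]
        for i in range(n)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        p = A[col][col]
        A[col] = [v / p for v in A[col]]
        for r in range(n):
            if r != col:
                f = A[r][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][n] for i in range(n)]


def predict(w: Sequence[float], x: Sequence[float]) -> float:
    """Predicted effectiveness of one candidate feedback document."""
    return w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))


def select_feedback_docs(
    w: Sequence[float],
    candidates: Sequence[Tuple[str, Sequence[float]]],
    threshold: float = 0.0,
) -> List[str]:
    """Keep only documents whose predicted effectiveness exceeds the threshold."""
    return [doc_id for doc_id, feats in candidates if predict(w, feats) > threshold]


# Toy training data from "past queries" (made-up numbers, not from the paper).
# Hypothetical features per feedback document: (ipc_overlap, initial_score).
# Target: change in retrieval quality when the document is used for feedback.
TRAIN_X = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.5, 0.5)]
TRAIN_Y = [-0.2, 0.3, -0.1, 0.4, 0.1]

# Top-ranked documents for a new query, with the same hypothetical features.
CANDIDATES = [("d1", (0.9, 0.8)), ("d2", (0.1, 0.9)), ("d3", (0.6, 0.2))]

weights = fit_linear_regression(TRAIN_X, TRAIN_Y)
selected = select_feedback_docs(weights, CANDIDATES)
print(selected)
```

In a full system, only the documents in `selected` would be used to estimate the relevance model and expand the query, in contrast to the baseline that uses all top-ranked documents.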