Learning-Based Pseudo-Relevance Feedback for Patent Retrieval

Authors:
Parvaz Mahdabi; Fabio Crestani
Affiliations:
Faculty of Informatics, University of Lugano, Switzerland; Faculty of Informatics, University of Lugano, Switzerland
Venue:
IRFC'12 Proceedings of the 5th conference on Multidisciplinary Information Retrieval
Year:
2012

Abstract

Pseudo-relevance feedback (PRF) is an effective approach in Information Retrieval, but many experiments have shown that it is ineffective in patent retrieval. The reason is that the initial results in patent retrieval are of poor quality, so estimating a relevance model via PRF often hurts retrieval performance by introducing off-topic terms. We propose a learning-to-rank framework for estimating how effective a patent document is when used as feedback in PRF. Specifically, knowledge of effective feedback documents on past queries is used to estimate effective feedback documents for new queries. This is achieved by introducing features correlated with feedback document effectiveness, which we define using patent-specific content. We then apply regression to predict document effectiveness from the proposed features. We evaluate the proposed method on the CLEF-IP 2010 patent prior-art search collection. Our experimental results show significantly improved retrieval accuracy over a PRF baseline that expands the query using all top-ranked documents.
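
The pipeline described in the abstract (learn from past queries which feedback documents help, then filter feedback documents for a new query before expansion) might be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the two features (an initial retrieval score and a classification-overlap flag), the toy training labels, the zero threshold, and all function names are hypothetical, and plain least-squares regression stands in for whatever regressor the paper actually uses.

```python
from collections import Counter

def fit_linear(X, y, lr=0.5, epochs=2000):
    """Least-squares linear regression fitted by batch gradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sum(wj * xj for wj, xj in zip(w, xi)) + b - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict(model, features):
    """Predicted effectiveness of one document as a feedback document."""
    w, b = model
    return sum(wj * xj for wj, xj in zip(w, features)) + b

def select_feedback_docs(model, candidates, threshold=0.0):
    """Keep only documents predicted to help (threshold is an assumption)."""
    return [d for d in candidates if predict(model, d["features"]) > threshold]

def expansion_terms(docs, k=2):
    """Pick the k most frequent terms across the selected documents."""
    counts = Counter(t for d in docs for t in d["terms"])
    return [t for t, _ in counts.most_common(k)]

# Toy training data from "past queries" (all values invented for illustration).
# Features: [initial retrieval score, 1.0 if classification codes overlap].
# Label: change in retrieval quality when the document is used as feedback.
X_train = [[0.9, 1.0], [0.8, 0.0], [0.4, 1.0], [0.3, 0.0]]
y_train = [0.10, -0.05, 0.06, -0.08]
model = fit_linear(X_train, y_train)

# Top-ranked documents for a new query (hypothetical).
candidates = [
    {"id": "D1", "features": [0.85, 1.0], "terms": ["rotor", "blade", "turbine"]},
    {"id": "D2", "features": [0.80, 0.0], "terms": ["paint", "pigment", "blade"]},
    {"id": "D3", "features": [0.50, 1.0], "terms": ["rotor", "turbine", "hub"]},
]

selected = select_feedback_docs(model, candidates)
terms = expansion_terms(selected)
print([d["id"] for d in selected], terms)
```

In this toy setup the off-topic document D2 is filtered out despite its high initial score, so its terms do not enter the expanded query, whereas a baseline using all top-ranked documents would include them.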