Building queries for prior-art search

Authors:
Parvaz Mahdabi;Mostafa Keikha;Shima Gerani;Monica Landoni;Fabio Crestani
Affiliations:
Faculty of Informatics, University of Lugano, Lugano, Switzerland;Faculty of Informatics, University of Lugano, Lugano, Switzerland;Faculty of Informatics, University of Lugano, Lugano, Switzerland;Faculty of Informatics, University of Lugano, Lugano, Switzerland;Faculty of Informatics, University of Lugano, Lugano, Switzerland
Venue:
IRFC'11 Proceedings of the Second international conference on Multidisciplinary information retrieval facility
Year:
2011

Citing 16
Cited 4

Foundations of statistical natural language processing

Foundations of statistical natural language processing
Parsimonious language models for information retrieval

Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval
Associative document retrieval by query subtopic analysis and its application to invalidity patent search

Proceedings of the thirteenth ACM international conference on Information and knowledge management
Proposal of two-stage patent retrieval method considering the claim structure

ACM Transactions on Asian Language Information Processing (TALIP)
Introduction to the special issue on patent processing

Information Processing and Management: an International Journal
Enhancing patent retrieval by citation analysis

SIGIR '07 Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
Retrievability: an evaluation measure for higher order information access tasks

Proceedings of the 17th ACM conference on Information and knowledge management
Toward a more rational patent search paradigm

Proceedings of the 1st ACM workshop on Patent information retrieval
Transforming patents into prior-art queries

Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval
A query model based on normalized log-likelihood

Proceedings of the 18th ACM conference on Information and knowledge management
Automatic query generation for patent search

Proceedings of the 18th ACM conference on Information and knowledge management
PRES: a score metric for evaluating recall-oriented information retrieval applications

Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval
Search system requirements of patent analysts

Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval
Examining the robustness of evaluation metrics for patent retrieval with incomplete relevance judgements

CLEF'10 Proceedings of the 2010 international conference on Multilingual and multimodal information access evaluation: cross-language evaluation forum
Improving retrievability of patents in prior-art search

ECIR'2010 Proceedings of the 32nd European conference on Advances in Information Retrieval
Knowledge modeling in prior art search

IRFC'10 Proceedings of the First international Information Retrieval Facility conference on Advances in Multidisciplinary Retrieval

Effective query generation and postprocessing strategies for prior art patent search

Journal of the American Society for Information Science and Technology
Automatic refinement of patent queries using concept importance predictors

SIGIR '12 Proceedings of the 35th international ACM SIGIR conference on Research and development in information retrieval
Learning-Based pseudo-relevance feedback for patent retrieval

IRFC'12 Proceedings of the 5th conference on Multidisciplinary Information Retrieval
A hybrid keyword and patent class methodology for selecting relevant sets of patents for a technological field

Scientometrics

Abstract

Prior-art search is a critical step in the examination procedure of a patent application. This study explores automatic query generation from patent documents to facilitate the time-consuming and labor-intensive search for relevant patents. It is essential for this task to identify discriminative terms in different fields of a query patent, which enables us to distinguish relevant patents from non-relevant patents. To this end we investigate the distribution of terms occurring in different fields of the query patent and compare the distributions with the rest of the collection using language modeling estimation techniques. We experiment with term weighting based on the Kullback-Leibler divergence between the query patent and the collection and also with parsimonious language model estimation. Both of these techniques promote words that are common in the query patent and are rare in the collection. We also incorporate the classification assigned to patent documents into our model, to exploit available human judgements in the form of a hierarchical classification. Experimental results show that the retrieval using the generated queries is effective, particularly in terms of recall, while patent description is shown to be the most useful source for extracting query terms.
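
The Kullback-Leibler term weighting mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names, the whitespace-tokenized inputs, and the additive smoothing floor are all assumptions made for illustration. Each query-patent term is scored by its pointwise contribution to the divergence between the query patent's term distribution and the collection's, so terms that are frequent in the patent but rare in the collection rank highest.

```python
import math
from collections import Counter


def kl_term_weights(query_tokens, collection_tokens, smoothing=1e-9):
    """Score each query term by its pointwise contribution to KL(query || collection).

    Assumes both token lists are non-empty. The smoothing floor is a
    hypothetical choice, not taken from the paper.
    """
    query_counts = Counter(query_tokens)
    collection_counts = Counter(collection_tokens)
    query_total = sum(query_counts.values())
    collection_total = sum(collection_counts.values())

    weights = {}
    for term, count in query_counts.items():
        p_query = count / query_total
        # Small additive floor so terms unseen in the collection do not divide by zero.
        p_collection = collection_counts.get(term, 0) / collection_total + smoothing
        weights[term] = p_query * math.log(p_query / p_collection)
    return weights


def top_query_terms(query_tokens, collection_tokens, k=3):
    """Return the k highest-weighted terms, best first."""
    weights = kl_term_weights(query_tokens, collection_tokens)
    return sorted(weights, key=weights.get, reverse=True)[:k]


# Toy data: "rotor" is frequent in the query patent but rare in the collection.
query_patent = ["rotor", "rotor", "rotor", "the", "the", "blade"]
collection = ["the"] * 50 + ["rotor"] * 2 + ["blade"] * 2 + ["a"] * 46

selected_terms = top_query_terms(query_patent, collection, k=2)
```

In this toy example a common word such as "the" receives a negative weight because it is relatively more frequent in the collection than in the query patent, so it is not selected. A fuller version would compute the weights per patent field (for example abstract, claims, description) as the abstract describes.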