Prior art search using international patent classification codes and all-claims-queries

  • Authors:
  • Benjamin Herbert;György Szarvas;Iryna Gurevych

  • Affiliations:
  • Ubiquitous Knowledge Processing Lab, Computer Science Department, Technische Universität Darmstadt, Darmstadt, Germany;Ubiquitous Knowledge Processing Lab, Computer Science Department, Technische Universität Darmstadt, Darmstadt, Germany;Ubiquitous Knowledge Processing Lab, Computer Science Department, Technische Universität Darmstadt, Darmstadt, Germany

  • Venue:
  • CLEF'09 Proceedings of the 10th cross-language evaluation forum conference on Multilingual information access evaluation: text retrieval experiments
  • Year:
  • 2009

Abstract

In this paper, we describe the system we developed for the Intellectual Property track of the 2009 Cross-Language Evaluation Forum. The track addressed prior art search for patent applications. We used the Lucene library to conduct experiments with the traditional TF-IDF-based ranking approach, indexing both the textual content and the IPC codes assigned to each document. We formulated our queries from the title and claims of a patent application in order to measure the (weighted) lexical overlap between topics and prior art candidates. We also formulated a language-independent query from the IPC codes of a document to improve coverage and to obtain a more accurate ranking of candidates. Although it uses a simple model, our system remained efficient and performed reasonably well: it achieved the 6th best Mean Average Precision score out of 14 participating systems on 500 topics, and the 4th best score out of 9 participants on 10,000 topics.
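
The general idea of combining a title-and-claims text query with a language-independent IPC query can be illustrated with a minimal Python sketch. This is not the authors' Lucene-based system: the function names (`tokens`, `idf_table`, `overlap`, `rank`), the `ipc_weight` parameter, the simplified TF-IDF weighting, and the three-document toy corpus are all assumptions made purely for illustration.

```python
import math
import re
from collections import Counter


def tokens(text):
    # Crude tokenizer (assumption): lowercase ASCII words of 3+ letters.
    return re.findall(r"[a-z]{3,}", text.lower())


def idf_table(bags):
    # Inverse document frequency for every term seen in the collection.
    n = len(bags)
    df = Counter(term for bag in bags for term in set(bag))
    return {term: math.log(1 + n / count) for term, count in df.items()}


def overlap(query_bag, doc_bag, idf):
    # Weighted lexical overlap: shared terms, damped term frequency, squared IDF.
    q, d = Counter(query_bag), Counter(doc_bag)
    return sum(math.sqrt(q[t] * d[t]) * idf[t] ** 2 for t in q if t in d)


def rank(topic, corpus, ipc_weight=1.0):
    # Index text and IPC codes as two separate fields.
    text_bags = {doc_id: tokens(doc["text"]) for doc_id, doc in corpus.items()}
    ipc_bags = {doc_id: doc["ipc"] for doc_id, doc in corpus.items()}
    text_idf = idf_table(list(text_bags.values()))
    ipc_idf = idf_table(list(ipc_bags.values()))

    # Text query built from the title and claims; IPC query from the codes.
    query_text = tokens(topic["title"] + " " + topic["claims"])
    scores = {
        doc_id: overlap(query_text, text_bags[doc_id], text_idf)
        + ipc_weight * overlap(topic["ipc"], ipc_bags[doc_id], ipc_idf)
        for doc_id in corpus
    }
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical toy collection; document C is in German and shares only an IPC code.
corpus = {
    "A": {"text": "A wind turbine blade with a heating element for de-icing", "ipc": ["F03D"]},
    "B": {"text": "A method for brewing coffee using a pressurised chamber", "ipc": ["A47J"]},
    "C": {"text": "Rotorblatt mit Heizung", "ipc": ["F03D"]},
}
topic = {
    "title": "Wind turbine blade de-icing",
    "claims": "A blade comprising a heating element",
    "ipc": ["F03D"],
}

ranking = rank(topic, corpus)
print(ranking)  # ['A', 'C', 'B']
```

In this toy example, document C has no lexical overlap with the English topic but is still ranked above the unrelated document B because it shares an IPC code, which is the kind of coverage gain a language-independent query is meant to provide.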