PATATRAS: retrieval model combination and regression models for prior art search

Authors:
Patrice Lopez;Laurent Romary
Affiliations:
-;Humboldt Universität zu Berlin, Institut für Deutsche Sprache und Linguistik and INRIA
Venue:
CLEF'09 Proceedings of the 10th cross-language evaluation forum conference on Multilingual information access evaluation: text retrieval experiments
Year:
2009



Abstract

This paper presents PATATRAS (PATent and Article Tracking, Retrieval and AnalysiS), a system developed at Humboldt University for the IP track of CLEF 2009. Our approach has three main characteristics: (1) the use of multiple retrieval models and term index definitions for the three languages considered in the track, producing ten different sets of ranked results; (2) the merging of these results with multiple regression models, trained on an additional training set created from the patent collection; (3) the exploitation of patent metadata and citation structures to create restricted initial working sets of patents and to produce a final re-ranking regression model. The resulting architecture allowed us to exploit patent-specific information efficiently while remaining generic and easy to extend.
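
The merging step in the abstract can be illustrated with a small sketch. It assumes a plain linear least-squares regression over the scores that each retrieval run assigns to a candidate patent. The abstract does not state which regression family PATATRAS uses, so the model choice, the function names (`fit_linear`, `predict`, `merge_runs`), the small ridge term, and the toy data are all illustrative assumptions, not the authors' implementation.

```python
from typing import Dict, List, Sequence


def fit_linear(X: Sequence[Sequence[float]], y: Sequence[float],
               ridge: float = 1e-6) -> List[float]:
    """Fit least-squares weights (intercept first) via the normal equations.

    The tiny ridge term is an assumption added here only to keep the
    system solvable when features are collinear.
    """
    rows = [[1.0, *x] for x in X]  # prepend the intercept feature
    n = len(rows[0])
    # A = X^T X + ridge * I,  b = X^T y
    A = [[sum(r[i] * r[j] for r in rows) + (ridge if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    b = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(n)]
    # Gaussian elimination with partial pivoting
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            f = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    # Back substitution
    w = [0.0] * n
    for i in range(n - 1, -1, -1):
        w[i] = (b[i] - sum(A[i][j] * w[j] for j in range(i + 1, n))) / A[i][i]
    return w


def predict(w: Sequence[float], x: Sequence[float]) -> float:
    """Predicted relevance for one candidate's per-run score vector."""
    return w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))


def merge_runs(runs: List[Dict[str, float]], w: Sequence[float]) -> List[str]:
    """Merge ranked runs: score every candidate seen in any run, then sort.

    A candidate missing from a run gets score 0.0 for that run (an assumption).
    """
    docs = set().union(*runs)
    scored = {d: predict(w, [run.get(d, 0.0) for run in runs]) for d in docs}
    return sorted(docs, key=lambda d: (-scored[d], d))


# Toy training set (hypothetical): per-run scores and relevance targets.
train_X = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
train_y = [0.0, 0.5, 0.25, 0.75]
weights = fit_linear(train_X, train_y)

# Two hypothetical runs mapping patent identifiers to retrieval scores.
runs = [{"A": 0.9, "B": 0.2}, {"B": 0.9, "C": 0.8}]
ranking = merge_runs(runs, weights)
print(ranking)
```

In this sketch the learned weights express how much each run is trusted, so a candidate ranked moderately by a reliable run can outrank one ranked highly by a weaker run. A final re-ranking model of the kind the abstract mentions could be sketched the same way by appending metadata or citation features to each score vector.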