UTA and SICS at CLEF-IP'09

Authors:
Antti Järvelin;Anni Järvelin;Preben Hansen
Affiliations:
Department of Information Studies and Interactive Media, University of Tampere, Finland;Swedish Institute of Computer Science, Kista, Sweden and Department of Information Studies and Interactive Media, University of Tampere, Finland;Swedish Institute of Computer Science, Kista, Sweden
Venue:
CLEF'09 Proceedings of the 10th cross-language evaluation forum conference on Multilingual information access evaluation: text retrieval experiments
Year:
2009

Abstract

This paper reports experiments performed in the course of the CLEF'09 Intellectual Property track, where our main goal was to study automatic query generation from patent documents. Two simple word weighting algorithms (a modified RATF formula and tf · idf) for selecting query keys from the patent documents were tested. The use of different parts of the patent documents as sources of query keys was also investigated. Our best runs placed relatively well compared to the other CLEF-IP'09 participants' runs. This suggests that the tested approaches to automatic query generation could be useful and should be developed further. For three topics, the performance of the automatically extracted queries was compared to that of queries produced by three patent experts, to see whether the automatic keyword extraction algorithms are able to extract relevant words from the topics.
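
The idea of selecting query keys by tf · idf weighting can be illustrated with a minimal sketch. This is not the authors' implementation: the function name select_query_keys, the smoothed idf variant, the cutoff k, and the toy document frequencies are all assumptions made for illustration, and the modified RATF formula is not shown because the abstract does not define it.

```python
import math
from collections import Counter


def select_query_keys(topic_tokens, doc_freq, num_docs, k=5):
    """Rank the distinct terms of one topic document by tf * idf and keep the top k.

    topic_tokens: list of (already tokenized) words from the topic patent.
    doc_freq:     mapping term -> number of collection documents containing it.
    num_docs:     total number of documents in the collection.
    """
    tf = Counter(topic_tokens)
    scores = {}
    for term, freq in tf.items():
        df = doc_freq.get(term, 0)
        # Smoothed idf (an assumption, one of several common variants).
        idf = math.log((num_docs + 1) / (df + 1))
        scores[term] = freq * idf
    # Sort by descending score; break ties alphabetically for determinism.
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return ranked[:k]


# Hypothetical toy data: a short claim-like text and made-up document frequencies.
topic = "a rotor blade for a wind turbine the rotor blade comprising a spar".split()
doc_freq = {
    "a": 1000, "the": 1000, "for": 900, "comprising": 700,
    "wind": 80, "blade": 60, "turbine": 50, "rotor": 40, "spar": 5,
}
keys = select_query_keys(topic, doc_freq, num_docs=1000, k=3)
print(keys)
```

In this toy setting, frequent function words receive an idf near zero and drop out, while terms that are repeated in the topic but rare in the collection rise to the top. Restricting topic_tokens to a single part of the patent (for example, only the claims) would be one way to mimic the comparison of different document parts as key sources.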