An empirical study on retrieval models for different document genres: patents and newspaper articles

Authors:
Makoto Iwayama;Atsushi Fujii;Noriko Kando;Yuzo Marukawa
Affiliations:
Hitachi, Ltd., Kokubunji, Japan;University of Tsukuba, Tsukuba, Japan and CREST, Japan Science and Technology Corporation;National Institute of Informatics, Chiyoda-ku, Japan;National Institute of Informatics, Chiyoda-ku, Japan
Venue:
Proceedings of the 26th annual international ACM SIGIR conference on Research and development in information retrieval
Year:
2003

Abstract

Reflecting the rapid growth in the use of large test collections for information retrieval since the 1990s, extensive comparative experiments have been performed to explore the effectiveness of various retrieval models. However, most collections were intended for retrieving newspaper articles and technical abstracts. In this paper, we describe the process of producing a test collection for patent retrieval, the NTCIR-3 Patent Retrieval Collection, which includes two years of Japanese patent applications and 31 topics produced by professional patent searchers. We also report experimental results obtained by using this collection to re-examine the effectiveness of existing retrieval models in the context of patent retrieval. The relative superiority among existing retrieval models did not differ significantly between the two document genres, patents and newspaper articles. Issues related to patent retrieval are also discussed.