Improving retrieval effectiveness by using key terms in top retrieved documents

Authors:
Yang Lingpeng;Ji Donghong;Zhou Guodong;Nie Yu
Affiliations:
Institute for Infocomm Research, Singapore;Institute for Infocomm Research, Singapore;Institute for Infocomm Research, Singapore;Institute for Infocomm Research, Singapore
Venue:
ECIR'05 Proceedings of the 27th European conference on Advances in Information Retrieval Research
Year:
2005

Citing 8
Cited 3

Probabilistic models in information retrieval

The Computer Journal - Special issue on information retrieval
Comparing representations in Chinese information retrieval

Proceedings of the 20th annual international ACM SIGIR conference on Research and development in information retrieval
Improving automatic query expansion

Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval
On the use of words and n-grams for Chinese information retrieval

IRAL '00 Proceedings of the fifth international workshop on on Information retrieval with Asian languages
Improving retrieval feedback with multiple term-ranking function combination

ACM Transactions on Information Systems (TOIS)
Introduction to Modern Information Retrieval

Introduction to Modern Information Retrieval
Query Expansion with Long-Span Collocates

Information Retrieval
Document re-ranking based on automatically acquired key terms in Chinese information retrieval

COLING '04 Proceedings of the 20th international conference on Computational Linguistics

Chinese information retrieval based on terms and relevant terms

ACM Transactions on Asian Language Information Processing (TALIP)
An expansion and reranking approach for annotation-based image retrieval from Web

Expert Systems with Applications: An International Journal
Chinese document re-ranking based on term distribution and maximal marginal relevance

AIRS'05 Proceedings of the Second Asia conference on Asia Information Retrieval Technology

Quantified Score

Hi-index	0.00

Visualization

Abstract

In this paper, we propose a method to improve the precision of top retrieved documents in Chinese information retrieval where the query is a short description by re-ordering retrieved documents in the initial retrieval. To re-order the documents, we firstly find out terms in query and their importance scales by making use of the information derived from top N(NK(NK) documents by what kinds of terms of query they contain. That is, we first automatically extract key terms from top N retrieved documents, then we collect key terms that occur in query and their document frequencies in the N retrieved documents, finally we use these collected terms to re-order the initially retrieved documents. Each collected term is assigned a weight by its length and its document frequency in top N retrieved documents. Each document is re-ranked by the sum of weights of collected terms it contains. In our experiments on 42 query topics in NTCIR3 Cross Lingual Information Retrieval (CLIR) dataset, an average 17.8%-27.5% improvement can be made for top 10 documents and an average 6.6%-26.9% improvement can be made for top 100 documents at relax/rigid relevance judgment and different parameter setting.