Document reranking by term distribution and maximal marginal relevance for Chinese information retrieval

Authors:
Lingpeng Yang;Donghong Ji;Munkew Leong
Affiliations:
Institute for Infocomm Research, Media Understanding, Singapore, Singapore;Institute for Infocomm Research, Media Understanding, Singapore, Singapore;Institute for Infocomm Research, Media Understanding, Singapore, Singapore
Venue:
Information Processing and Management: an International Journal - Special issue: AIRS2005: Information retrieval research in Asia
Year:
2007


Abstract

In this paper, we propose a document reranking method for Chinese information retrieval. The method is based on a term weighting scheme that integrates the local and global distribution of terms as well as document frequency, document positions, and term length. The weighting scheme allows a larger portion of the retrieved documents to be randomly set as relevance feedback, and removes the worry that very few relevant documents appear among the top retrieved documents. It also helps to improve the performance of maximal marginal relevance (MMR) in document reranking. The method was evaluated by MAP (mean average precision), a recall-oriented measure. Significance tests showed that our method achieves significant improvement over standard baselines and consistently outperforms related methods.
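
For readers unfamiliar with MMR, the following is a minimal sketch of the generic greedy MMR selection rule, not the authors' implementation. It assumes documents and the query are already represented as dense term-weight vectors and uses cosine similarity for both relevance and redundancy; the paper's own term weighting scheme is not reproduced here. The names `cosine`, `mmr_rerank`, `lam`, and `k` are hypothetical and chosen for illustration only.

```python
import math
from typing import Dict, List, Optional, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two dense vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def mmr_rerank(
    query_vec: Sequence[float],
    doc_vecs: Dict[str, Sequence[float]],
    lam: float = 0.7,
    k: Optional[int] = None,
) -> List[str]:
    """Greedy MMR: repeatedly pick the document maximizing
    lam * sim(doc, query) - (1 - lam) * max sim(doc, already selected).

    Assumption: cosine similarity stands in for both similarity functions.
    """
    remaining = list(doc_vecs)
    selected: List[str] = []
    limit = len(remaining) if k is None else min(k, len(remaining))
    # Relevance to the query does not change between iterations.
    relevance = {d: cosine(query_vec, doc_vecs[d]) for d in remaining}

    while len(selected) < limit:
        def mmr_score(d: str) -> float:
            # Redundancy is 0.0 while nothing has been selected yet.
            redundancy = max(
                (cosine(doc_vecs[d], doc_vecs[s]) for s in selected),
                default=0.0,
            )
            return lam * relevance[d] - (1.0 - lam) * redundancy

        best = max(remaining, key=mmr_score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

With `lam = 1.0` the procedure reduces to ranking by relevance alone; lowering `lam` pushes near-duplicate documents down the list in favor of documents that differ from those already selected.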