Chinese information retrieval based on terms and relevant terms

  • Authors:
  • Yang Lingpeng;Ji Donghong;Tang Li;Niu Zhengyu

  • Affiliations:
  • Institute for Infocomm Research, Singapore;Institute for Infocomm Research, Singapore;Institute for Infocomm Research, Singapore;Institute for Infocomm Research, Singapore

  • Venue:
  • ACM Transactions on Asian Language Information Processing (TALIP)
  • Year:
  • 2005

Quantified Score

Hi-index 0.00

Visualization

Abstract

In this article we describe our approach to Chinese information retrieval, where a query is a short natural language description. First, we use automatically extracted short terms from document sets to build indexes and use the short terms in both the query and documents to do initial retrieval. Next, we use long terms extracted from the document collection to reorder the top N retrieved documents to improve precision. Finally, we acquire the relevant terms of the short terms from the Internet and the top retrieved documents and use them to do query expansion. Experiments on the NTCIR-4 CLIR Chinese SLIR sub-collection show that document reranking can both improve the retrieval performance on its own and make a significant contribution to query expansion. The experiments also show that the extended query expansion proposed in this article is more effective than the standard Rocchio query expansion.