Improving information retrieval system performance by combining different text-mining techniques

  • Authors:
  • Rila Mandala; Takenobu Tokunaga; Hozumi Tanaka

  • Affiliations:
  • Department of Informatics, Institute of Technology Bandung, Jalan Ganesha 10, Bandung 40132, Indonesia. E-mail: rila@if.itb.ac.id; Department of Computer Science, Tokyo Institute of Technology, 2-12-1 Oookayama Meguro-Ku, Tokyo 152-8554, Japan. E-mail: {take, tanaka}@cl.cs.titech.ac.jp; Department of Computer Science, Tokyo Institute of Technology, 2-12-1 Oookayama Meguro-Ku, Tokyo 152-8554, Japan. E-mail: {take, tanaka}@cl.cs.titech.ac.jp

  • Venue:
  • Intelligent Data Analysis
  • Year:
  • 2000

Abstract

WordNet, a hand-made, general-purpose, machine-readable thesaurus, has been used in information retrieval research by many researchers, but it has failed to improve the performance of their retrieval systems. In this paper we therefore investigate why the use of WordNet has not been successful. Based on this analysis, we propose a method of making WordNet more useful in information retrieval applications by combining it with other knowledge resources. A simple word sense disambiguation step is performed to avoid misleading expansion terms. Experiments using several standard information retrieval test collections show that our method results in a significant improvement in information retrieval performance. Failure analysis was performed on the cases in which the proposed method fails to improve retrieval effectiveness. We found that queries containing negative statements and multiple aspects might cause problems for the proposed method, and we also investigated solutions to these problems.