Using information gain to improve multi-modal information retrieval systems

  • Authors:
  • M. T. Martín-Valdivia; M. C. Díaz-Galiano; A. Montejo-Raez; L. A. Ureña-López

  • Affiliations:
  • Departamento de Informática, Campus Las Lagunillas, s/n., University of Jaén, Jaén E-23071, Spain (all authors)

  • Venue:
  • Information Processing and Management: an International Journal
  • Year:
  • 2008

Abstract

Access to information now requires managing multimedia databases effectively, so multi-modal retrieval techniques (particularly image retrieval) have become an active research direction. In the past few years, many content-based image retrieval (CBIR) systems have been developed. However, despite the progress achieved in CBIR, the retrieval accuracy of current systems is still limited and often worse than that of text-only information retrieval systems. In this paper, we propose combining content-based and text-based approaches to multi-modal retrieval in order to achieve better results and overcome the shortcomings of each technique when used separately. For this purpose, we use a medical collection that includes both images and unstructured text. We retrieve images with a CBIR system and textual information with a traditional information retrieval system. Then, we combine the results obtained from both systems in order to improve the final performance. Furthermore, we use the information gain (IG) measure to reduce and improve the textual information included in multi-modal information retrieval systems. We have carried out several experiments that combine this reduction technique with the merging of visual and textual information. The results obtained are highly promising and show the benefit of managing textual information to improve conventional multi-modal systems.
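
The two ideas named in the abstract, scoring textual features by information gain and merging textual with visual retrieval results, can be illustrated with a minimal Python sketch. The abstract does not say how IG is computed, over which units, or how the two result lists are merged, so everything here is an assumption made for illustration: IG is taken as the standard entropy reduction for a binary "term occurs in document" feature against hypothetical class labels, and the merge is a simple weighted linear combination of per-image scores. The function names, the `text_weight` parameter, and the toy data are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(docs, labels, term):
    """Entropy reduction obtained by splitting documents on term presence.

    docs is a list of token sets, labels is a parallel list of class labels.
    Assumption: IG is measured against class labels; the paper may use a
    different formulation.
    """
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without_term = [y for d, y in zip(docs, labels) if term not in d]
    n = len(labels)
    conditional = (
        (len(with_term) / n) * entropy(with_term)
        + (len(without_term) / n) * entropy(without_term)
    )
    return entropy(labels) - conditional


def merge_scores(text_scores, visual_scores, text_weight=0.5):
    """Weighted linear fusion of per-image scores (hypothetical scheme).

    An image missing from one result list contributes a score of 0 for
    that modality. Returns (image_id, score) pairs, best first.
    """
    ids = set(text_scores) | set(visual_scores)
    merged = {
        i: text_weight * text_scores.get(i, 0.0)
        + (1.0 - text_weight) * visual_scores.get(i, 0.0)
        for i in ids
    }
    return sorted(merged.items(), key=lambda kv: kv[1], reverse=True)


# Toy data, invented for this example only.
docs = [
    {"fracture", "xray"},
    {"fracture", "femur"},
    {"tumor", "mri"},
    {"tumor", "brain"},
]
labels = ["a", "a", "b", "b"]

# Rank terms by IG; low-IG terms are candidates for removal.
vocabulary = sorted(set().union(*docs))
ranked_terms = sorted(
    vocabulary, key=lambda t: information_gain(docs, labels, t), reverse=True
)

text_scores = {"img1": 0.9, "img2": 0.2}
visual_scores = {"img2": 0.8, "img3": 0.4}
ranking = merge_scores(text_scores, visual_scores, text_weight=0.5)
```

In this sketch, terms that perfectly separate the two toy classes receive the highest IG, and discarding the lowest-ranked terms is one plausible reading of "reducing" the textual information. The weight given to each modality is a free parameter here and would need to be tuned on held-out queries.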