Intelligent Indexing and Semantic Retrieval of Multimodal Documents

  • Authors:
  • Rohini K. Srihari, Zhongfei Zhang, Aibing Rao

  • Affiliations:
  • Center of Excellence for Document Analysis and Recognition (CEDAR), UB Commons, 520 Lee Entrance, Suite 202, State University of New York at Buffalo, Buffalo, NY 14228-2583, USA. rohini@cedar.buffalo.edu, zhongfei@cedar.buffalo.edu, arao@cedar.buffalo.edu

  • Venue:
  • Information Retrieval
  • Year:
  • 2000

Abstract

Finding useful information in large multimodal document collections such as the WWW without encountering numerous false positives poses a challenge to multimedia information retrieval (MMIR) systems. This research addresses the problem of finding pictures. It exploits the fact that images do not appear in isolation, but rather with accompanying, collateral text. Taken independently, existing techniques for picture retrieval using (i) text-based and (ii) image-based methods have several limitations. This research presents a general model for multimodal information retrieval that addresses the following issues: (i) users' information need, (ii) expressing information need through composite, multimodal queries, and (iii) determining the most appropriate weighted combination of indexing techniques in order to best satisfy information need. A machine learning approach is proposed for the latter. The focus is on improving precision and recall in an MMIR system by optimally combining text and image similarity. Experiments are presented that demonstrate the utility of individual indexing systems in improving overall average precision.
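
The abstract's central technical step, learning a weighted combination of text and image similarity, can be illustrated with a small sketch. The code below is not the authors' implementation; the linear combination, the grid search over a single weight, and the toy data are illustrative assumptions showing how such a weight could be tuned to maximize average precision over a set of labeled queries.

    import numpy as np

    def average_precision(relevant, scores):
        """Average precision of the ranking induced by descending scores.
        `relevant` is a boolean array; `scores` are combined similarities."""
        order = np.argsort(-scores)
        rel = relevant[order]
        hits = np.cumsum(rel)
        ranks = np.arange(1, len(rel) + 1)
        precisions = hits / ranks
        return precisions[rel].mean() if rel.any() else 0.0

    def combined_scores(text_sim, image_sim, w):
        """Weighted combination of per-document text and image similarity."""
        return w * text_sim + (1.0 - w) * image_sim

    def learn_weight(queries, grid=np.linspace(0.0, 1.0, 101)):
        """Pick the weight w maximizing mean average precision over
        `queries`, a list of (text_sim, image_sim, relevant) triples."""
        def mean_ap(w):
            return np.mean([
                average_precision(rel, combined_scores(t, i, w))
                for t, i, rel in queries
            ])
        return max(grid, key=mean_ap)

    # Toy usage: two hypothetical queries over a five-document collection.
    rng = np.random.default_rng(0)
    queries = [
        (rng.random(5), rng.random(5), rng.random(5) > 0.5)
        for _ in range(2)
    ]
    w = learn_weight(queries)
    print(f"learned text weight: {w:.2f}")

A single scalar weight is the simplest instance of the weighted-combination idea; the paper's model generalizes this to a weighted combination over multiple indexing techniques, with the weights chosen to best satisfy the user's information need.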