Exploring text and image features to classify images in bioscience literature

  • Authors:
  • Barry Rafkind; Minsuk Lee; Shih-Fu Chang; Hong Yu

  • Affiliations:
  • Columbia University, New York, NY; University of Wisconsin-Milwaukee, Milwaukee, WI; Columbia University, New York, NY; University of Wisconsin-Milwaukee, Milwaukee, WI

  • Venue:
  • BioNLP '06 Proceedings of the Workshop on Linking Natural Language Processing and Biology: Towards Deeper Biological Literature Analysis

  • Year:
  • 2006

Abstract

A picture is worth a thousand words. Biomedical researchers tend to incorporate a significant number of images (i.e., figures or tables) in their publications to report experimental results, to present research models, and to display examples of biomedical objects. Unfortunately, this wealth of information remains virtually inaccessible without automatic systems to organize these images. We explored supervised machine-learning systems using Support Vector Machines to automatically classify images into six representative categories using text features, image features, and a fusion of both. In our experiments, the fusion classifier achieved a significantly higher average F-score (73.66%) than classifiers based on image features alone (50.74%) or text features alone (68.54%).
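
To make the "average F-score" figures above concrete, here is a minimal sketch of how a per-category F-score can be computed and then averaged over categories. It assumes an unweighted (macro) average, which the abstract does not specify; the function names and the category labels ("gel", "graph", "table") are hypothetical and are not taken from the paper.

```python
from collections import Counter


def per_class_f1(gold, predicted):
    """Return {label: F1} from parallel lists of gold and predicted labels."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1          # correct prediction for class g
        else:
            fp[p] += 1          # p was predicted but is wrong
            fn[g] += 1          # g was the true class but was missed
    scores = {}
    for label in set(gold) | set(predicted):
        pred_total = tp[label] + fp[label]
        gold_total = tp[label] + fn[label]
        precision = tp[label] / pred_total if pred_total else 0.0
        recall = tp[label] / gold_total if gold_total else 0.0
        denom = precision + recall
        # F1 is the harmonic mean of precision and recall.
        scores[label] = 2 * precision * recall / denom if denom else 0.0
    return scores


def macro_f1(gold, predicted):
    """Unweighted mean of per-class F1 (an assumed averaging scheme)."""
    scores = per_class_f1(gold, predicted)
    return sum(scores.values()) / len(scores)


# Hypothetical labels, for illustration only.
gold = ["gel", "gel", "graph", "table"]
pred = ["gel", "graph", "graph", "table"]
print(per_class_f1(gold, pred))
print(macro_f1(gold, pred))
```

Under a macro average every category counts equally regardless of how many images it contains, so a classifier that does poorly on a rare category is penalized as much as one that does poorly on a common category.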