FigSearch: a figure legend indexing and classification system

  • Authors:
  • Fang Liu; Tor-Kristian Jenssen; Vegard Nygaard; John Sack; Eivind Hovig

  • Affiliations:
  • Department of Tumor Biology, Institute for Cancer Research, The Norwegian Radium Hospital, Montebello, 0310 Oslo, Norway; PubGene AS, Forskningsveien 2A, P.O. Box 180 Vinderen, N-0319 Oslo, Norway; Department of Tumor Biology, Institute for Cancer Research, The Norwegian Radium Hospital, Montebello, 0310 Oslo, Norway; Stanford University, HighWire Press, 1454 Page Mill Road, Palo Alto, CA 94304, USA; Department of Tumor Biology, Institute for Cancer Research, The Norwegian Radium Hospital, Montebello, 0310 Oslo, Norway

  • Venue:
  • Bioinformatics
  • Year:
  • 2004

Abstract

Summary: FigSearch is a prototype text-mining and classification system for figures from any corpus of full-text biological papers. Users can search for figures that contain genes of interest and illustrate protein interactions. Retrieved figures are ranked by a score representing the likelihood that a figure is of a given type, in this case a schematic illustration of protein interactions and signaling events. The system comprises a Web interface for search, a module that classifies figures based on vector representations of their legends, and a module that indexes gene names. In a preliminary validation, domain experts judged that FigSearch performed satisfactorily in providing the most relevant graphical representations. The strategy may be readily extended to other figure types. Moreover, as more full-text data become available, such a system will become increasingly useful for identifying and presenting compressed biological knowledge.

Availability: A searchable Web interface, FigSearch, is accessible at http://pubgeneserver.uio.no/figsearch/ for all figures from the available corpus.
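The two modules named in the summary, legend-based classification and gene-name indexing, can be illustrated with a minimal sketch. The abstract does not specify the classifier, the vector weighting, or the gene-name matching used by FigSearch, so everything below is an assumption made for illustration: legends become plain term-frequency vectors, the score is the cosine similarity to a prototype built from example legends of the target figure type, and gene names are matched as exact tokens. All function names and the sample legends are hypothetical.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase a legend and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def vectorize(legend):
    """Represent a legend as a sparse term-frequency vector."""
    return Counter(tokenize(legend))


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def prototype_vector(example_legends):
    """Sum the vectors of legends known to be of the target figure type."""
    total = Counter()
    for legend in example_legends:
        total.update(vectorize(legend))
    return total


def build_gene_index(figures, gene_names):
    """Map each gene name to the set of figure ids whose legend mentions it."""
    index = defaultdict(set)
    for fig_id, legend in figures.items():
        tokens = set(tokenize(legend))
        for gene in gene_names:
            if gene.lower() in tokens:
                index[gene].add(fig_id)
    return index


def search(gene, figures, index, prototype):
    """Return figure ids mentioning the gene, best-scoring first."""
    hits = index.get(gene, set())
    scored = [(cosine(vectorize(figures[f]), prototype), f) for f in hits]
    return [fig_id for _, fig_id in sorted(scored, reverse=True)]


# Hypothetical legends, invented for this sketch (not from the FigSearch corpus).
figures = {
    "fig1": "Schematic model of the signaling pathway in which TP53 interacts with MDM2.",
    "fig2": "Western blot analysis of TP53 expression in tumor samples.",
    "fig3": "Schematic model of MAPK1 signaling.",
}
prototype = prototype_vector(
    ["Schematic model of a signaling pathway showing protein interactions."]
)
index = build_gene_index(figures, ["TP53", "MDM2", "MAPK1"])
ranked = search("TP53", figures, index, prototype)
print(ranked)
```

In this toy corpus, a query for TP53 retrieves both figures that mention the gene and places the schematic pathway figure ahead of the Western blot, which mirrors the ranking behavior described in the summary. A real system would need a trained classifier and more robust gene-name recognition (synonyms, ambiguous symbols) than exact token matching.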