Automatic Extraction of Data from 2-D Plots in Documents

  • Authors:
  • X. Lu;J. Wang;P. Mitra;C. L. Giles

  • Affiliations:
  • The Pennsylvania State University, University Park, PA, USA;The Pennsylvania State University, University Park, PA, USA;The Pennsylvania State University, University Park, PA, USA;The Pennsylvania State University, University Park, PA, USA

  • Venue:
  • ICDAR '07 Proceedings of the Ninth International Conference on Document Analysis and Recognition - Volume 01
  • Year:
  • 2007

Quantified Score

Hi-index 0.05

Visualization

Abstract

Two-dimensional (2-D) plots in digital documents contain important information. Often, the results of scientific experiments and performance of businesses are summarized using plots. Although 2-D plots are easily understood by human users, current search engines rarely utilize the information contained in the plots to enhance the results returned in response to queries posed by end- users. We propose an automated algorithm for extracting information from line curves in 2-D plots. The extracted information can be stored in a database and indexed to answer end-user queries and enhance search results. We have collected 2-D plot images from a variety of resources and tested our extraction algorithms. Experimental evaluation has demonstrated that our method can produce results suitable for real world use.