Functional-Based Table Category Identification in Digital Library

  • Authors:
  • Seongchan Kim;Ying Liu

  • Affiliations:
  • -;-

  • Venue:
  • ICDAR '11 Proceedings of the 2011 International Conference on Document Analysis and Recognition
  • Year:
  • 2011

Abstract

A better understanding of a document's logical components is crucial to many applications, e.g., document classification and data integration. With the development of digital libraries, more people are recognizing the importance of scientific tables, which present valuable information concisely. Although much previous work on tables focuses on table data extraction, few concrete studies address understanding and using the extracted table data. Based on a large-scale quantitative study of scientific papers, we believe that identifying the table author's original purpose can improve comprehension of table data and facilitate its reuse. In this paper, tables in scientific documents are classified into three topical categories (background, system/method, and experimental) and two functional categories (commentary and comparison). We apply machine-learning-based methods to the table classification task. Our results demonstrate that the proposed features are effective for classification and that our proposed method significantly outperforms the rule-based baseline.
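
To make the label scheme concrete, the following is a minimal sketch of the two-axis taxonomy described in the abstract (three topical categories and two functional categories), paired with a toy keyword rule over table captions. The abstract does not list the paper's features or the rules of its baseline, so everything beyond the category names is an assumption made for illustration: the cue words, the `classify_caption` function, and the choice to look only at captions are all hypothetical.

```python
import re
from dataclasses import dataclass
from enum import Enum


class Topical(Enum):
    BACKGROUND = "background"
    SYSTEM_METHOD = "system/method"
    EXPERIMENTAL = "experimental"


class Functional(Enum):
    COMMENTARY = "commentary"
    COMPARISON = "comparison"


@dataclass(frozen=True)
class TableLabel:
    """One label per axis: every table gets a topical and a functional category."""
    topical: Topical
    functional: Functional


# Hypothetical cue words, chosen for illustration only (not taken from the paper).
TOPICAL_CUES = {
    Topical.EXPERIMENTAL: {"result", "results", "accuracy", "performance", "evaluation"},
    Topical.SYSTEM_METHOD: {"parameter", "parameters", "algorithm", "architecture", "notation"},
}
COMPARISON_CUES = {"comparison", "compared", "versus", "vs"}


def classify_caption(caption: str) -> TableLabel:
    """Toy rule: assign both labels from whole-word matches in the caption."""
    words = set(re.findall(r"[a-z]+", caption.lower()))

    # Fall back to BACKGROUND when no topical cue word is present.
    topical = Topical.BACKGROUND
    for label, cues in TOPICAL_CUES.items():
        if words & cues:
            topical = label
            break

    # Fall back to COMMENTARY when no comparison cue word is present.
    functional = (
        Functional.COMPARISON if words & COMPARISON_CUES else Functional.COMMENTARY
    )
    return TableLabel(topical, functional)


if __name__ == "__main__":
    print(classify_caption("Table 3: Comparison of accuracy with baseline methods"))
```

A learned classifier would replace the keyword lookups with features extracted from each table and its context, while the `TableLabel` structure of one topical and one functional category per table would stay the same.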