Learning to recognize tables in free text

  • Authors:
  • Hwee Tou Ng;Chung Yong Lim;Jessica Li Teng Koo

  • Affiliations:
  • DSO National Laboratories, Singapore;DSO National Laboratories, Singapore;DSO National Laboratories, Singapore

  • Venue:
  • ACL '99 Proceedings of the 37th annual meeting of the Association for Computational Linguistics on Computational Linguistics
  • Year:
  • 1999

Quantified Score

Hi-index 0.00

Visualization

Abstract

Many real-world texts contain tables. In order to process these texts correctly and extract the information contained within the tables, it is important to identify the presence and structure of tables. In this paper, we present a new approach that learns to recognize tables in free text, including the boundary, rows and columns of tables. When tested on Wall Street Journal news documents, our learning approach outperforms a deterministic table recognition algorithm that identifies table recognition algorithm that identifies tables based on a fixed set of conditions. Our learning approach is also more flexible and easily adaptable to texts in different domains with different table characteristics.