Visually guided bottom-up table detection and segmentation in web documents

  • Authors:
  • Bernhard Krüpl; Marcus Herzog

  • Affiliations:
  • Vienna University of Technology; Vienna University of Technology

  • Venue:
  • Proceedings of the 15th International Conference on World Wide Web
  • Year:
  • 2006

Abstract

In the AllRight project, we are developing an algorithm for unsupervised table detection and segmentation that uses the visual rendition of a Web page rather than its HTML code. The algorithm works bottom-up, grouping word bounding boxes into larger groups and applying a set of heuristics. It has been implemented, and a preliminary evaluation on about 6000 Web documents has been carried out.
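
To make the bottom-up idea concrete, the following is a minimal sketch of one possible grouping pass. It is not the AllRight algorithm: the abstract does not specify the heuristics, so the `Box` type, the `group_words` function, the `max_gap` threshold, and the coordinates are all hypothetical and chosen for illustration. The sketch assumes words on the same text line share the same top coordinate, and merges horizontally adjacent word boxes into larger boxes that could serve as candidate table cells.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box in page coordinates (hypothetical type)."""
    x0: float
    y0: float
    x1: float
    y1: float


def vertical_overlap(a: Box, b: Box) -> float:
    """Height of the shared vertical extent; zero or negative if none."""
    return min(a.y1, b.y1) - max(a.y0, b.y0)


def merge(a: Box, b: Box) -> Box:
    """Smallest box that encloses both inputs."""
    return Box(min(a.x0, b.x0), min(a.y0, b.y0),
               max(a.x1, b.x1), max(a.y1, b.y1))


def group_words(words, max_gap=8.0):
    """Greedy bottom-up pass over word boxes.

    Words are visited top to bottom, then left to right. A word is merged
    into the previous group if both share a text line and the horizontal
    gap between them is at most max_gap (an assumed threshold).
    """
    groups = []
    for word in sorted(words, key=lambda b: (b.y0, b.x0)):
        last = groups[-1] if groups else None
        same_line = last is not None and vertical_overlap(last, word) > 0
        if same_line and 0 <= word.x0 - last.x1 <= max_gap:
            groups[-1] = merge(last, word)
        else:
            groups.append(word)
    return groups


# Two rows of a made-up two-column layout; the first cell has two words.
words = [
    Box(0, 0, 30, 10), Box(34, 0, 60, 10), Box(120, 0, 150, 10),
    Box(0, 14, 40, 24), Box(120, 14, 140, 24),
]
cells = group_words(words)
print(len(cells))  # 4
```

A wide horizontal gap keeps neighbouring words in separate groups, which is the kind of visual cue a table detector could then test for column alignment across rows. A real system would need further heuristics, for example for tolerance in line detection and for multi-line cells.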