Structuring documents according to their table of contents

  • Authors:
  • Hervé Déjean; Jean-Luc Meunier

  • Affiliations:
  • Xerox Research Centre Europe, Meylan, France; Xerox Research Centre Europe, Meylan, France

  • Venue:
  • Proceedings of the 2005 ACM symposium on Document engineering
  • Year:
  • 2005

Abstract

In this paper, we present a method for structuring a document according to the information present in its Table of Contents (ToC). Both the detection of the ToC and the determination of the parts of the document body it refers to rely on a series of generic properties characterizing any ToC, while its hierarchization is achieved using clustering techniques. We also report on the robustness and performance of the method before discussing it in light of related work.