Automated repurposing of implicitly structured documents

  • Authors:
  • Helen Balinsky; Anthony Wiley; Michael Rhodes; Alfie Abdul-Rahman

  • Affiliations:
  • Hewlett-Packard Labs, Bristol, United Kingdom (all authors)

  • Venue:
  • Proceedings of the Eighth ACM Symposium on Document Engineering
  • Year:
  • 2008

Abstract

The visual cues present in a document, such as spatial intervals and positions and contrasts in font family, size and weight, combine to form the document's visual hierarchy. This hierarchy is essential to the reader, who relies on it for scanning and comprehension, yet it is often ignored by machine processing. At the same time, the document structure is often not available in machine-readable form because of the way the document was originally created or later transformed. This paper addresses the challenge of automatic document repurposing: applying the styling and formatting of one 'implicitly' structured document to another whilst preserving the underlying visual hierarchy. Using visual perception analysis, a proportionality mapping is established, according to which the original document content is transformed into the new style without breaking its original hierarchical structure. Spatial relationships, location and frequency analysis are then used to fine-tune the transformation.
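
As an illustration only, the idea of a proportionality mapping can be sketched for a single visual cue, font size. The abstract does not give the paper's actual formula, so the linear rescaling below, the function name proportional_map and the parameters target_min and target_max are all assumptions made for this sketch. It shows one simple way to move the distinct sizes of a source document into the size range of a target style while keeping their relative order, and therefore the hierarchy they imply.

```python
def proportional_map(source_sizes, target_min, target_max):
    """Map each distinct source font size linearly onto [target_min, target_max].

    Hypothetical helper: a linear rescaling is assumed here for illustration;
    it is not taken from the paper.
    """
    if not source_sizes:
        raise ValueError("source_sizes must not be empty")

    # Distinct sizes, smallest first; each one stands for a hierarchy level.
    levels = sorted(set(source_sizes))
    lo, hi = levels[0], levels[-1]

    if hi == lo:
        # Only one level: there is no contrast to preserve.
        return {lo: float(target_min)}

    scale = (target_max - target_min) / (hi - lo)
    # A strictly increasing map, so the ordering of levels is unchanged.
    return {size: target_min + (size - lo) * scale for size in levels}


# Body text at 10pt, subheadings at 12pt, title at 18pt,
# restyled into a target range of 9pt to 27pt.
mapping = proportional_map([10, 10, 12, 18], target_min=9, target_max=27)
```

Because the map is strictly increasing whenever target_max exceeds target_min, a larger source size always receives a larger target size. A fuller treatment would have to combine several cues (spacing, weight, position) and apply the fine-tuning steps the abstract mentions, which this sketch does not attempt.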