Automatic Structuring of Written Texts

  • Authors:
  • Marek Veber;Ales Horák;Rostislav Julinek;Pavel Smrz

  • Venue:
  • TSD '99 Proceedings of the Second International Workshop on Text, Speech and Dialogue
  • Year:
  • 1999

Abstract

This paper deals with automatic structuring and sentence boundary labelling in natural language texts. We describe the implemented structure tagging algorithm and the heuristic rules that are used for automatic or semiautomatic labelling. Within each detected sentence, the algorithm performs a decomposition into clauses. It then marks the parts of the text that do not form a sentence, i.e. headings, signatures, tables and other structured data. We also pay attention to the processing of matched symbols in the text, especially to the analysis of direct speech notation.
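
The abstract does not give the rules themselves, so the following is only a minimal sketch of what heuristic sentence boundary labelling and matched-symbol checking might look like, not the authors' algorithm. The abbreviation list, the boundary pattern, the symbol pairs and the function names `split_sentences` and `unmatched_symbols` are all assumptions made for illustration.

```python
import re

# Hypothetical abbreviation list; the paper's actual rules are not stated in the abstract.
ABBREVIATIONS = {"e.g.", "i.e.", "dr.", "mr.", "mrs.", "prof.", "etc."}

# Assumed boundary cue: terminal punctuation followed by whitespace and
# an uppercase letter, an opening quote or an opening parenthesis.
BOUNDARY = re.compile(r"[.!?]+(?=\s+[A-Z“(])")

# Assumed symbol pairs with distinct opening and closing characters.
PAIRS = {"(": ")", "[": "]", "“": "”"}


def split_sentences(text):
    """Split text at candidate boundaries, skipping known abbreviations."""
    sentences = []
    start = 0
    for match in BOUNDARY.finditer(text):
        end = match.end()
        # Last whitespace-delimited token ending at the punctuation mark.
        token = text[start:end].split()[-1].lower()
        if token in ABBREVIATIONS:
            continue  # e.g. "Dr." does not end a sentence
        sentences.append(text[start:end].strip())
        start = end
    tail = text[start:].strip()
    if tail:
        sentences.append(tail)
    return sentences


def unmatched_symbols(text):
    """Return sorted (index, char) pairs for symbols that have no partner."""
    closers = {close: open_ for open_, close in PAIRS.items()}
    stack = []      # currently open symbols as (index, char)
    problems = []   # closing symbols that match no open symbol
    for i, ch in enumerate(text):
        if ch in PAIRS:
            stack.append((i, ch))
        elif ch in closers:
            if stack and stack[-1][1] == closers[ch]:
                stack.pop()
            else:
                problems.append((i, ch))
    # Anything left on the stack was opened but never closed.
    return sorted(problems + stack)


if __name__ == "__main__":
    print(split_sentences("Dr. Smith arrived. He sat down. Was it late?"))
    print(unmatched_symbols("He said (quietly “hello”."))
```

A stack is used for the matched symbols so that nested pairs, such as direct speech inside parentheses, are checked in the right order. A semiautomatic workflow could present the positions returned by `unmatched_symbols` to a human annotator instead of guessing a repair.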