Discourse segmentation of German written texts

  • Authors:
  • Harald Lüngen;Csilla Puskás;Maja Bärenfänger;Mirco Hilbert;Henning Lobin

  • Affiliations:
  • FB 05 – Applied and Computational Linguistics, Justus-Liebig-Universität Gießen;FB 05 – Applied and Computational Linguistics, Justus-Liebig-Universität Gießen;FB 05 – Applied and Computational Linguistics, Justus-Liebig-Universität Gießen;FB 05 – Applied and Computational Linguistics, Justus-Liebig-Universität Gießen;FB 05 – Applied and Computational Linguistics, Justus-Liebig-Universität Gießen

  • Venue:
  • FinTAL'06 Proceedings of the 5th international conference on Advances in Natural Language Processing
  • Year:
  • 2006

Abstract

Discourse segmentation is the division of a text into minimal discourse segments, which form the leaves of the trees used to represent discourse structures. A definition of elementary discourse segments in German is provided by adapting widely used segmentation principles for English minimal units, while considering punctuation, morphology, syntax, and aspects of the logical document structure of a complex text type, namely scientific articles. The algorithm and implementation of a discourse segmenter based on these principles are presented, as well as an evaluation of test runs.
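
To make the idea of punctuation-driven segmentation concrete, the following is a minimal illustrative sketch in Python. It is not the authors' algorithm: the function name `segment`, the tiny list `SUBORDINATORS`, and the two regular expressions are assumptions made for this example only. It uses surface punctuation and a handful of German subordinating conjunctions, and ignores morphology, syntax, and logical document structure, all of which the abstract names as inputs to the actual segmenter.

```python
import re

# Hypothetical, deliberately tiny list of German subordinating conjunctions.
# This is an assumption for illustration, not a list taken from the paper.
SUBORDINATORS = ("weil", "obwohl", "während", "nachdem", "wenn", "da")

# Sentence boundary: whitespace that follows sentence-final punctuation.
SENTENCE_END = re.compile(r"(?<=[.!?])\s+")

# Clause boundary: a comma directly followed by one of the conjunctions above.
CLAUSE_START = re.compile(
    r",\s+(?=(?:%s)\b)" % "|".join(SUBORDINATORS), re.IGNORECASE
)


def segment(text):
    """Split text into candidate segments using punctuation cues only."""
    segments = []
    for sentence in SENTENCE_END.split(text.strip()):
        for part in CLAUSE_START.split(sentence):
            part = part.strip()
            if part:
                segments.append(part)
    return segments


if __name__ == "__main__":
    example = (
        "Der Text wird segmentiert, weil die Analyse Einheiten braucht. "
        "Das Ergebnis ist ein Baum."
    )
    for unit in segment(example):
        print(unit)
```

A sketch like this mis-segments abbreviations such as "z. B." and cannot tell a clause-introducing "da" from an adverbial one, which illustrates why a definition of elementary segments needs to draw on more than punctuation.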