The simple core and the complex periphery of natural language: a formal and a computational view

  • Authors:
  • Petr Sgall; Alena Böhmová

  • Affiliations:
  • Charles University Prague, Czech Rep.; Charles University Prague, Czech Rep.

  • Venue:
  • COLING '02 Proceedings of the 19th international conference on Computational linguistics - Volume 1
  • Year:
  • 2002

Abstract

A complex procedure of syntactic annotation of a large text corpus may help in checking a rich descriptive framework (the Praguian Functional Generative Description), which makes it possible to distinguish between the core of natural language, structured in a relatively simple way, and its large periphery with indistinct borderlines. Such a procedure underlies the Prague Dependency Treebank, within which about 20 000 Czech sentences from running texts have been analyzed in their underlying structure; for 2000 sentences, their topic-focus structures have also been specified. We illustrate the wide range of phenomena handled, including the syntactic relations proper (arguments and adjuncts), coordination, topic-focus articulation, word order, deletion, positions of focusing particles, and morphological categories such as number, tense, and modality, together with their morphemic and analytical means of expression.