The simple core and the complex periphery of natural language: a formal and a computational view

Authors:
Petr Sgall; Alena Böhmová
Affiliations:
Charles University, Prague, Czech Republic; Charles University, Prague, Czech Republic
Venue:
COLING '02: Proceedings of the 19th International Conference on Computational Linguistics, Volume 1
Year:
2002

Abstract

A complex procedure of syntactic annotation of a large text corpus may help in checking a rich descriptive framework (the Praguian Functional Generative Description) that makes it possible to distinguish between the core of natural language, which is structured in a relatively simple way, and its large periphery with indistinct borderlines. Such a procedure underlies the Prague Dependency Treebank, within which about 20 000 Czech sentences from running texts have been analyzed in their underlying structure; for 2000 sentences, their Topic-Focus structures have also been specified. We illustrate the wide range of phenomena handled, including the syntactic relations proper (arguments and adjuncts), coordination, topic-focus articulation, word order, deletion, the positions of focusing particles, and morphological categories such as number, tense, and modality, together with their morphemic and analytical means of expression.