The Szeged Treebank

Authors:
Dóra Csendes;János Csirik;Tibor Gyimóthy;András Kocsor
Affiliations:
Department of Informatics, University of Szeged, Szeged, Hungary;Department of Informatics, University of Szeged, Szeged, Hungary;Department of Informatics, University of Szeged, Szeged, Hungary;MTA-SZTE Research Group on Artificial Intelligence, Szeged, Hungary
Venue:
TSD'05 Proceedings of the 8th international conference on Text, Speech and Dialogue
Year:
2005

Cites (4):

Building a large annotated corpus of English: the Penn Treebank
Computational Linguistics - Special issue on using large corpora: II

Dependency treebank for Russian: concept, tools, types of information
COLING '00 Proceedings of the 18th conference on Computational linguistics - Volume 2

Development of corpora within the CLaRK system: the BulTreeBank project experience
EACL '03 Proceedings of the tenth conference on European chapter of the Association for Computational Linguistics - Volume 2

The Hinoki treebank: a treebank for text understanding
IJCNLP'04 Proceedings of the First international joint conference on Natural Language Processing

Cited by (4):

Clustering Hungarian verbs on the basis of complementation patterns
ACL '07 Proceedings of the 45th Annual Meeting of the ACL: Student Research Workshop

Hungarian corpus of light verb constructions
COLING '10 Proceedings of the 23rd International Conference on Computational Linguistics

Hungarian-English machine translation using GenPar
TSD'06 Proceedings of the 9th international conference on Text, Speech and Dialogue

Dependency parsing of Hungarian: baseline results and challenges
EACL '12 Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics

Abstract

The major aim of the Szeged Treebank project was to create a high-quality database of syntactic structures for Hungarian that can serve as a gold standard for further research in linguistics and computational language processing. The treebank currently contains full syntactic analyses of about 82,000 sentences, the result of accurate manual annotation. This paper describes the linguistic theory as well as the actual method used in the annotation process. In addition, it presents the application of the treebank to the training of automated syntactic parsers.