This paper presents the results of a two-year project in which a consortium of the University of Szeged and MorphoLogic Ltd., Budapest, developed a morpho-syntactically parsed and annotated (disambiguated) corpus for Hungarian. The Hungarian version of MSD (Morpho-Syntactic Description) was used for the morpho-syntactic encoding. The corpus contains texts from five topic areas: schoolchildren's compositions, fiction, computer-related texts, news, and legal texts. During annotation, linguists checked the morpho-syntactic parsing of each word. Researchers in the consortium also studied how machine learning algorithms can be used to find part-of-speech tagging (disambiguation) rules. Because the corpus reaches up to 1 million text words, not counting punctuation characters, it may serve as a reference source for numerous future research applications. The corpus is freely available over the Internet for research and educational purposes.