This paper describes the modeling of the morphosyntactic annotations of the MULTEXT-East corpora and lexicons as an OWL/DL ontology. Formalizing annotation schemes in OWL/DL makes it possible to specify the interrelationships between features formally and to draw logical inferences from those relationships. We show that this approach provides a top-down perspective on a large set of morphosyntactic specifications for multiple languages, and that this perspective helps to identify and resolve conceptual problems in the original specifications. Furthermore, the ontological modeling allows us to link the MULTEXT-East specifications with repositories of annotation terminology such as the General Ontology for Linguistic Description (GOLD) or the ISO TC37/SC4 Data Category Registry.
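To make the idea of formalizing positional morphosyntactic tags more concrete, the following minimal Python sketch decodes a MULTEXT-East-style morphosyntactic description (MSD) into attribute-value pairs and then into subject-predicate-object statements of the kind an ontology could hold. It is an illustration under stated assumptions, not the paper's implementation: the attribute table is a toy subset for nouns only, and the function names, the `msd:` prefix, and the `has...` predicate names are hypothetical.

```python
# Toy, illustrative subset of positional noun attributes (an assumption for
# this sketch; the real specifications are larger and language-specific).
NOUN_ATTRIBUTES = [
    ("Type", {"c": "common", "p": "proper"}),
    ("Gender", {"m": "masculine", "f": "feminine", "n": "neuter"}),
    ("Number", {"s": "singular", "p": "plural", "d": "dual"}),
    ("Case", {"n": "nominative", "g": "genitive", "d": "dative", "a": "accusative"}),
]

# Maps the first character of an MSD to a category name and its attribute table.
CATEGORIES = {"N": ("Noun", NOUN_ATTRIBUTES)}


def decode_msd(msd):
    """Decode a positional MSD string into a dict of attribute -> value."""
    if not msd or msd[0] not in CATEGORIES:
        raise ValueError(f"unknown category in MSD: {msd!r}")
    category, attributes = CATEGORIES[msd[0]]
    codes = msd[1:]
    if len(codes) > len(attributes):
        raise ValueError(f"too many positions in MSD: {msd!r}")
    features = {"Category": category}
    for (name, values), code in zip(attributes, codes):
        if code == "-":  # a hyphen marks an attribute that does not apply
            continue
        if code not in values:
            raise ValueError(f"unknown code {code!r} for attribute {name}")
        features[name] = values[code]
    return features


def to_triples(msd):
    """Render decoded features as (subject, predicate, object) tuples.

    The naming scheme is hypothetical and chosen only for readability.
    """
    subject = f"msd:{msd}"
    return [(subject, f"has{name}", value) for name, value in decode_msd(msd).items()]


if __name__ == "__main__":
    for triple in to_triples("Ncmsn"):
        print(triple)
```

Once tags are expressed as explicit statements like these, constraints between features (for example, that only certain categories carry a Case attribute) can be stated as axioms and checked by a reasoner, rather than being left implicit in the position of a character in a string.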