This article identifies and addresses the major linguistic and conceptual issues, as opposed to logistical ones, faced in the morphosyntactic tagging of MAC-Morpho, a 1.1-million-word Brazilian Portuguese corpus of newspaper articles developed in the Lacio-Web Project. Rather than simply presenting the annotated corpus and describing its tagset, we elaborate on the criteria used to establish the tagset and analyze some interesting cases among the linguistic problems we faced in this work.