C4.5: programs for machine learning
Constraint Grammar: A Language-Independent System for Parsing Unrestricted Text
Serial combination of rules and statistics: a case study in Czech tagging
ACL '01 Proceedings of the 39th Annual Meeting on Association for Computational Linguistics
Intelligent Information Processing and Web Mining: Proceedings of the International IIS: IIPWM'06 Conference held in Ustron, Poland, June 19-22, 2006 (Advances in Soft Computing)
Application of syntactic properties to three-level recognition of Polish hand-written medical texts
Proceedings of the 2006 ACM symposium on Document engineering
Semantic similarity measure of Polish nouns based on linguistic features
BIS'07 Proceedings of the 10th international conference on Business information systems
Automatic selection of heterogeneous syntactic features in semantic similarity of Polish nouns
TSD'07 Proceedings of the 10th international conference on Text, speech and dialogue
A morphosyntactic Brill Tagger for inflectional languages
IceTAL'10 Proceedings of the 7th international conference on Advances in natural language processing
Towards the adequate evaluation of morphosyntactic taggers
COLING '10 Proceedings of the 23rd International Conference on Computational Linguistics: Posters
Language modelling for the needs of OCR of medical texts
ISBMDA'06 Proceedings of the 7th international conference on Biological and Medical Data Analysis
The large tagset of the IPI PAN Corpus of Polish and the limited size of the learning corpus make the construction of a tagger especially demanding. The goal of this work is to decompose the overall process of tagging Polish into subproblems of partial disambiguation. An architecture of a tagger that facilitates this decomposition is also proposed. The architecture enables easy integration of hand-written tagging rules with the rest of the tagger and is open to different types of classifiers. A complete tagger for Polish, called TaKIPI, is presented. Its configuration, the results achieved in a ten-fold test (92.55% accuracy for all tokens, 84.75% for ambiguous tokens), and the variants of the architecture that were considered are also discussed.