From the point of view of computational linguistics, Hungarian is a difficult language because of its complex grammar and rich morphology. Even a common task such as part-of-speech (POS) tagging therefore poses a new challenge for learning algorithms when applied to Hungarian, especially since the language has fairly free word order. In this paper we present a case study designed to illustrate the potential and limits of current inductive logic programming (ILP) and non-ILP algorithms on the Hungarian POS-tagging task. We selected the popular C4.5 and Progol systems as the propositional and ILP representatives, adding experiments with our own methods AGLEARN, a C4.5 preprocessor based on attribute grammars, and the ILP approaches PHM and RIBL. The systems were compared on the Hungarian version of the multilingual, morphosyntactically annotated MULTEXT-East TELRI corpus, which consists of about 100,000 tokens. Experimental results indicate that Hungarian POS tagging is indeed a challenging task for learning algorithms, that even simple background knowledge leads to large differences in accuracy, and that instance-based methods are a promising approach to POS tagging for Hungarian as well. The paper also includes experiments with several cascade connections of the taggers.