Application of Different Learning Methods to Hungarian Part-of-Speech Tagging

Authors:
Tamás Horváth; Zoltán Alexin; Tibor Gyimóthy; Stefan Wrobel
Venue:
ILP '99 Proceedings of the 9th International Workshop on Inductive Logic Programming
Year:
1999

References (11):
C4.5: programs for machine learning
Some advances in transformation-based part of speech tagging. AAAI '94 Proceedings of the twelfth national conference on Artificial intelligence (vol. 1)
Learning logic programs by using the product homomorphism method. COLT '97 Proceedings of the tenth annual conference on Computational learning theory
Local Search in Combinatorial Optimization
Learning semantic functions of attribute grammars. Nordic Journal of Computing
Inductive Learning in Deductive Databases. IEEE Transactions on Knowledge and Data Engineering
Part-of-Speech Tagging Using Progol. ILP '97 Proceedings of the 7th International Workshop on Inductive Logic Programming
Term Comparisons in First-Order Similarity Measures. ILP '98 Proceedings of the 8th International Workshop on Inductive Logic Programming
Using Prior Probabilities and Density Estimation for Relational Classification. ILP '98 Proceedings of the 8th International Workshop on Inductive Logic Programming
Induction of Constraint Grammar-Rules Using Progol. ILP '98 Proceedings of the 8th International Workshop on Inductive Logic Programming
Improving data driven wordclass tagging by system combination. COLING '98 Proceedings of the 17th international conference on Computational linguistics - Volume 1

Cited by (2):
Solving Selection Problems Using Preference Relation Based on Bayesian Learning. ILP '00 Proceedings of the 10th International Conference on Inductive Logic Programming
Mining closed patterns in relational, graph and network data. Annals of Mathematics and Artificial Intelligence


Abstract

From the point of view of computational linguistics, Hungarian is a difficult language because of its complex grammar and rich morphology. Even a common task such as part-of-speech (POS) tagging therefore poses a new challenge for learning algorithms when applied to Hungarian, especially since the language has fairly free word order. In this paper we present a case study designed to illustrate the potential and limits of current ILP and non-ILP algorithms on the Hungarian POS-tagging task. We selected the popular C4.5 and Progol systems as the propositional and ILP representatives, and added experiments with our own methods: AGLEARN, a C4.5 preprocessor based on attribute grammars, and the ILP approaches PHM and RIBL. The systems were compared on the Hungarian version of the multilingual, morphosyntactically annotated MULTEXT-East TELRI corpus, which consists of about 100,000 tokens. Experimental results indicate that Hungarian POS tagging is indeed a challenging task for learning algorithms, that even simple background knowledge leads to large differences in accuracy, and that instance-based methods are a promising approach to POS tagging for Hungarian as well. The paper also includes experiments with several cascade connections of the taggers.