Learning Word Segmentation Rules for Tag Prediction

  • Authors:
  • Dimitar Kazakov; Suresh Manandhar; Tomaz Erjavec

  • Venue:
  • ILP '99 Proceedings of the 9th International Workshop on Inductive Logic Programming
  • Year:
  • 1999

Abstract

In our previous work we introduced a hybrid approach, combining a genetic algorithm (GA) with inductive logic programming (ILP), for learning stem-suffix segmentation rules from an unmarked list of words. Evaluating the method was made difficult by the lack of word corpora annotated with morphological segmentation. Here the hybrid approach is evaluated indirectly, on the task of tag prediction. A pair of stem-tag and suffix-tag lexicons is obtained by applying that approach to an annotated lexicon of word-tag pairs. The two lexicons are then used to predict the tags of unseen words in two ways: (1) by using only the stem and suffix generated by the segmentation rules, and (2) by using all matching combinations of stem and suffix present in the lexicons. The results show a high correlation between the constituents generated by the segmentation rules and the tags of the words in which they appear, thereby demonstrating the linguistic relevance of the segmentations produced by the hybrid approach.
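
The two prediction modes can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy data, the coarse tags "V" and "N", and in particular the choice to combine stem and suffix evidence by intersecting their tag sets are all assumptions made for illustration, since the abstract does not say how the two lexicons are combined. The segmentation rules themselves are not modelled; mode (1) simply takes a stem and suffix as given.

```python
from collections import defaultdict


def build_lexicons(segmented):
    """Build stem-tag and suffix-tag lexicons from (stem, suffix, tag) triples."""
    stem_tags = defaultdict(set)
    suffix_tags = defaultdict(set)
    for stem, suffix, tag in segmented:
        stem_tags[stem].add(tag)
        suffix_tags[suffix].add(tag)
    return stem_tags, suffix_tags


def predict_from_split(stem, suffix, stem_tags, suffix_tags):
    """Mode (1): use only the one split proposed by the segmentation rules.

    Assumption: a tag is predicted if both the stem and the suffix
    have been seen with it (set intersection).
    """
    return stem_tags.get(stem, set()) & suffix_tags.get(suffix, set())


def predict_from_all_splits(word, stem_tags, suffix_tags):
    """Mode (2): try every split whose stem and suffix are both in the lexicons."""
    tags = set()
    for i in range(1, len(word) + 1):  # the suffix may be empty
        stem, suffix = word[:i], word[i:]
        if stem in stem_tags and suffix in suffix_tags:
            tags |= stem_tags[stem] & suffix_tags[suffix]
    return tags


# Hypothetical toy data: segmented training words with coarse tags.
training = [
    ("walk", "ed", "V"),
    ("walk", "s", "V"),
    ("talk", "s", "V"),
    ("cat", "s", "N"),
    ("dog", "", "N"),
]
stem_tags, suffix_tags = build_lexicons(training)

# "talked" and "dogs" do not occur in the training data.
print(predict_from_split("talk", "ed", stem_tags, suffix_tags))
print(predict_from_all_splits("dogs", stem_tags, suffix_tags))
```

In this toy setting the suffix "s" is ambiguous between "V" and "N", and the stem "dog" resolves the ambiguity. Whether the original system resolves such ambiguity in the same way is not stated in the abstract.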