Discovery of underlying morphological relations using an agglomerative clustering algorithm

  • Authors: Zacharias Detorakis; George Tambouratzis

  • Affiliation (both authors): Inst. for Language and Speech Processing, Paradissos Amaroussiou, Greece

  • Venue: CSTST '08 Proceedings of the 5th international conference on Soft computing as transdisciplinary science and technology

  • Year: 2008

Abstract

This paper presents a hierarchical clustering algorithm that creates groups of stems with similar characteristics. The resulting groups (clusters) are expected to comprise stems belonging to the same inflectional paradigm (e.g. verbs in the passive voice), which will aid the creation of a morphological lexicon. A new metric for calculating the distance between data objects is proposed; it better suits this application by addressing problems that may arise from the limited amount of information available in the data. A series of experimental results is also provided, which demonstrates the performance of the algorithm, compares different distance metrics in terms of their effectiveness, and helps in choosing appropriate approaches for a number of parameters.
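
To make the general idea concrete, the following is a minimal sketch of agglomerative clustering over stems, not the authors' implementation. The abstract does not define the proposed distance metric, so the sketch substitutes the standard Jaccard distance over each stem's set of observed suffixes. The average-linkage rule, the stopping threshold, the function names and the toy English data are all assumptions made for illustration.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """Stand-in metric (NOT the paper's): 1 - |a & b| / |a | b|."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def agglomerate(stem_suffixes, threshold):
    """Average-linkage agglomerative clustering of stems.

    stem_suffixes maps each stem to the set of suffixes seen with it.
    Merging stops once the closest pair of clusters is farther apart
    than `threshold` (a hypothetical parameter for this sketch).
    """
    # Start with one singleton cluster per stem.
    clusters = [[stem] for stem in sorted(stem_suffixes)]

    def linkage(c1, c2):
        # Mean pairwise distance between the members of two clusters.
        dists = [
            jaccard_distance(stem_suffixes[s], stem_suffixes[t])
            for s in c1
            for t in c2
        ]
        return sum(dists) / len(dists)

    while len(clusters) > 1:
        # Find the closest pair of clusters.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: linkage(clusters[p[0]], clusters[p[1]]),
        )
        if linkage(clusters[i], clusters[j]) > threshold:
            break
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)

    return [sorted(c) for c in clusters]


# Hypothetical toy data: stems and the suffixes observed with each.
toy = {
    "walk": {"s", "ed", "ing"},
    "talk": {"s", "ed", "ing"},
    "jump": {"ed", "ing"},
    "cat": {"s"},
    "dog": {"s"},
}
clusters = agglomerate(toy, threshold=0.5)
print(clusters)
```

With this toy data, the verb-like stems and the noun-like stems end up in separate clusters. Note that "jump" joins the verbs even though it was never observed with "s"; sparse evidence of this kind is the sort of limited-information problem the abstract says the proposed metric is designed to handle.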