Using an ant colony metaheuristic to optimize automatic word segmentation for ancient Greek

  • Authors:
  • George Tambouratzis

  • Affiliations:
  • Institute for Language and Speech Processing, Athens, Greece

  • Venue:
  • IEEE Transactions on Evolutionary Computation
  • Year:
  • 2009

Quantified Score

Hi-index 0.00

Visualization

Abstract

Given a text or collection of texts involving unconstrained language, a basic task in a multitude of applications is the identification of stems and endings for each word form, which is termed morphological analysis. In this paper, the use of an ant colony optimization (ACO) metaheuristic is proposed for a linguistic task that involves the automated morphological segmentation of Ancient Greek word forms into stem and ending. The task of morphological analysis is essential for implementing text-processing applications such as semantic analysis and information retrieval. The difficulty of the morphological analysis task differs depending on the language chosen, being hardest in the case of highly-inflectional languages, where each stem may be associated with a large number of different endings. In this paper, focus is placed on the morphological analysis of Ancient Greek, which has been shown to be a particularly hard task. To perform this task, a system for the automated morphological processing has been proposed, which implements the morphological analysis of words by coupling an iterative pattern-recognition algorithm with a modest amount of linguistic knowledge, expressed via a set of interactions associated with weights. In an earlier version of the system, these weights were determined by combining the input from specialized scientists with a lengthy manual optimization process. In this paper, the ACO metaheuristic is applied to the task of defining near-optimal system weights using an automated process based on a set of training data. The experiments performed indicate that the segmentation quality achieved by ACO is equivalent to or in several cases substantially higher than that achieved using manually optimized weights.