Natural language tagging with genetic algorithms

  • Authors:
  • Enrique Alba;Gabriel Luque;Lourdes Araujo

  • Affiliations:
  • Departamento de Lenguajes y Ciencias de la Computación, Universidad de Málaga, Spain;Departamento de Lenguajes y Ciencias de la Computación, Universidad de Málaga, Spain;Departamento Sistemas Informáticos y Programación, Universidad Complutense, Spain

  • Venue:
  • Information Processing Letters
  • Year:
  • 2006

Quantified Score

Hi-index 0.89

Visualization

Abstract

This work analyzes the relative advantages of different metaheuristic approaches to the well-known natural language processing problem of part-of-speech tagging. This consists of assigning to each word of a text its disambiguated part-of-speech according to the context in which the word is used. We have applied a classic genetic algorithm (GA), a CHC algorithm, and a simulated annealing (SA). Different ways of encoding the solutions to the problem (integer and binary) have been studied, as well as the impact of using parallelism for each of the considered methods. We have performed experiments on different linguistic corpora and compared the results obtained against other popular approaches plus a classic dynamic programming algorithm. Our results claim for the high performances achieved by the parallel algorithms compared to the sequential ones, and state the singular advantages for every technique. Our algorithms and some of its components can be used to represent a new set of state-of-the-art procedures for complex tagging scenarios.