Studying the advantages of a messy evolutionary algorithm for natural language tagging

Authors:
Lourdes Araujo
Affiliations:
Dpto. Sistemas Informáticos y Programación, Universidad Complutense de Madrid
Venue:
GECCO'03 Proceedings of the 2003 international conference on Genetic and evolutionary computation: Part II
Year:
2003




Abstract

The process of labeling each word in a sentence with one of its lexical categories (noun, verb, etc.) is called tagging, and it is a key step in parsing and in many other language processing and generation applications. Automatic lexical taggers are usually based on statistical methods, such as Hidden Markov Models, which work with information extracted from large tagged corpora. This information consists of the frequencies of the contexts of the words, that is, of the sequences of their neighbouring tags. These methods therefore rely on the assumption that the tag of a word depends only on its surrounding tags. This work proposes the use of a messy evolutionary algorithm to investigate the validity of this assumption. The algorithm is an extension of the fast messy genetic algorithm, a variety of genetic algorithm that improves the survival of high-quality partial solutions, or building blocks. Messy GAs do not require all genes to be present in a chromosome, and a gene may also appear more than once. This allows us to study the kind of building blocks that arise, and thus to obtain information about possible relationships between the tag of a word and other tags at any position in the sentence. The paper describes the design of a messy evolutionary algorithm for the tagging problem and a number of experiments on the performance of the system and on the parameters of the algorithm.
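
To make the messy representation concrete, the sketch below shows one way a chromosome with missing and repeated genes could be expressed as a full tag sequence. It is an illustration under stated assumptions, not the encoding used in the paper: the gene layout `(position, tag)`, the function name `decode`, the "first occurrence wins" rule for repeated genes, and the use of a per-word default tag for unspecified positions are all hypothetical choices, loosely following common messy GA practice.

```python
from typing import List, Tuple

# Hypothetical gene layout: (word position in the sentence, proposed tag).
Gene = Tuple[int, str]


def decode(chromosome: List[Gene], default_tags: List[str]) -> List[str]:
    """Express a messy chromosome as a complete tag sequence.

    Assumptions (not taken from the paper):
    - positions missing from the chromosome keep their default tag;
    - if a position appears more than once, the first occurrence wins.
    """
    tags = list(default_tags)   # start from the default tag of every word
    seen = set()
    for pos, tag in chromosome:  # scan genes left to right
        if pos not in seen:
            tags[pos] = tag
            seen.add(pos)
    return tags


# Toy sentence "the can rusts" with assumed default tags per word.
default_tags = ["DET", "MODAL", "VERB"]
# Positions 0 and 2 are unspecified; position 1 is specified twice.
chromosome = [(1, "NOUN"), (1, "VERB")]
result = decode(chromosome, default_tags)
print(result)  # ['DET', 'NOUN', 'VERB']
```

In this toy case the chromosome carries only a partial solution for position 1, which is the kind of building block the abstract refers to; the remaining positions are filled from the assumed defaults.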