Natural language tagging with genetic algorithms

Authors:
Enrique Alba; Gabriel Luque; Lourdes Araujo
Affiliations:
Departamento de Lenguajes y Ciencias de la Computación, Universidad de Málaga, Spain; Departamento de Lenguajes y Ciencias de la Computación, Universidad de Málaga, Spain; Departamento Sistemas Informáticos y Programación, Universidad Complutense, Spain
Venue:
Information Processing Letters
Year:
2006

Abstract

This work analyzes the relative advantages of different metaheuristic approaches to the well-known natural language processing problem of part-of-speech tagging. The task consists of assigning to each word of a text its disambiguated part of speech according to the context in which the word is used. We have applied a classic genetic algorithm (GA), a CHC algorithm, and simulated annealing (SA). We have studied different ways of encoding the solutions to the problem (integer and binary), as well as the impact of using parallelism for each of the considered methods. We have performed experiments on different linguistic corpora and compared the results against other popular approaches and a classic dynamic programming algorithm. Our results show the high performance achieved by the parallel algorithms compared to the sequential ones, and identify the particular advantages of each technique. Our algorithms and some of their components can be used to build a new set of state-of-the-art procedures for complex tagging scenarios.
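To make the integer encoding concrete, the following is a minimal sketch of a genetic algorithm for tagging, not the authors' implementation. It assumes a toy bigram model: the `LEXICON` and `TRANSITION` tables, their probabilities, the `FLOOR` smoothing value, and the function names are all hypothetical and chosen only for illustration. Each individual holds one integer per word, indexing into that word's list of candidate tags, and fitness is the log-probability of the decoded tag sequence.

```python
import math
import random

# Toy model with made-up numbers (an assumption, not data from the paper).
# LEXICON maps each word to its candidate tags and a P(word | tag)-like score.
LEXICON = {
    "the": {"DET": 1.0},
    "can": {"NOUN": 0.3, "VERB": 0.2, "AUX": 0.5},
    "rusts": {"VERB": 0.7, "NOUN": 0.3},
}
# Bigram tag transition scores; unseen pairs fall back to FLOOR.
TRANSITION = {
    ("<s>", "DET"): 0.8,
    ("DET", "NOUN"): 0.9,
    ("NOUN", "VERB"): 0.6,
    ("AUX", "VERB"): 0.7,
    ("NOUN", "NOUN"): 0.2,
}
FLOOR = 1e-3


def candidates(word):
    """Candidate tags of a word in a fixed (sorted) order."""
    return sorted(LEXICON[word])


def decode(sentence, genes):
    """Turn an integer-encoded individual into a tag sequence."""
    return [candidates(w)[g] for w, g in zip(sentence, genes)]


def fitness(sentence, genes):
    """Log-probability of the tag sequence encoded by the integer genes."""
    score, prev = 0.0, "<s>"
    for word, tag in zip(sentence, decode(sentence, genes)):
        score += math.log(TRANSITION.get((prev, tag), FLOOR))
        score += math.log(LEXICON[word][tag])
        prev = tag
    return score


def genetic_tagger(sentence, pop_size=20, generations=30, p_mut=0.1, seed=0):
    """Generational GA with elitism, binary tournaments, one-point crossover."""
    rng = random.Random(seed)
    sizes = [len(candidates(w)) for w in sentence]

    def key(individual):
        return fitness(sentence, individual)

    population = [[rng.randrange(n) for n in sizes] for _ in range(pop_size)]
    for _ in range(generations):
        offspring = [max(population, key=key)[:]]  # keep the best individual
        while len(offspring) < pop_size:
            p1 = max(rng.sample(population, 2), key=key)
            p2 = max(rng.sample(population, 2), key=key)
            cut = rng.randrange(1, len(sentence)) if len(sentence) > 1 else 0
            child = p1[:cut] + p2[cut:]
            # Mutation: resample a gene among that word's candidate tags.
            child = [rng.randrange(n) if rng.random() < p_mut else g
                     for g, n in zip(child, sizes)]
            offspring.append(child)
        population = offspring
    best = max(population, key=key)
    return decode(sentence, best), fitness(sentence, best)


if __name__ == "__main__":
    tags, score = genetic_tagger(["the", "can", "rusts"])
    print(tags, round(score, 3))
```

Because every gene is restricted to the candidate tags of its own word, crossover and mutation always yield valid taggings. A binary encoding would instead need a fixed number of bits per word and some handling of bit patterns that match no candidate tag.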