Part-of-Speech Tagging with Evolutionary Algorithms

Authors:
Lourdes Araujo
Venue:
CICLing '02 Proceedings of the Third International Conference on Computational Linguistics and Intelligent Text Processing
Year:
2002

Abstract

This paper presents a part-of-speech tagger based on a genetic algorithm that evolves a population of candidate tag sequences for the words in a text and selects the best individual as the solution. The paper describes the main issues arising in the algorithm: the chromosome representation, the evaluation of individuals, and the design of genetic operators for crossover and mutation. The fitness function is defined by a probabilistic model based on the context of each word, that is, the tags of the surrounding words. The model has been implemented and several factors have been investigated: the size of the training corpus, the effect of the context size, and the parameters of the evolutionary algorithm, such as population size and crossover and mutation rates. The accuracy obtained with this method is comparable to that of other probabilistic approaches, but evolutionary algorithms are more efficient in obtaining the results.
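
The approach summarized in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the toy lexicon, the context counts, the add-one smoothing, the one-tag-each-side context window, binary tournament selection, elitism, and all function and parameter names below are assumptions chosen for illustration. Each individual is a list with one tag per word, and its fitness sums the log relative frequency of each tag within its context.

```python
import math
import random

# Hypothetical toy resources; a real tagger would derive both from a tagged corpus.
LEXICON = {"the": ["DET"], "can": ["NOUN", "VERB", "AUX"], "rusts": ["VERB", "NOUN"]}
CONTEXT_COUNTS = {  # (left tag, tag, right tag) -> assumed corpus frequency
    ("<s>", "DET", "NOUN"): 50, ("<s>", "DET", "VERB"): 2, ("<s>", "DET", "AUX"): 3,
    ("DET", "NOUN", "VERB"): 40, ("DET", "NOUN", "NOUN"): 5,
    ("DET", "VERB", "NOUN"): 1, ("DET", "AUX", "VERB"): 2,
    ("NOUN", "VERB", "</s>"): 30, ("NOUN", "NOUN", "</s>"): 10,
    ("VERB", "NOUN", "</s>"): 8, ("AUX", "VERB", "</s>"): 6,
}


def fitness(words, tags, lexicon=LEXICON, counts=CONTEXT_COUNTS):
    """Sum over words of log P(tag | left tag, right tag, candidate tags of the word)."""
    padded = ["<s>"] + list(tags) + ["</s>"]
    total = 0.0
    for i, word in enumerate(words):
        left, tag_, right = padded[i], padded[i + 1], padded[i + 2]
        # Add-one smoothing (an assumption) keeps unseen contexts from giving log(0).
        seen = counts.get((left, tag_, right), 0) + 1
        norm = sum(counts.get((left, t, right), 0) + 1 for t in lexicon[word])
        total += math.log(seen / norm)
    return total


def crossover(a, b, rng):
    """One-point crossover: the two parents exchange their tails."""
    if len(a) < 2:
        return a[:], b[:]
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:], b[:point] + a[point:]


def mutate(words, tags, lexicon, rate, rng):
    """With probability `rate`, redraw a gene from the candidate tags of its word."""
    return [rng.choice(lexicon[w]) if rng.random() < rate else t
            for w, t in zip(words, tags)]


def tag(words, lexicon=LEXICON, counts=CONTEXT_COUNTS, pop_size=20,
        generations=30, crossover_rate=0.5, mutation_rate=0.05, seed=0):
    """Evolve tag sequences for `words` and return the fittest one found."""
    rng = random.Random(seed)

    def score(individual):
        return fitness(words, individual, lexicon, counts)

    population = [[rng.choice(lexicon[w]) for w in words] for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=score, reverse=True)
        next_gen = [population[0][:]]  # elitism: carry the best individual over
        while len(next_gen) < pop_size:
            # Binary tournament selection of two parents.
            p1 = max(rng.sample(population, 2), key=score)
            p2 = max(rng.sample(population, 2), key=score)
            if rng.random() < crossover_rate:
                p1, p2 = crossover(p1, p2, rng)
            next_gen.append(mutate(words, p1, lexicon, mutation_rate, rng))
            next_gen.append(mutate(words, p2, lexicon, mutation_rate, rng))
        population = next_gen[:pop_size]
    return max(population, key=score)
```

Calling `tag(["the", "can", "rusts"])` returns a list with one tag per word. Under the assumed counts, `fitness` ranks DET NOUN VERB above the other five possible sequences, so that is the sequence the search should converge on. The parameters `pop_size`, `crossover_rate`, and `mutation_rate` correspond to the evolutionary parameters the abstract says were investigated; the default values here are arbitrary.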