This paper presents a part-of-speech tagger based on a genetic algorithm that evolves a population of candidate tag sequences for the words in a text and selects the best individual as the solution. The paper describes the main design issues in the algorithm: the chromosome representation, the evaluation of individuals, and the genetic operators for crossover and mutation. The fitness function is defined by a probabilistic model based on the context of each word (the tags of the surrounding words). The model has been implemented and several factors have been investigated: the size of the training corpus, the effect of the context size, and the parameters of the evolutionary algorithm, such as population size and crossover and mutation rates. The accuracy obtained with this method is comparable to that of other probabilistic approaches, while the evolutionary algorithm obtains its results more efficiently.
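To make the ingredients named above concrete (chromosome, context-based fitness, crossover, mutation), here is a minimal sketch in Python. It is not the paper's implementation: the toy lexicon, the context counts, the additive smoothing constant, the one-tag-on-each-side context, and the truncation selection with elitism are all assumptions chosen for illustration.

```python
import math
import random

# Hypothetical toy data: candidate tags per word, and counts of
# (left_tag, tag, right_tag) contexts as if gathered from a training corpus.
LEXICON = {
    "the": ["DET"],
    "can": ["NOUN", "VERB", "AUX"],
    "rusts": ["VERB", "NOUN"],
}
CONTEXT_COUNTS = {
    ("<s>", "DET", "NOUN"): 50,
    ("DET", "NOUN", "VERB"): 40,
    ("NOUN", "VERB", "</s>"): 30,
    ("<s>", "DET", "VERB"): 2,
    ("DET", "VERB", "NOUN"): 1,
    ("DET", "AUX", "VERB"): 1,
}
TOTAL = sum(CONTEXT_COUNTS.values())
ALPHA = 0.1  # assumed smoothing constant so unseen contexts get a finite score


def fitness(tags):
    """Sum of smoothed log relative frequencies of each tag in its context."""
    padded = ["<s>"] + list(tags) + ["</s>"]
    score = 0.0
    for i in range(1, len(padded) - 1):
        context = (padded[i - 1], padded[i], padded[i + 1])
        count = CONTEXT_COUNTS.get(context, 0)
        score += math.log((count + ALPHA) / (TOTAL + ALPHA))
    return score


def random_individual(words, rng):
    """A chromosome is one candidate tag per word of the sentence."""
    return [rng.choice(LEXICON[w]) for w in words]


def crossover(a, b, rng):
    """One-point crossover: swap the tails of two tag sequences."""
    if len(a) < 2:
        return a[:], b[:]
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:], b[:cut] + a[cut:]


def mutate(individual, words, rate, rng):
    """With probability `rate`, redraw a gene from that word's valid tags."""
    return [
        rng.choice(LEXICON[w]) if rng.random() < rate else t
        for w, t in zip(words, individual)
    ]


def tag(words, pop_size=20, generations=30,
        crossover_rate=0.5, mutation_rate=0.05, seed=0):
    """Evolve tag sequences for `words` and return the fittest one."""
    rng = random.Random(seed)
    population = [random_individual(words, rng) for _ in range(pop_size)]
    for _ in range(generations):
        # Truncation selection: the fitter half survives unchanged (elitism).
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            if rng.random() < crossover_rate:
                a, b = crossover(a, b, rng)
            children.append(mutate(a, words, mutation_rate, rng))
            children.append(mutate(b, words, mutation_rate, rng))
        population = (parents + children)[:pop_size]
    return max(population, key=fitness)
```

Calling `tag(["the", "can", "rusts"])` returns one tag per word, drawn from each word's candidate list. Because mutation only draws from a word's own candidates, every individual stays a valid tagging, and the search only has to choose among tags the lexicon already allows. The keyword arguments correspond to the parameters the abstract lists as having been studied (population size, crossover rate, mutation rate); the default values here are arbitrary.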