Markov chain classification, sometimes called n-gram modeling, is a common and powerful tool for problems that involve sequences of tokens drawn from a finite alphabet. It has been used in a wide range of tasks, including natural language modeling, author identification, protein similarity searches, and even birdsong recognition. An improvement to Markov chain classification would therefore have broad implications across many fields. Our new system, called SCS, improves upon Markov chain classification by introducing a preprocessing step in which a set of transformation functions is applied to the input sequences. Since the space of possible transformations is unbounded, a genetic algorithm (GA) is used to search for functions that improve classification. We show that the GA consistently finds preprocessing functions that substantially improve the performance of the Markov chain model.
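
To make the idea of a preprocessing step concrete, the following is a minimal sketch of a first-order Markov chain (bigram) classifier that applies a transformation to every sequence before counting transitions. It is not the SCS implementation: the class name `MarkovChainClassifier`, the `preprocess` hook, the add-one smoothing, and the toy data are all assumptions made for illustration, and the transformation here (`str.lower`, which merges upper- and lower-case tokens) is chosen by hand rather than found by a genetic algorithm.

```python
import math
from collections import defaultdict


class MarkovChainClassifier:
    """First-order Markov chain (bigram) classifier with add-one smoothing.

    Illustrative sketch only; names and smoothing are assumptions.
    """

    def __init__(self, preprocess=None):
        # Hypothetical hook: maps a raw sequence to a transformed sequence.
        self.preprocess = preprocess or (lambda seq: seq)
        self.counts = {}       # label -> {(prev, cur): transition count}
        self.totals = {}       # label -> {prev: outgoing transition count}
        self.alphabet = set()  # tokens seen in training, after preprocessing

    def fit(self, sequences, labels):
        for seq, label in zip(sequences, labels):
            seq = self.preprocess(seq)
            counts = self.counts.setdefault(label, defaultdict(int))
            totals = self.totals.setdefault(label, defaultdict(int))
            self.alphabet.update(seq)
            for prev, cur in zip(seq, seq[1:]):
                counts[(prev, cur)] += 1
                totals[prev] += 1
        return self

    def log_likelihood(self, seq, label):
        # Sum of smoothed log transition probabilities under one class model.
        seq = self.preprocess(seq)
        counts, totals = self.counts[label], self.totals[label]
        v = len(self.alphabet) or 1  # training alphabet size
        return sum(
            math.log((counts.get((prev, cur), 0) + 1) / (totals.get(prev, 0) + v))
            for prev, cur in zip(seq, seq[1:])
        )

    def predict(self, seq):
        # Pick the class whose chain assigns the sequence the highest likelihood.
        return max(self.counts, key=lambda label: self.log_likelihood(seq, label))


# Toy data (made up): an alternating pattern versus a doubled pattern,
# with letter case acting as noise that the preprocessing step removes.
clf = MarkovChainClassifier(preprocess=str.lower)
clf.fit(["ABabABab", "aaBBaabb"], ["alt", "rep"])
prediction_alt = clf.predict("aBaBaB")
prediction_rep = clf.predict("AAbbAAbb")
```

In a search such as the one the abstract describes, a candidate transformation would take the place of `str.lower`, and the classification accuracy obtained with it on held-out sequences could serve as its fitness score.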