Journal of Biomedical Informatics - Special issue: Phylogenetic inferencing: Beyond biology
Greedy closure evolutionary algorithms
CEC '02: Proceedings of the 2002 Congress on Evolutionary Computation - Volume 02
Evolutionary Computation for Modeling and Optimization
This study presents an evolutionary algorithm for locating DNA sequence characters that are diagnostic between closely related groups of species. The algorithm is developed on synthetic data and then tested on biological data from a butterfly species recently discovered to be a cryptic species complex. The technique successfully located positions diagnostic of the cryptic neotropical skipper butterfly species within cytochrome c oxidase subunit I (COI) DNA barcode data. The algorithm uses a novel subset representation to select positions within the DNA sequences, together with a crossover operator that maps pairs of subsets to pairs of subsets. This crossover operator permits the use of a novel mutation operator that disrupts loci showing evidence of convergence, which better preserves diversity in the evolving population of diagnostic character positions. A lexical (tie-breaking) fitness function is used to smooth the fitness landscape. Locating diagnostic positions in DNA sequences proved difficult without lexical fitness; with that innovation in place, the problem is quite tractable. The algorithm has potential for broad application in areas such as conservation, customs enforcement, and forensics.
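The abstract does not give the operators in detail, so the following Python sketch is only one plausible reading of the ideas it names: a subset of sequence positions as the individual, a lexical fitness expressed as a tuple compared lexicographically, and a crossover that maps a pair of subsets to a pair of subsets. Every function name, the definition of "diagnostic" (no character shared between the two groups at a position), the choice of tie-breaker (count of cross-group sequence pairs that differ), and the toy sequences are assumptions made for illustration, not the authors' published method.

```python
import random
from itertools import product


def is_diagnostic(pos, group_a, group_b):
    """Assumed definition: no character at `pos` is shared between the groups."""
    return {s[pos] for s in group_a}.isdisjoint(s[pos] for s in group_b)


def cross_group_mismatches(pos, group_a, group_b):
    """Count cross-group sequence pairs that differ at `pos` (assumed tie-breaker)."""
    return sum(a[pos] != b[pos] for a, b in product(group_a, group_b))


def lexical_fitness(subset, group_a, group_b):
    """Return (primary, secondary); Python compares tuples lexicographically,
    so the secondary score only matters when the primary scores tie."""
    primary = sum(is_diagnostic(p, group_a, group_b) for p in subset)
    secondary = sum(cross_group_mismatches(p, group_a, group_b) for p in subset)
    return (primary, secondary)


def subset_crossover(parent1, parent2, rng):
    """Hypothetical pair-to-pair crossover: positions shared by both parents
    stay in both children; the remaining positions are shuffled and dealt out
    so that each child keeps the size of the corresponding parent."""
    shared = parent1 & parent2
    pool = sorted(parent1 ^ parent2)
    rng.shuffle(pool)
    cut = len(parent1) - len(shared)
    return shared | set(pool[:cut]), shared | set(pool[cut:])


# Toy aligned sequences (invented for illustration only).
group_a = ["ACGTA", "ACGTC"]
group_b = ["ATGAA", "ATGAC"]

# Neither subset contains a diagnostic position, so the primary scores tie;
# the tie-breaker still prefers the subset holding the more variable position.
print(lexical_fitness({0, 2}, group_a, group_b))
print(lexical_fitness({0, 4}, group_a, group_b))

child1, child2 = subset_crossover({0, 1, 2}, {2, 3, 4}, random.Random(0))
print(sorted(child1), sorted(child2))
```

In this reading, the tie-breaker is what smooths the landscape: subsets with no fully diagnostic position are no longer all equally bad, so selection has a gradient to follow toward positions that separate more cross-group pairs.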