Crohn's disease is an inflammatory bowel disease. Because the disease is strongly heritable, patterns of DNA variation, such as single nucleotide polymorphisms (SNPs), can be used to predict disease status accurately. However, the number of possible SNP subsets is so large that a search for the subset with the highest prediction accuracy cannot be completed within a patient's lifetime. In this paper, we propose a new technique that relies on variable-length chromosomes with significant order feature selection, a new crossover approach, and new mutation operations. Our method can find a chromosome of appropriate length that contains useful features. Crohn's disease data gathered from case-control association studies were used to demonstrate the effectiveness of the proposed algorithm. In terms of prediction accuracy, the proposed SNP prediction framework outperformed previously proposed techniques, including the optimum random forest (ORF), the univariate marginal distribution algorithm and support vector machine (USVM), the complementary greedy search-based prediction algorithm (CGSP), the combinatorial search-based prediction algorithm (CSP), and discretized network flow (DNF). Tested on this real data set with 5-fold cross-validation, our framework achieved 90.4% accuracy, with 87.5% sensitivity and 92.2% specificity.
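To make the idea of variable-length chromosomes concrete, the following is a minimal Python sketch of how crossover and mutation might operate on ordered lists of SNP indices whose length is allowed to change. The abstract does not specify the operators, so everything here is an assumption for illustration: the function names `crossover` and `mutate`, the independent cut points, and the add/remove/replace mutation choices are hypothetical and are not the paper's actual operators.

```python
import random


def crossover(parent_a, parent_b, rng):
    """Hypothetical one-point crossover with independent cut points.

    Each parent is an ordered list of distinct SNP indices. Because the
    cut points are drawn separately, the child may be shorter or longer
    than either parent.
    """
    cut_a = rng.randint(1, len(parent_a))  # keep at least one gene from parent_a
    cut_b = rng.randint(0, len(parent_b))
    head = parent_a[:cut_a]
    # Append the tail of parent_b, skipping SNPs already present in the head.
    tail = [snp for snp in parent_b[cut_b:] if snp not in head]
    return head + tail


def mutate(chrom, n_snps, rng):
    """Hypothetical mutation: add, remove, or replace one SNP index.

    The chromosome stays non-empty and free of duplicates.
    """
    chrom = list(chrom)
    unused = [snp for snp in range(n_snps) if snp not in chrom]
    op = rng.choice(["add", "remove", "replace"])
    if op == "add" and unused:
        # Insert at a random position, since gene order is preserved.
        chrom.insert(rng.randint(0, len(chrom)), rng.choice(unused))
    elif op == "remove" and len(chrom) > 1:
        chrom.pop(rng.randrange(len(chrom)))
    elif op == "replace" and unused:
        chrom[rng.randrange(len(chrom))] = rng.choice(unused)
    return chrom


if __name__ == "__main__":
    rng = random.Random(42)
    a = [0, 3, 5, 7]  # illustrative SNP indices, not real data
    b = [1, 3, 8]
    child = crossover(a, b, rng)
    print(child, mutate(child, n_snps=10, rng=rng))
```

In a full genetic algorithm, each chromosome produced this way would be scored by training a classifier on the selected SNPs and measuring cross-validated accuracy. That fitness step is omitted here because the abstract does not describe it.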