Multi-objective tag SNPs selection using evolutionary algorithms

  • Authors:
  • Chuan-Kang Ting;Wei-Ting Lin;Yao-Ting Huang

  • Affiliations:
  • -;-;-

  • Venue:
  • Bioinformatics
  • Year:
  • 2010

Quantified Score

Hi-index 3.84

Visualization

Abstract

Motivation: Integrated analysis of single nucleotide polymorphisms (SNPs) and structure variations showed that the extent of linkage disequilibrium is common across different types of genetic variants. A subset of SNPs (called tag SNPs) is sufficient for capturing alleles of bi-allelic and even multi-allelic variants. However, accuracy and power of tag SNPs are affected by several factors, including genotyping failure, errors and tagging bias of certain alleles. In addition, different sets of tag SNPs should be selected for fulfilling requirements of various genotyping platforms and projects. Results: This study formulates the problem of selecting tag SNPs into a four-objective optimization problem that minimizes the total amount of tag SNPs, maximizes tolerance for missing data, enlarges and balances detection power of each allele class. To resolve this problem, we propose evolutionary algorithms incorporated with greedy initialization to find non-dominated solutions considering all objectives simultaneously. This method provides users with great flexibility to extract different sets of tag SNPs for different platforms and scenarios (e.g. up to 100 tags and 10% missing rate). Compared to conventional methods, our method explores larger search space and requires shorter convergence time. Experimental results revealed strong and weak conflicts among these objectives. In particular, a small number of additional tag SNPs can provide sufficient tolerance and balanced power given the low missing and error rates of today's genotyping platforms. Availability: The software is freely available at Bioinformatics online and http://cilab.cs.ccu.edu.tw/service_dl.html Contact:ckting@cs.ccu.edu.tw; ythuang@cs.ccu.edu.tw