Estimating population size via line graph reconstruction

  • Authors:
  • Bjarni V. Halldórsson;Dima Blokh;Roded Sharan

  • Affiliations:
  • School of Science and Engineering, Reykjavík University, Reykjavik, Iceland;Blavatnik School of Computer Science, Tel Aviv University, Tel Aviv, Israel;Blavatnik School of Computer Science, Tel Aviv University, Tel Aviv, Israel

  • Venue:
  • WABI'12 Proceedings of the 12th international conference on Algorithms in Bioinformatics
  • Year:
  • 2012

Abstract

We propose a novel graph-theoretic method to estimate haplotype population size from genotype data. The method considers only the potential sharing of haplotypes between individuals and is based on transforming the graph of potential haplotype sharing into a line graph using a minimum number of edge and vertex deletions. We show that the corresponding edge- and vertex-deletion problems are NP-complete and provide exact integer programming solutions for them. We test our approach using extensive simulations of multiple population-evolution and genotype-sampling scenarios. Our computational experiments show that when most of the potential sharings are true sharings, the problem can be solved very quickly and the estimated size is very close to the true size; when many of the potential sharings do not stem from true haplotype sharing, our method gives reasonable lower bounds on the underlying number of haplotypes. In comparison, a naive approach of phasing the input genotypes provides only the trivial upper bound of twice the number of genotypes.
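The link between haplotype sharing and line graphs can be illustrated with a small sketch. It assumes a model the abstract does not spell out: each individual carries two haplotypes and is treated as an edge between two haplotype vertices, so that two individuals are adjacent in the sharing graph exactly when their edges meet at a vertex. Under that assumption, the graph of true sharings is the line graph of the haplotype graph, and the number of vertices of the underlying graph is the haplotype count. The names `sharing_graph`, `haplotype_count`, and the toy data are hypothetical, and the sketch covers only the forward direction (haplotypes to sharing graph), not the paper's deletion problems or its integer programs.

```python
from itertools import combinations

def sharing_graph(individuals):
    """Return the set of individual pairs that share at least one haplotype.

    `individuals` maps an individual's name to its pair of haplotypes.
    Assumption: each individual is an edge of a "haplotype graph", so the
    result is the line graph of that haplotype graph.
    """
    edges = set()
    for (u, haps_u), (v, haps_v) in combinations(individuals.items(), 2):
        if set(haps_u) & set(haps_v):
            edges.add(frozenset((u, v)))
    return edges

def haplotype_count(individuals):
    """Number of distinct haplotypes, i.e. vertices of the haplotype graph."""
    return len({h for pair in individuals.values() for h in pair})

# Toy data (hypothetical): four genotyped individuals over four haplotypes.
individuals = {
    "g1": ("h1", "h2"),
    "g2": ("h2", "h3"),
    "g3": ("h3", "h1"),
    "g4": ("h3", "h4"),
}

edges = sharing_graph(individuals)
n_haps = haplotype_count(individuals)
naive_upper_bound = 2 * len(individuals)  # every haplotype distinct

print(sorted(tuple(sorted(e)) for e in edges))
print(n_haps, naive_upper_bound)
```

In this toy case the true count is 4 haplotypes, against a trivial upper bound of 8. The estimation problem in the abstract runs the other way: only a noisy sharing graph is observed, and it has to be edited into a line graph before an underlying haplotype graph can be read off.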