New Optimization Model and Algorithm for Sibling Reconstruction from Genetic Markers

  • Authors:
  • W. Art Chaovalitwongse;Chun-An Chou;Tanya Y. Berger-Wolf;Bhaskar DasGupta;Saad Sheikh;Mary V. Ashley;Isabel C. Caballero

  • Affiliations:
  • Department of Industrial and Systems Engineering, Rutgers University, Piscataway, New Jersey 08854;Department of Industrial and Systems Engineering, Rutgers University, Piscataway, New Jersey 08854;Department of Computer Science, University of Illinois, Chicago, Illinois 60607;Department of Computer Science, University of Illinois, Chicago, Illinois 60607;Department of Computer Science, University of Illinois, Chicago, Illinois 60607;Department of Biological Sciences, University of Illinois, Chicago, Illinois 60607;Department of Biological Sciences, University of Illinois, Chicago, Illinois 60607

  • Venue:
  • INFORMS Journal on Computing
  • Year:
  • 2010

Quantified Score

Hi-index 0.00

Visualization

Abstract

With improved tools for collecting genetic data from natural and experimental populations, new opportunities arise to study fundamental biological processes, including behavior, mating systems, adaptive trait evolution, and dispersal patterns. Full use of the newly available genetic data often depends upon reconstructing genealogical relationships of individual organisms, such as sibling reconstruction. This paper presents a new optimization framework for sibling reconstruction from single generation microsatellite genetic data. Our framework is based on assumptions of parsimony and combinatorial concepts of Mendel's inheritance rules. Here, we develop a novel optimization model for sibling reconstruction as a large-scale mixed-integer program (MIP), shown to be a generalization of the set covering problem. We propose a new heuristic approach to efficiently solve this large-scale optimization problem. We test our approach on real biological data as presented in other studies as well as simulated data, and compare our results with other state-of-the-art sibling reconstruction methods. The empirical results show that our approaches are very efficient and outperform other methods while providing the most accurate solutions for two benchmark data sets. The results suggest that our framework can be used as an analytical and computational tool for biologists to better study ecological and evolutionary processes involving knowledge of familial relationships in a wide variety of biological systems.