Mixed Integer Linear Programming for Maximum-Parsimony Phylogeny Inference

Authors:
Srinath Sridhar;Fumei Lam;Guy E. Blelloch;R. Ravi;Russell Schwartz
Venue:
IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)
Year:
2008



Abstract

Reconstruction of phylogenetic trees is a fundamental problem in computational biology. While excellent heuristic methods are available for many variants of this problem, new advances in phylogeny inference will be required if we are to continue making effective use of the rapidly growing stores of variation data now being gathered. In this paper, we present two integer linear programming (ILP) formulations to find the most parsimonious phylogenetic tree from a set of binary variation data. One method uses a flow-based formulation that can produce exponential numbers of variables and constraints in the worst case. The method has, however, proven extremely efficient in practice on datasets that are well beyond the reach of the available provably efficient methods, solving several large mtDNA and Y-chromosome instances within a few seconds and giving provably optimal results in times competitive with fast heuristics that cannot guarantee optimality. An alternative formulation establishes that the problem can be solved with a polynomial-sized ILP. We further present a web server, based on the exponential-sized ILP, that performs fast maximum parsimony inferences and serves as a front end to a database of precomputed phylogenies spanning the human genome.
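To make the objective in the abstract concrete, the following is a minimal sketch of what "most parsimonious" means for binary data. It is not the ILP formulation from the paper. It assumes taxa are given as equal-length 0/1 strings, that tree length is counted in Hamming-distance mutations, and that input taxa may sit at internal nodes of the tree. Under those assumptions, the number of varying sites is a lower bound on the optimal tree length, and a minimum spanning tree over the input taxa alone (no inferred ancestral nodes) is an upper bound. The function names `hamming` and `parsimony_bounds` are hypothetical.

```python
def hamming(a, b):
    """Number of sites at which two equal-length binary strings differ."""
    return sum(x != y for x, y in zip(a, b))


def parsimony_bounds(taxa):
    """Return (lower, upper) bounds on the most parsimonious tree length.

    Illustrative only: this brackets the optimum but does not compute it.
    """
    taxa = sorted(set(taxa))  # identical taxa add no mutations
    n_sites = len(taxa[0])

    # Lower bound: every site that varies must mutate at least once.
    lower = sum(1 for j in range(n_sites) if len({t[j] for t in taxa}) > 1)

    # Upper bound: Prim's algorithm for a minimum spanning tree over the
    # input taxa only, with Hamming distance as the edge weight.
    in_tree = {0}
    best = [hamming(taxa[0], t) for t in taxa]
    upper = 0
    while len(in_tree) < len(taxa):
        i = min((k for k in range(len(taxa)) if k not in in_tree),
                key=best.__getitem__)
        upper += best[i]
        in_tree.add(i)
        for k in range(len(taxa)):
            if k not in in_tree:
                best[k] = min(best[k], hamming(taxa[i], taxa[k]))
    return lower, upper


# The taxa 110, 101, 011 are pairwise two mutations apart, so a tree
# restricted to the inputs costs 4, while adding the inferred ancestor
# 111 gives a tree of length 3.
print(parsimony_bounds(["110", "101", "011"]))
```

When the two bounds coincide, the spanning tree is already optimal. The gap between them is where inferred ancestral nodes matter, and closing that gap with a proof of optimality is the hard part that an exact method such as an ILP addresses.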