Algorithms for joint optimization of stability and diversity in planning combinatorial libraries of chimeric proteins

Authors:
Wei Zheng; Alan M. Friedman; Chris Bailey-Kellogg
Affiliations:
Department of Computer Science, Dartmouth College, Hanover, NH; Department of Biological Sciences and Purdue Cancer Center, Purdue University, West Lafayette, IN; Department of Computer Science, Dartmouth College, Hanover, NH
Venue:
RECOMB'08: Proceedings of the 12th Annual International Conference on Research in Computational Molecular Biology
Year:
2008

Abstract

In engineering protein variants by constructing and screening combinatorial libraries of chimeric proteins, two complementary and competing goals are desired: the new proteins must be similar enough to the evolutionarily selected wild-type proteins to be stably folded, and they must be different enough to display functional variation. We present here the first method, Staversity, to simultaneously optimize stability and diversity in selecting sets of breakpoint locations for site-directed recombination. Our goal is to uncover all "undominated" breakpoint sets, those for which no other breakpoint set is better in both factors. Our first algorithm finds the undominated sets that serve as the vertices of the lower envelope of the two-dimensional (stability and diversity) convex hull containing all possible breakpoint sets. Our second algorithm identifies additional breakpoint sets in the concavities that are either undominated or dominated only by undiscovered breakpoint sets within a distance bound computed by the algorithm. Both algorithms are efficient, requiring only time polynomial in the numbers of residues and breakpoints, while characterizing a space defined by an exponential number of possible breakpoint sets. We applied Staversity to identify sets of 2-10 breakpoints for three different sets of parent proteins from the purE family of biosynthetic enzymes. The average normalized distance between our plans and the lower bound for optimal plans is around 1%. Our plans dominate most (60-90% on average for each parent set) of the plans found by other possible approaches, namely random sampling or explicit optimization for stability with implicit optimization for diversity. The identified breakpoint sets provide a compact representation of good plans, enabling a protein engineer to understand and account for the trade-offs between two key considerations in combinatorial chimeragenesis.
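To make the notions of "undominated" sets and hull vertices concrete, here is a minimal Python sketch on a toy list of plans. It is not the Staversity algorithm: it enumerates explicit points by brute force, whereas the abstract describes polynomial-time methods that never list all breakpoint sets. The sketch assumes each plan is scored by a hypothetical pair (stability cost, diversity cost) in which lower is better for both, and it uses the standard tie convention (a plan is dominated if another is at least as good in both scores and strictly better in one). The names `undominated`, `hull_vertices`, and `plans` are illustrative only.

```python
from typing import List, Tuple

# Assumption: each plan is a hypothetical (stability_cost, diversity_cost) pair,
# and lower is better for both scores.
Point = Tuple[float, float]


def undominated(points: List[Point]) -> List[Point]:
    """Return the undominated points, sorted by increasing first score."""
    front: List[Point] = []
    best_second = float("inf")
    # After sorting, a point is undominated only if its second score
    # is strictly lower than every second score seen so far.
    for first, second in sorted(set(points)):
        if second < best_second:
            front.append((first, second))
            best_second = second
    return front


def cross(o: Point, a: Point, b: Point) -> float:
    """Cross product of vectors o->a and o->b (positive means a left turn)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def hull_vertices(front: List[Point]) -> List[Point]:
    """Keep only the front points that are vertices of the lower convex envelope."""
    hull: List[Point] = []
    for p in front:
        # Drop the previous point while it lies on or above the chord to p.
        while len(hull) >= 2 and cross(hull[-2], hull[-1], p) <= 0:
            hull.pop()
        hull.append(p)
    return hull


# Toy data, invented purely for illustration.
plans = [(1, 9), (2, 5), (3, 4.5), (5, 2), (8, 1), (4, 6), (6, 6), (9, 3)]
front = undominated(plans)
vertices = hull_vertices(front)
in_concavity = [p for p in front if p not in vertices]
print(front)
print(vertices)
print(in_concavity)
```

In this toy data the plan (3, 4.5) is undominated yet is not a hull vertex, because it lies above the chord from (2, 5) to (5, 2). Plans of this kind, sitting in the concavities of the envelope, are what the second algorithm in the abstract is described as targeting.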