REGULARIZERS FOR ESTIMATING DISTRIBUTIONS OF AMINO ACIDS FROM SMALL SAMPLES
Graphical models of residue coupling in protein families
Proceedings of the 5th international workshop on Bioinformatics
RECOMB'06 Proceedings of the 10th annual international conference on Research in Computational Molecular Biology
Protein Fragment Swapping: A Method for Asymmetric, Selective Site-Directed Recombination
RECOMB 2'09 Proceedings of the 13th Annual International Conference on Research in Computational Molecular Biology
Protein Design by Sampling an Undirected Graphical Model of Residue Constraints
IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)
RECOMB'08 Proceedings of the 12th annual international conference on Research in computational molecular biology
Using physicochemical properties of amino acids to induce graphical models of residue couplings
Proceedings of the Tenth International Workshop on Data Mining in Bioinformatics
Improved multiple sequence alignments using coupled pattern mining
Proceedings of the ACM Conference on Bioinformatics, Computational Biology and Biomedicine
Remote homology detection on alpha-structural proteins using simulated evolution
Proceedings of the ACM Conference on Bioinformatics, Computational Biology and Biomedicine
Proceedings of the International Conference on Bioinformatics, Computational Biology and Biomedical Informatics
Improved Multiple Sequence Alignments Using Coupled Pattern Mining
IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)
Many statistical measures and algorithmic techniques have been proposed for studying residue coupling in protein families. Generally speaking, two residue positions are considered coupled if, in the sequence record, some of their amino acid type combinations are significantly more common than others. While the proposed approaches have proven useful in finding and describing coupling, a significant missing component is a formal probabilistic model that explicates and compactly represents the coupling, integrates information about sequence, structure, and function, and supports inferential procedures for analysis, diagnosis, and prediction.

We present an approach to learning and using probabilistic graphical models of residue coupling. These models capture significant conservation and coupling constraints observable in a multiply-aligned set of sequences. Our approach can place a structural prior on considered couplings, so that all identified relationships have direct mechanistic explanations. It can also incorporate information about functional classes, and thereby learn a differential graphical model that distinguishes constraints common to all classes from those unique to individual classes. Such differential models separately account for class-specific conservation and family-wide coupling, two different sources of sequence covariation. They are then able to perform interpretable functional classification of new sequences, explaining classification decisions in terms of the underlying conservation and coupling constraints. We apply our approach in studies of both G protein-coupled receptors and PDZ domains, identifying and analyzing family-wide and class-specific constraints, and performing functional classification. The results demonstrate that graphical models of residue coupling provide a powerful tool for uncovering, representing, and utilizing significant sequence-structure-function relationships in protein families.
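
To make the notion of coupling concrete, the following is a minimal sketch that scores a pair of alignment columns by mutual information, one common covariation statistic. It is an illustrative assumption rather than the scoring or learning procedure of the paper; the function name `mutual_information` and the toy alignment `msa` are hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(msa, i, j):
    """Mutual information (in bits) between columns i and j of an alignment.

    `msa` is a list of equal-length aligned sequences (hypothetical input).
    High values mean some amino acid pairs co-occur more often than the
    per-column frequencies alone would predict.
    """
    n = len(msa)
    col_i = [seq[i] for seq in msa]
    col_j = [seq[j] for seq in msa]
    count_i = Counter(col_i)
    count_j = Counter(col_j)
    count_ij = Counter(zip(col_i, col_j))

    mi = 0.0
    for (a, b), c in count_ij.items():
        p_ab = c / n
        p_a = count_i[a] / n
        p_b = count_j[b] / n
        mi += p_ab * log2(p_ab / (p_a * p_b))
    return mi


# Toy alignment: columns 0 and 1 always change together (A with K, G with E),
# while column 2 varies independently of column 0.
msa = ["AKL", "AKV", "GEL", "GEV"]
coupled = mutual_information(msa, 0, 1)
independent = mutual_information(msa, 0, 2)
```

In this toy alignment the first pair of columns scores higher than the second. A pairwise statistic of this kind only flags candidate couplings; it does not provide the compact probabilistic model, structural prior, or class-specific decomposition that the abstract describes.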