Protein CorreLogo: an X3D representation of co-evolving pairs, tertiary structure, ligand binding pockets and protein-protein interactions in protein families

  • Authors:
  • Scooter Willis

  • Affiliations:
  • University of Florida

  • Venue:
  • Proceedings of the twelfth international conference on 3D web technology
  • Year:
  • 2007

Abstract

To understand the functional elements of a protein structure, biologists use domain-specific 3D (PDB) viewers that process the atomic coordinates of a protein structure solved by X-ray crystallography or NMR. These PDB viewers are written to capture specific or common features of interest to the researcher. With the explosion of protein sequence data, comparative studies and statistical analysis can indicate regions of interest in 3D models. Integrating statistical data into existing PDB viewers is difficult because the software is typically written to accomplish very specific functional goals and does not support exporting to a standard 3D format. In this paper, the PDB data is shown as X3D PDB ribbon models that are augmented with statistically significant data and compared to an Information-Rich Virtual Environment represented as a Protein CorreLogo X3D model. A protein family (Pfam) represents multiple alignments of protein sequences in which protein domains and tertiary structures have evolutionarily conserved regions that reflect protein function. Various information properties of the protein family, the tertiary structure from a sequence's PDB structure, and ligand binding pockets are combined to create a 3D Protein CorreLogo model. The multiple sequence alignment from the protein family is used to detect co-evolving amino acid pairs using mutual information. Each co-evolving pair is drawn as a column, color-coded to represent the physicochemical properties of the co-evolving amino acid combination. Additional visualizations along each axis include the 2D sequence logo, the degree of insert regions in the protein family, and the surface accessibility of each amino acid for the referenced PDB sequence. The Protein CorreLogo model is based on X3D (VRML), facilitating immersive viewing of complex data relationships and detected co-evolving pairs.
The results section presents two protein families, comparing each Protein CorreLogo model with a representative X3D PDB ribbon model to show the structural significance of the co-evolving amino acid pairs predicted by mutual information. In the first example, a family of proteins that bind cyclic nucleotides (PF00027.18), the co-evolving pairs are potential markers for ligand binding pocket regions. In the second example, a family of SH3 domains involved in signal transduction related to cytoskeletal organization (PF00018.16), significant mutual information occurs between two pairs of amino acids that are in contact in the intertwined dimer structure but lie on opposite ends of the tertiary structure. Protein CorreLogo X3D models and X3D PDB ribbon models can be found at http://www.proteinx3d.com
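
The abstract states that co-evolving pairs are detected from the multiple sequence alignment using mutual information, but it does not give the exact formula or preprocessing. The following is a minimal sketch of the standard mutual information between two alignment columns, not the paper's implementation. The function name `column_mutual_information`, the choice to skip sequences with a gap in either column, and the use of base-2 logarithms are all assumptions made for illustration.

```python
import math
from collections import Counter


def column_mutual_information(msa, i, j, gap="-"):
    """Mutual information (in bits) between alignment columns i and j.

    Hypothetical helper: `msa` is a list of equal-length aligned sequences.
    Sequences with a gap in either column are skipped (an assumption; the
    abstract does not say how gaps are handled).
    """
    pairs = [(seq[i], seq[j]) for seq in msa if seq[i] != gap and seq[j] != gap]
    n = len(pairs)
    if n == 0:
        return 0.0

    count_i = Counter(a for a, _ in pairs)   # residue counts in column i
    count_j = Counter(b for _, b in pairs)   # residue counts in column j
    count_ij = Counter(pairs)                # joint residue-pair counts

    mi = 0.0
    for (a, b), c in count_ij.items():
        # p(a,b) * log2( p(a,b) / (p(a) * p(b)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (count_i[a] * count_j[b]))
    return mi


if __name__ == "__main__":
    # Toy alignment: columns 0 and 1 vary together, column 2 is fully conserved.
    toy_msa = ["AKL", "AKL", "GEL", "GEL"]
    print(column_mutual_information(toy_msa, 0, 1))  # covarying columns
    print(column_mutual_information(toy_msa, 0, 2))  # conserved column carries no MI
```

In the toy alignment, the two covarying columns score one bit and the pair involving the conserved column scores zero, which illustrates why high-scoring column pairs are candidates for co-evolution while fully conserved positions are not.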