Mining quantitative association rules in protein sequences

  • Authors: Nitin Gupta; Nitin Mangal; Kamal Tiwari; Pabitra Mitra

  • Affiliations: Bioinformatics Group, Dept. of Computer Science, University of California, San Diego, La Jolla, CA; Department of Computer Science and Engineering, Indian Institute of Technology, Kanpur, India; Bioinformatics Group, Dept. of Computer Science, University of California, San Diego, La Jolla, CA; Department of Computer Science and Engineering, Indian Institute of Technology, Kanpur, India

  • Venue: Data Mining
  • Year: 2006

Abstract

Although much research has gone into understanding the composition and nature of proteins, many aspects are still not satisfactorily understood. It is now generally believed that the amino acid sequences of proteins are not random, and thus the patterns of amino acids observed in protein sequences are also non-random. In this study, we attempt to decipher the nature of the associations between the different amino acids present in a protein. This basic analysis provides insights into the co-occurrence of certain amino acids in a protein. Such association rules are desirable for enhancing our understanding of protein composition and may give clues regarding the global interactions among particular sets of amino acids occurring in proteins. The presence of strong non-trivial associations provides further evidence for the non-randomness of protein sequences. Knowledge of these rules or constraints is highly desirable for the in vitro synthesis of artificial proteins.
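
To make the idea of a quantitative association rule between amino acids concrete, the following is a minimal sketch and not the authors' method. It assumes that each protein is summarized by its amino acid composition, that a rule item is a fraction range for one residue, and that rules are scored by the standard support and confidence measures. The function names, the toy sequences, and the range thresholds are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def composition(seq):
    """Return the fraction of each amino acid in one protein sequence."""
    counts = Counter(seq)
    return {aa: n / len(seq) for aa, n in counts.items()}


def rule_stats(sequences, antecedent, consequent):
    """Support and confidence of the rule antecedent -> consequent.

    Each side is a hypothetical item (amino_acid, low, high), meaning
    "the fraction of this residue in the protein lies in [low, high]".
    """
    def holds(comp, item):
        aa, low, high = item
        return low <= comp.get(aa, 0.0) <= high

    # Skip empty sequences so that composition() never divides by zero.
    comps = [composition(s) for s in sequences if s]
    n_ante = sum(holds(c, antecedent) for c in comps)
    n_both = sum(holds(c, antecedent) and holds(c, consequent) for c in comps)

    support = n_both / len(comps) if comps else 0.0
    confidence = n_both / n_ante if n_ante else 0.0
    return support, confidence


# Toy data, invented for illustration only (not real proteins).
toy = ["AAAL", "AALL", "GGGG", "AAGG"]

# Rule: "at least 25% alanine" -> "at least 25% leucine".
support, confidence = rule_stats(toy, ("A", 0.25, 1.0), ("L", 0.25, 1.0))
print(support, confidence)
```

On this toy set, three of the four sequences satisfy the antecedent and two of those also satisfy the consequent, so the rule holds in half of all sequences and in two thirds of the sequences where the antecedent applies. A real analysis would need a principled way to choose the fraction ranges and a baseline for judging whether an association is non-trivial.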