Motivation: Discriminating outer membrane proteins from other folding types of globular and membrane proteins is an important task, both for identifying outer membrane proteins in genomic sequences and for the successful prediction of their secondary and tertiary structures.

Results: We have systematically analyzed the amino acid composition of outer membrane proteins and of globular proteins from different structural classes. We found that the residues Glu, His, Ile, Cys, Gln, Asn and Ser show a significant difference between globular and outer membrane proteins. Based on this information, we have devised a statistical method for discriminating outer membrane proteins from other globular and membrane proteins. Our approach correctly identified the outer membrane proteins with an accuracy of 89% for the training set of 337 proteins. It also correctly excluded the globular proteins with an accuracy of 79% in a non-redundant dataset of 674 proteins. Furthermore, the method correctly excludes α-helical membrane proteins with an accuracy of up to 80%. These accuracy levels are comparable to those of other methods in the literature, and the method is simple enough to be used for detecting outer membrane proteins in genomic sequences. The influence of protein size, structural class and specific residues on discrimination is discussed.

Availability: A program for the discrimination method is available upon request from the corresponding author. The datasets used in this work are available at http://www.cbrc.jp/~gromiha/omp/dataset.html

Contact: michael-gromiha@aist.go.jp
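The composition-based discrimination described under Results can be illustrated with a minimal Python sketch. The abstract does not give the exact statistical rule, so the decision rule here (assign a sequence to whichever reference profile has the smaller total absolute deviation in residue percentages) is an assumption chosen for illustration, not necessarily the authors' method. The function names are hypothetical, and the toy sequences are invented and carry no biological meaning; real reference profiles would be computed from the datasets mentioned above.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes


def composition(sequence):
    """Return the percentage of each standard residue in a sequence."""
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard residues")
    return {aa: 100.0 * counts[aa] / total for aa in AMINO_ACIDS}


def mean_composition(sequences):
    """Average the per-sequence compositions to obtain a reference profile."""
    comps = [composition(s) for s in sequences]
    return {aa: sum(c[aa] for c in comps) / len(comps) for aa in AMINO_ACIDS}


def deviation(comp, profile):
    """Total absolute difference between a composition and a profile."""
    return sum(abs(comp[aa] - profile[aa]) for aa in AMINO_ACIDS)


def classify(sequence, omp_profile, globular_profile):
    """Assumed rule: pick the class whose profile the sequence deviates from least."""
    comp = composition(sequence)
    if deviation(comp, omp_profile) < deviation(comp, globular_profile):
        return "OMP"
    return "non-OMP"


# Invented toy sequences, used only to make the sketch runnable.
toy_omp = ["NSQTNSGYNSQT", "SNQGSNTYQSNN"]
toy_globular = ["EEHICKLEEHIC", "HEICELKEHIEC"]

omp_profile = mean_composition(toy_omp)
globular_profile = mean_composition(toy_globular)

print(classify("NSQSNTGQNS", omp_profile, globular_profile))
```

A nearest-profile rule is used because it needs only the two mean compositions and has no tunable parameters. Reproducing the accuracies reported above would require the authors' actual datasets and decision rule.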