Using amphiphilic pseudo amino acid composition to predict enzyme subfamily classes

Authors:
Kuo-Chen Chou
Affiliations:
Gordon Life Science Institute San Diego, CA 92130, USA
Venue:
Bioinformatics
Year:
2005

Citing 0
Cited 13

Using Chou's pseudo amino acid composition to predict subcellular localization of apoptosis proteins: An approach with immune genetic algorithm-based ensemble classifier
Pattern Recognition Letters

Brief communication: Efficiency analysis of KNN and minimum distance-based classifiers in enzyme family prediction
Computational Biology and Chemistry

Predicting protein subcellular locations for Gram-negative bacteria using neural networks ensemble
CIBCB'09 Proceedings of the 6th Annual IEEE conference on Computational Intelligence in Bioinformatics and Computational Biology

Coding of amino acids by texture descriptors
Artificial Intelligence in Medicine

Brief Communication: A novel feature representation method based on Chou's pseudo amino acid composition for protein structural class prediction
Computational Biology and Chemistry

Research Article: A protein fold classifier formed by fusing different modes of pseudo amino acid composition via PSSM
Computational Biology and Chemistry

Prediction of human major histocompatibility complex class II binding peptides by continuous kernel discrimination method
Artificial Intelligence in Medicine

CE-PLoc: An ensemble classifier for predicting protein subcellular locations by fusing different modes of pseudo amino acid composition
Computational Biology and Chemistry

When less is more: improving classification of protein families with a minimal set of global features
WABI'07 Proceedings of the 7th international conference on Algorithms in Bioinformatics

Multilabel Learning via Random Label Selection for Protein Subcellular Multilocations Prediction
IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)

MitProt-Pred: Predicting mitochondrial proteins of Plasmodium falciparum parasite using diverse physiochemical properties and ensemble classification
Computers in Biology and Medicine

Prediction of human breast and colon cancers from imbalanced data using nearest neighbor and support vector machines
Computer Methods and Programs in Biomedicine

Wavelet Analysis in Current Cancer Genome Research: A Survey
IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)

Abstract

Motivation: With protein sequences entering databanks at an explosive pace, early determination of the family or subfamily class of a newly found enzyme molecule becomes important, because the class is directly related to which specific target the enzyme acts on, as well as to its catalytic process and biological function. Unfortunately, determining the class by experiments alone is both time-consuming and costly. In a previous study, the covariant-discriminant algorithm was introduced to identify the 16 subfamily classes of oxidoreductases. Although the results were quite encouraging, the entire prediction process was based on the amino acid composition alone, without any sequence-order information. The problem therefore merits further investigation.

Results: To incorporate sequence-order effects into the predictor, the 'amphiphilic pseudo amino acid composition' is introduced to represent the statistical sample of a protein. The novel representation contains 20 + 2λ discrete numbers: the first 20 numbers are the components of the conventional amino acid composition; the next 2λ numbers are a set of correlation factors that reflect different hydrophobicity and hydrophilicity distribution patterns along a protein chain. Based on this concept and formulation scheme, a new predictor is developed. The self-consistency test, jackknife test and independent dataset tests show that the success rates obtained by the new predictor are all significantly higher than those of the previous predictors. The significant enhancement in success rates also implies that the distribution of hydrophobicity and hydrophilicity of the amino acid residues along a protein chain plays a very important role in its structure and function.

Contact: kchou@san.rr.com
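The abstract describes the 20 + 2λ representation only in words, so the following is a minimal sketch of how such a vector might be computed, not the author's implementation. It assumes the commonly described form of the method: each scale is standardized over the 20 amino acids, the k-th pair of correlation factors averages the product of scale values for residues k positions apart, and a weight factor balances the two parts. The function names, the default `weight=0.5` and `lam=3`, and the choice of the Kyte-Doolittle and Hopp-Woods scales as stand-ins are all assumptions; the scale values are quoted from memory and should be checked against the published tables before any real use.

```python
from statistics import mean, pstdev

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Stand-in hydrophobicity scale (Kyte-Doolittle hydropathy); an assumption,
# not necessarily the scale used in the paper.
HYDROPHOBICITY = dict(zip(AMINO_ACIDS, [
    1.8, 2.5, -3.5, -3.5, 2.8, -0.4, -3.2, 4.5, -3.9, 3.8,
    1.9, -3.5, -1.6, -3.5, -4.5, -0.8, -0.7, 4.2, -0.9, -1.3]))

# Stand-in hydrophilicity scale (Hopp-Woods); values quoted from memory.
HYDROPHILICITY = dict(zip(AMINO_ACIDS, [
    -0.5, -1.0, 3.0, 3.0, -2.5, 0.0, -0.5, -1.8, 3.0, -1.8,
    -1.3, 0.2, 0.0, 0.2, 3.0, 0.3, -0.4, -1.5, -3.4, -2.3]))


def standardize(scale):
    """Rescale a property table to zero mean and unit (population) std."""
    values = [scale[a] for a in AMINO_ACIDS]
    mu, sigma = mean(values), pstdev(values)
    return {a: (scale[a] - mu) / sigma for a in AMINO_ACIDS}


def amphiphilic_pseaac(seq, lam=3, weight=0.5,
                       hydrophobicity=HYDROPHOBICITY,
                       hydrophilicity=HYDROPHILICITY):
    """Return a list of 20 + 2*lam numbers describing the sequence."""
    seq = seq.upper()
    n = len(seq)
    if n <= lam:
        raise ValueError("sequence must be longer than lam")
    if any(r not in AMINO_ACIDS for r in seq):
        raise ValueError("sequence contains a non-standard residue")

    h1 = standardize(hydrophobicity)
    h2 = standardize(hydrophilicity)

    # First 20 components: conventional amino acid composition.
    freqs = [seq.count(a) / n for a in AMINO_ACIDS]

    # Next 2*lam components: for each gap k, one hydrophobicity and one
    # hydrophilicity correlation factor between residues k positions apart.
    taus = []
    for k in range(1, lam + 1):
        taus.append(sum(h1[seq[i]] * h1[seq[i + k]] for i in range(n - k)) / (n - k))
        taus.append(sum(h2[seq[i]] * h2[seq[i + k]] for i in range(n - k)) / (n - k))

    # Joint normalization so that all components sum to 1.
    denom = sum(freqs) + weight * sum(taus)
    if abs(denom) < 1e-12:
        raise ValueError("normalization term is zero for this sequence")
    return [f / denom for f in freqs] + [weight * t / denom for t in taus]


if __name__ == "__main__":
    # Arbitrary illustrative peptide, not taken from the paper's dataset.
    vector = amphiphilic_pseaac("MKTAYIAKQRQISFVKSHFSRQ", lam=3)
    print(len(vector))  # 20 + 2*3 components
```

With `lam=3` the function returns 26 numbers: the 20 composition terms followed by three (hydrophobicity, hydrophilicity) pairs for gaps of 1, 2 and 3 residues. Passing the scales as parameters keeps the sketch independent of any particular published table.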