The prediction of protease cleavage sites in proteins is critical to effective drug design. An important issue in constructing an accurate and efficient predictor is how to present nonnumerical amino acids to a model effectively. Because this issue has not yet received full attention and is closely related to model efficiency and accuracy, we present a novel neural learning algorithm aimed at improving prediction accuracy and reducing training time. The algorithm builds on conventional radial basis function neural networks (RBFNNs) and is referred to as a bio-basis function neural network (BBFNN). The basic principle is to replace the radial basis function used in RBFNNs with a novel bio-basis function. Each bio-basis is a feature dimension in a numerical feature space, to which the nonnumerical sequence space is mapped for analysis. The bio-basis function is designed using an amino acid mutation matrix verified in biology, so that the biological content of protein sequences can be maximally utilized for accurate modeling. Mutual information (MI) is used to select the most informative bio-bases, and an ensemble method is used to enhance the decision-making process, further improving prediction accuracy. The algorithm has been successfully verified in two case studies: the prediction of Human Immunodeficiency Virus (HIV) protease cleavage sites and of trypsin cleavage sites in proteins.
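
The mapping from a nonnumerical sequence space to a numerical feature space can be illustrated with a minimal sketch. The abstract does not state the functional form of the bio-basis function, so the exponential form below, exp(gamma * (s(x, b) - s(b, b)) / s(b, b)), is an assumption, as are the names `TOY_SCORES`, `similarity`, `bio_basis`, and `to_features` and the parameter `gamma`. The three-letter score table is invented purely for illustration and is not a real mutation matrix; a real application would substitute a published matrix.

```python
import math

# Hypothetical toy similarity scores over a 3-letter alphabet.
# These values are made up for illustration; they are NOT taken from
# any published mutation matrix.
TOY_SCORES = {
    ("A", "A"): 4, ("G", "G"): 5, ("L", "L"): 6,
    ("A", "G"): 1, ("A", "L"): -1, ("G", "L"): -3,
}


def pair_score(a, b):
    """Symmetric lookup of the score for one pair of residues."""
    return TOY_SCORES[tuple(sorted((a, b)))]


def similarity(x, b):
    """Sum of pairwise scores between two equal-length sequences."""
    if len(x) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(pair_score(p, q) for p, q in zip(x, b))


def bio_basis(x, b, gamma=1.0):
    """Assumed bio-basis form: equals 1.0 when x matches the bio-basis b
    and decreases as the similarity score drops below the self-score."""
    s_bb = similarity(b, b)  # self-similarity of the bio-basis (must be > 0)
    return math.exp(gamma * (similarity(x, b) - s_bb) / s_bb)


def to_features(x, bases, gamma=1.0):
    """Map a sequence to one numerical feature per bio-basis."""
    return [bio_basis(x, b, gamma) for b in bases]


if __name__ == "__main__":
    bases = ["AGL", "GAL"]  # hypothetical bio-bases
    print(to_features("AGL", bases))
```

Each entry of the resulting vector corresponds to one bio-basis, so the vector can be passed to a conventional numerical classifier in place of the raw sequence. Selecting which bio-bases to keep (for example by mutual information, as the abstract describes) would be a separate step that this sketch does not cover.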