A representative subset of protein chains was selected from the CATH 2.4 database [C.A. Orengo, A.D. Michie, S. Jones, D.T. Jones, M.B. Swindells, J.M. Thornton, CATH - a hierarchic classification of protein domain structures, Structure 5 (8) (1997) 1093-1108] and used to train a feed-forward neural network to predict protein fold classes. The network takes the dipeptide frequency matrix as input and outputs a novel representation of the protein chains in R^30 space, based on knot invariant values [P. Rogen, B. Fain, Automatic classification of protein structure by using Gauss integrals, Proceedings of the National Academy of Sciences of the United States of America 100 (1) (2003) 119-124; P. Rogen, H.G. Bohr, A new family of global protein shape descriptors, Mathematical Biosciences 182 (2) (2003) 167-181]. In the general case, excluding singletons (proteins that are the sole member of a topology or of a sequence homology set), the prediction success rates were 77% at the class level, 60% at the architecture level, and 48% at the topology level. The present data set includes about 500 fold classes, ten times the number reported in earlier attempts, which covered only a few handpicked folds, so this result represents an improvement on previous work. Furthermore, distance analysis of the network outputs for singletons shows that novel topologies can be detected with very high confidence (~85%); in these cases the network can serve as a sorting mechanism that identifies sequences that might need special attention. Such distance analysis also yields a direct measure of prediction confidence.
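The two ends of the pipeline described above, the dipeptide frequency input and the distance analysis of the network outputs, can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that dipeptide frequencies are relative counts of adjacent residue pairs over the 20 standard amino acids (a 20 x 20 matrix flattened to 400 values), that the distance is Euclidean, and that novelty is flagged by a fixed distance threshold. The function names, the reference vectors, and the threshold are all hypothetical, and the trained network itself is omitted.

```python
import math
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
# Map each ordered residue pair to a slot in the flattened 20 x 20 matrix.
DIPEPTIDE_INDEX = {
    a + b: i for i, (a, b) in enumerate(product(AMINO_ACIDS, repeat=2))
}


def dipeptide_frequencies(sequence):
    """Return 400 relative frequencies of adjacent residue pairs.

    Assumption: pairs containing non-standard residues are skipped.
    """
    counts = [0] * len(DIPEPTIDE_INDEX)
    total = 0
    seq = sequence.upper()
    for a, b in zip(seq, seq[1:]):
        idx = DIPEPTIDE_INDEX.get(a + b)
        if idx is not None:
            counts[idx] += 1
            total += 1
    if total == 0:
        return [0.0] * len(counts)
    return [c / total for c in counts]


def nearest_fold(output, references):
    """Return (label, distance) of the reference vector closest to `output`.

    `references` is a list of (label, vector) pairs, e.g. known fold
    representatives in the output space (hypothetical data).
    """
    label, vector = min(references, key=lambda r: math.dist(output, r[1]))
    return label, math.dist(output, vector)


def is_novel(output, references, threshold):
    """Flag an output as a candidate novel topology.

    Assumption: an output farther than `threshold` from every known
    fold is treated as novel; the threshold is a free parameter here.
    """
    _, distance = nearest_fold(output, references)
    return distance > threshold
```

In this sketch the nearest-neighbour distance doubles as a crude confidence score: a small distance suggests a confident assignment to a known fold, while a large one marks a sequence for closer inspection.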