A neural network classification method has been developed as an alternative approach to the search/organization problem of protein sequence databases. The neural networks used are three-layered feed-forward networks trained with back-propagation. The protein sequences are encoded into neural input vectors by a hashing method that counts occurrences of n-gram words. A new SVD (singular value decomposition) method, which compresses the long and sparse n-gram input vectors and captures the semantics of n-gram words, has improved the generalization capability of the network. A full-scale protein classification system has been implemented on a Cray supercomputer to classify unknown sequences into 3311 PIR (Protein Identification Resource) superfamilies/families in less than 0.05 CPU seconds per sequence. The sensitivity is close to 90% overall, and approaches 100% for large superfamilies. The system could be used to reduce the database search time and is being used to help organize the PIR protein sequence database.
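The encoding pipeline described above (n-gram counting followed by SVD compression) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the alphabet, n-gram length, scaling, or number of retained singular vectors, so the 20-letter amino acid alphabet, bigrams, `k=2`, the toy sequences, and the helper names `ngram_index`, `encode`, and `fit_svd` are all assumptions chosen for illustration.

```python
from itertools import product

import numpy as np

# Assumption: the 20 standard amino acid letters; the paper may use other alphabets.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def ngram_index(alphabet, n):
    """Map every possible n-gram over `alphabet` to a fixed vector position."""
    return {"".join(p): i for i, p in enumerate(product(alphabet, repeat=n))}


def encode(seq, index, n):
    """Count n-gram occurrences in `seq`; n-grams with unknown letters are skipped."""
    vec = np.zeros(len(index))
    for i in range(len(seq) - n + 1):
        pos = index.get(seq[i:i + n])
        if pos is not None:
            vec[pos] += 1.0
    return vec


def fit_svd(X, k):
    """Return the top-k right singular vectors of X (rows are sequences)."""
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    return vt[:k].T  # shape: (n_features, k)


# Hypothetical toy sequences, not taken from the PIR database.
seqs = ["MKVLAAGIVGL", "MKVLSAGIVAL", "GSSHHHHHHSSG"]
n = 2
index = ngram_index(AMINO_ACIDS, n)                 # 20**2 = 400 bigram slots
X = np.array([encode(s, index, n) for s in seqs])   # long, sparse count vectors
basis = fit_svd(X, k=2)                             # compression basis from training data
reduced = X @ basis                                 # compact inputs for the network

# An unseen sequence is projected with the same basis before classification.
query = encode("MKVLAAGIVAL", index, n) @ basis
```

In this sketch the 400-dimensional bigram counts shrink to 2 values per sequence, and those reduced vectors would serve as inputs to the feed-forward network. Projecting unseen sequences with the basis fitted on the training set is a design choice of the sketch; the abstract does not say how the original system handles new sequences.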