Mathematical and Computer Modelling: An International Journal
Database search for molecular sequence homologies is the most direct computational approach to deciphering protein structure and function from gene sequences. However, the rapid accumulation of sequence data has made such searches increasingly difficult with traditional algorithms and pattern-matching software. Our approach to this problem is to develop a domain artificial neural system (ANS) for gene classification by combining a new database design with neural network theory. A new database, consisting only of identifiable protein domain classes, should be created to reduce the size of the molecular database to be searched. Each entry of the database should contain the domain consensus sequence and other domain features, both of which can be compiled from the NBRF-PIR protein sequence database. The domain database would then be embedded in a neural network so that the search problem is replaced by a pattern recognition problem. The domain ANS would be a three-layered network trained with the back-propagation learning algorithm. The system takes two inputs: the sequence string, which is mapped onto 400 units using a hashing function, and the sequence features, which are mapped onto 36 units. The outputs of the system are identification tags for each of the domain classes. A prototype domain ANS is being developed to map the consensus sequence and sequence features of each of the 41 domains in the training sets to its domain class. Once trained, the domain ANS would allow easy, rapid gene classification for both intra-class and inter-class protein domains.
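The input encoding and three-layered network described above can be illustrated with a minimal sketch. The abstract does not specify the hashing function, so the sketch assumes one plausible choice: normalized counts of adjacent residue pairs (20 x 20 = 400 slots). The hidden-layer size, learning rate, weight range, and every name in the code (`hash_sequence`, `encode`, `ThreeLayerNet`) are likewise hypothetical and are not taken from the paper; only the 400 sequence units, the 36 feature units, and the use of back-propagation come from the abstract.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"   # the 20 standard residues
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
N_SEQ_UNITS = 400       # sequence units, as stated in the abstract
N_FEATURE_UNITS = 36    # feature units, as stated in the abstract


def hash_sequence(seq):
    """Map a sequence onto 400 units.

    ASSUMPTION: the hash is the normalized count of each adjacent
    residue pair; the abstract does not describe the real function.
    """
    units = [0.0] * N_SEQ_UNITS
    pairs = 0
    for a, b in zip(seq, seq[1:]):
        if a in AA_INDEX and b in AA_INDEX:   # skip non-standard symbols
            units[AA_INDEX[a] * 20 + AA_INDEX[b]] += 1.0
            pairs += 1
    return [u / pairs for u in units] if pairs else units


def encode(seq, features):
    """Concatenate the 400 sequence units and the 36 feature units."""
    if len(features) != N_FEATURE_UNITS:
        raise ValueError("expected 36 feature values")
    return hash_sequence(seq) + list(features)


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class ThreeLayerNet:
    """Input, hidden, and output layers trained by back-propagation."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)
        # Each row holds the weights into one unit; the last entry is a bias.
        self.w1 = [[rng.uniform(-0.3, 0.3) for _ in range(n_in + 1)]
                   for _ in range(n_hidden)]
        self.w2 = [[rng.uniform(-0.3, 0.3) for _ in range(n_hidden + 1)]
                   for _ in range(n_out)]

    def forward(self, x):
        xb = list(x) + [1.0]
        h = [sigmoid(sum(w * v for w, v in zip(row, xb))) for row in self.w1]
        hb = h + [1.0]
        y = [sigmoid(sum(w * v for w, v in zip(row, hb))) for row in self.w2]
        return h, y

    def train_step(self, x, target, lr=0.5):
        """One online gradient step on squared error; returns the error
        measured before the weights are updated."""
        h, y = self.forward(x)
        d_out = [(t - o) * o * (1.0 - o) for t, o in zip(target, y)]
        # Hidden deltas use the output weights as they were before the update.
        d_hid = [hj * (1.0 - hj) *
                 sum(d_out[k] * self.w2[k][j] for k in range(len(d_out)))
                 for j, hj in enumerate(h)]
        for k, row in enumerate(self.w2):
            for j, v in enumerate(h + [1.0]):
                row[j] += lr * d_out[k] * v
        for j, row in enumerate(self.w1):
            for i, v in enumerate(list(x) + [1.0]):
                row[i] += lr * d_hid[j] * v
        return sum((t - o) ** 2 for t, o in zip(target, y))
```

Under these assumptions, a network for the prototype would be built as `ThreeLayerNet(436, n_hidden, 41)`, with one output unit per domain class serving as its identification tag, and trained by repeatedly calling `train_step(encode(consensus_sequence, features), one_hot_tag)` for each of the 41 domains. Classifying a new sequence then costs one forward pass rather than a search over the database.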