This paper proposes a hybrid algorithm that combines characteristics of Genetic Programming (GP) and Genetic Algorithms (GAs) to discover motifs in proteins and to predict the proteins' functional classes from the discovered motifs. Individuals are represented as IF-THEN classification rules. The rule antecedent is a combination of motifs automatically extracted from protein sequences, and the rule consequent is the functional class predicted for any protein whose sequence satisfies that combination. The system can be used in two ways. First, as a stand-alone classification system, in which the evolved classification rules directly predict the functional classes of proteins. Second, as an "attribute construction" method, in which the discovered motifs are given, as predictor attributes, to another classification algorithm; in this usage, a classical decision tree induction algorithm was used as the classifier. The proposed system was evaluated in both scenarios and compared with another Genetic Algorithm designed specifically for motif discovery, which is therefore used only as an attribute construction algorithm. The comparison was performed by mining an enzyme data set extracted from the Protein Data Bank. The best results were obtained when the proposed hybrid GP/GA was used as an attribute construction algorithm and the classification, using the constructed attributes, was performed by the decision tree induction algorithm.
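The two usages described above can be illustrated with a minimal Python sketch. It is not the paper's implementation. It assumes that a motif is a plain substring and that a rule antecedent is a conjunction of motifs, neither of which the abstract specifies. The names `Rule`, `classify`, and `motif_attributes`, the toy sequences, and the class labels are all hypothetical, and the evolutionary search that would produce the rules is omitted.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class Rule:
    """IF every motif occurs in the sequence THEN predict predicted_class."""
    motifs: Tuple[str, ...]   # antecedent; plain substrings (an assumption)
    predicted_class: str      # consequent; a functional class label

    def covers(self, sequence: str) -> bool:
        # Conjunctive antecedent (an assumption): all motifs must be present.
        return all(motif in sequence for motif in self.motifs)


def classify(rules: Sequence[Rule], sequence: str,
             default: Optional[str] = None) -> Optional[str]:
    """Stand-alone usage: predict with the first rule that covers the sequence."""
    for rule in rules:
        if rule.covers(sequence):
            return rule.predicted_class
    return default


def motif_attributes(motifs: Sequence[str], sequence: str) -> List[int]:
    """Attribute-construction usage: one binary attribute per motif."""
    return [int(motif in sequence) for motif in motifs]


# Hypothetical rules and sequences, for illustration only.
rules = [Rule(("GDS", "HG"), "class_A"), Rule(("KK",), "class_B")]
print(classify(rules, "MAGDSLLHGT"))
print(motif_attributes(["GDS", "HG", "KK"], "MAGDSLLHGT"))
```

In the attribute-construction usage, the binary vectors returned by `motif_attributes` would form the rows of a training table for a separate classifier, such as a decision tree learner.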