Protein secondary structure prediction using rule induction from covering

Authors:
Leong Lee;Jennifer L. Leopold;Ronald L. Frank;Anne M. Maglia
Affiliations:
Department of Computer Science, Missouri University of Science and Technology, Rolla, MO;Department of Computer Science, Missouri University of Science and Technology, Rolla, MO;Department of Biological Sciences, Missouri University of Science and Technology, Rolla, MO;Department of Biological Sciences, Missouri University of Science and Technology, Rolla, MO
Venue:
CIBCB'09 Proceedings of the 6th Annual IEEE conference on Computational Intelligence in Bioinformatics and Computational Biology
Year:
2009

Abstract

As genome sequencing projects produce more data, the need grows for reliable and efficient methods to analyze and classify protein motifs and domains. The experimental methods currently used to determine protein structure are accurate but expensive in both time and equipment. Various computational approaches to the problem have therefore been attempted, although their accuracy has rarely exceeded 75%. This paper presents a rule-based method for predicting protein secondary structure. The method uses a newly developed data-mining algorithm called RT-RICO (Relaxed Threshold Rule Induction from Coverings), which identifies dependencies between amino acids in a protein sequence and generates rules that can be used to predict secondary structures. On sample data sets, RT-RICO achieved an average prediction accuracy (Q3 score) of 80.3%, an improvement over comparable computational methods.
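For readers unfamiliar with the Q3 score mentioned in the abstract, the following minimal sketch shows how such a score is commonly computed: the percentage of residues whose three-state label (helix H, strand E, coil C) is predicted correctly. The function name `q3_score` and the example sequences are hypothetical and chosen only for illustration; they are not taken from the paper, and the sketch does not reproduce RT-RICO itself.

```python
def q3_score(predicted: str, observed: str) -> float:
    """Return the percentage of residues with a correctly predicted
    three-state secondary structure label (H, E, or C)."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must be non-empty")
    # Count positions where the predicted label matches the observed one.
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)


# Hypothetical per-residue labels for one short protein (illustrative only).
observed = "CCHHHHHHCCEEEEECC"
predicted = "CCHHHHHCCCEEEECCC"

print(f"Q3 = {q3_score(predicted, observed):.1f}%")
```

An average over a data set, as reported in the abstract, could then be taken either per protein or over all residues pooled together; the abstract does not say which convention was used.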