Motif Extraction and Protein Classification

Authors:
Vered Kunik;Zach Solan;Shimon Edelman;Eytan Ruppin;David Horn
Affiliations:
Tel Aviv University;Tel Aviv University;Cornell University;Tel Aviv University;Tel Aviv University
Venue:
CSB '05 Proceedings of the 2005 IEEE Computational Systems Bioinformatics Conference
Year:
2005

Abstract

We present a novel unsupervised method for extracting meaningful motifs from biological sequence data. This de novo motif extraction (MEX) algorithm is data driven and finds motifs that are not necessarily over-represented in the data. Applying MEX to the oxidoreductase class of enzymes, which contains approximately 7000 enzyme sequences, yields a relatively small set of motifs. This set spans a motif space that an SVM classifier uses for functional classification of the enzymes. Classification based on MEX motifs outperforms two other SVM-based methods: SVMProt, which analyzes physicochemical properties of a protein derived from its amino acid sequence, and an SVM applied to a Smith-Waterman distance matrix. Our findings demonstrate that the MEX algorithm extracts relevant motifs that support successful sequence-to-function classification.
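
The idea of a motif space can be illustrated with a minimal sketch. It assumes, purely for illustration, that each sequence is represented by a binary vector marking which motifs occur in it; the abstract does not specify the exact encoding, and the motif extraction step itself (MEX) is not shown. The function names, motifs, and sequences below are hypothetical toy values, not data from the paper.

```python
from typing import List, Sequence


def motif_features(sequence: str, motifs: Sequence[str]) -> List[int]:
    """Map one sequence to a binary vector over the motif set.

    Assumption: a coordinate is 1 if the motif occurs anywhere in the
    sequence as an exact substring, and 0 otherwise.
    """
    return [1 if motif in sequence else 0 for motif in motifs]


def embed(sequences: Sequence[str], motifs: Sequence[str]) -> List[List[int]]:
    """Embed every sequence in the space spanned by the motifs."""
    return [motif_features(seq, motifs) for seq in sequences]


# Hypothetical toy motifs and sequences, for illustration only.
toy_motifs = ["GAGA", "CPH", "WDK"]
toy_sequences = ["MKGAGAWDKL", "MCPHTT", "AAAA"]

feature_matrix = embed(toy_sequences, toy_motifs)
for seq, row in zip(toy_sequences, feature_matrix):
    print(seq, row)
```

Each row of `feature_matrix` is one point in the motif space. Together with functional class labels, such rows could be passed to any SVM implementation for training.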