In recent years there has been growing interest in discovering significant patterns in biological sequences, known as motifs, that correspond to some structural and/or functional feature of the biomolecule. Motif discovery has important applications in determining regulatory sites, splice sites, and promoter sequences, and in drug target identification. Identifying a motif is challenging because it occurs in different sequences in various mutated forms. Despite extensive study using statistical, exhaustive, and heuristic approaches, among others, the problem is far from satisfactorily solved. In this paper, we consider the planted (l,d) motif search problem in a given set of DNA sequences using a kernel-based approach. The proposed kernel is evaluated on synthetic data and on real data sets from different organisms such as yeast and worm. The results on these data sets indicate improved performance of the proposed kernel, which allows classification of DNA sequences with larger motif lengths.
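To make the planted (l,d) setting concrete, the following minimal sketch shows one way synthetic data of this kind might be generated: a hidden motif of length l is planted once in each random DNA sequence, with d positions mutated. It does not reproduce the kernel or the data sets used in the paper. The function names (`mutate`, `plant_motif`, `hamming`), the uniform background model, and the choice of exactly d mutations per occurrence (some formulations allow up to d) are all assumptions made for illustration.

```python
import random

ALPHABET = "ACGT"


def mutate(motif, d, rng):
    """Return a copy of motif with exactly d positions changed to a different base."""
    chars = list(motif)
    for pos in rng.sample(range(len(motif)), d):
        chars[pos] = rng.choice([b for b in ALPHABET if b != chars[pos]])
    return "".join(chars)


def plant_motif(num_seqs, seq_len, l, d, seed=0):
    """Build random DNA sequences, each holding one mutated copy of a hidden motif.

    Assumption: background bases are drawn uniformly and independently.
    """
    rng = random.Random(seed)
    motif = "".join(rng.choice(ALPHABET) for _ in range(l))
    sequences, positions = [], []
    for _ in range(num_seqs):
        background = [rng.choice(ALPHABET) for _ in range(seq_len)]
        start = rng.randrange(seq_len - l + 1)  # keep the motif inside the sequence
        background[start:start + l] = mutate(motif, d, rng)
        sequences.append("".join(background))
        positions.append(start)
    return motif, sequences, positions


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


if __name__ == "__main__":
    # Illustrative parameter values only.
    motif, seqs, starts = plant_motif(num_seqs=5, seq_len=60, l=8, d=2, seed=1)
    for seq, start in zip(seqs, starts):
        occurrence = seq[start:start + len(motif)]
        print(start, occurrence, hamming(occurrence, motif))
```

Each planted occurrence differs from the hidden motif in d positions, so two occurrences can differ from each other in up to 2d positions. This is why the mutated forms mentioned above make the motif hard to recover by direct comparison of sequences.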