Planted (l,d) motif finding with allowable mismatches using kernel based approach

  • Authors: Anjali Mohapatra; P. M. Mishra; S. Padhy
  • Affiliations: IIIT Bhubaneswar, Odisha, India; Govt. of Odisha, India; IIIT Bhubaneswar, Odisha, India
  • Venue: Proceedings of the 2011 International Conference on Communication, Computing & Security
  • Year: 2011

Abstract

In recent years there has been growing interest in discovering significant patterns in biological sequences, known as motifs, that correspond to structural and/or functional features of a biomolecule. Motif discovery has important applications in determining regulatory sites, splice sites, and promoter sequences, and in drug target identification. Identifying a motif is challenging because it occurs in different sequences in various mutated forms. Despite extensive study over the last few years using statistical, exhaustive, heuristic, and other approaches, the problem is far from satisfactorily solved. In this paper, we consider the planted (l,d) motif search problem, in which a motif of length l occurs in each sequence with at most d mismatches, for a given set of DNA sequences using a kernel-based approach. The proposed kernel is evaluated on synthetic data and on real data sets from different organisms such as yeast and worm. The results on these datasets indicate improved performance of the proposed kernel, which allows classification of DNA sequences with larger motif lengths.
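
To make the problem statement concrete, the following is a minimal brute-force sketch of the planted (l,d) motif search problem in Python. It is not the kernel-based method of the paper, whose details the abstract does not give. The function names (hamming, occurs_with_mismatches, planted_motifs) and the toy sequences are hypothetical, chosen only for illustration. The sketch enumerates every candidate of length l over the DNA alphabet and keeps those that occur in every input sequence with at most d mismatches.

```python
from itertools import product


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def occurs_with_mismatches(motif, seq, d):
    """True if some window of seq is within Hamming distance d of motif."""
    l = len(motif)
    return any(
        hamming(motif, seq[i:i + l]) <= d
        for i in range(len(seq) - l + 1)
    )


def planted_motifs(seqs, l, d):
    """Illustrative brute force (an assumption, not the paper's kernel):
    return every l-mer over ACGT found in all sequences with <= d mismatches."""
    hits = []
    for letters in product("ACGT", repeat=l):
        candidate = "".join(letters)
        if all(occurs_with_mismatches(candidate, s, d) for s in seqs):
            hits.append(candidate)
    return hits


if __name__ == "__main__":
    # Toy data: each sequence holds a variant of ACGT with at most one mismatch.
    toy = ["TTACGTAA", "GGACCTCC", "CAAGGTTT"]
    print("ACGT" in planted_motifs(toy, l=4, d=1))
```

The enumeration visits 4^l candidates, so it is practical only for small l. This exponential cost is one reason the exhaustive approaches mentioned in the abstract struggle with larger motif lengths.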