PRUNER: Algorithms for Finding Monad Patterns in DNA Sequences

Authors:
Ravi VijayaSatya;Amar Mukherjee
Affiliations:
University of Central Florida;University of Central Florida
Venue:
CSB '04 Proceedings of the 2004 IEEE Computational Systems Bioinformatics Conference
Year:
2004

Citing 3
Cited 0

Finding motifs using random projections

RECOMB '01 Proceedings of the fifth annual international conference on Computational biology
Combinatorial Approaches to Finding Subtle Signals in DNA Sequences

Proceedings of the Eighth International Conference on Intelligent Systems for Molecular Biology
cWINNOWER Algorithm for Finding Fuzzy DNA Motifs

CSB '03 Proceedings of the IEEE Computer Society Conference on Bioinformatics

Quantified Score

Hi-index	0.00

Visualization

Abstract

We present new algorithms for discovering monad patterns in DNA sequences. Monad patterns are of the form (l, d)-k, where l is the length of the pattern, d is the maximum number of mismatches allowed, and k is the minimum number of times the pattern is repeated in the given sample. The time-complexity of some of the best known algorithms to date is O(nt^2 l^d \left| \sum\right|^d ), where t is the number of input sequences, and n is the length of each input sequence. The first algorithm that we present takes O(n^2 t^2 l^{{d \mathord{\left/ {\vphantom {d 2}} \right. \kern-\nulldelimiterspace} 2}}\left| \Sigma\right|^{{d \mathord{\left/ {\vphantom {d 2}} \right. \kern-\nulldelimiterspace} 2}} ) and space 0(ntl^{{d \mathord{\left/ {\vphantom {d 2}} \right. \kern-\nulldelimiterspace} 2}} \left| \Sigma\right|^{{d \mathord{\left/ {\vphantom {d 2}} \right. \kern-\nulldelimiterspace} 2}} ), and the second algorithm takes 0(n^3 t^3 l^{{d \mathord{\left/ {\vphantom {d 2}} \right. \kern-\nulldelimiterspace} 2}} \left| \Sigma\right|^{{d \mathord{\left/ {\vphantom {d 2}} \right. \kern-\nulldelimiterspace} 2}} ) time using 0(l^{{d \mathord{\left/ {\vphantom {d 2}} \right. \kern-\nulldelimiterspace} 2}} \left| \Sigma\right|^{{d \mathord{\left/ {\vphantom {d 2}} \right. \kern-\nulldelimiterspace} 2}} ) space. In practice, our algorithms have much better performance provided the d/l ratio is small.