We propose a new algorithm for identifying cis-regulatory modules in genomic sequences. The algorithm, named RISO, uses a new data structure, called box-link, to store information about conserved regions that occur in a well-ordered and regularly spaced manner in the sequences of the data set. Conserved regions of this type, called structured motifs, are highly relevant to research on gene regulatory mechanisms because they can effectively represent promoter models. The complexity analysis shows a gain in time and space over the best known exact algorithms that is exponential in the spacings between binding sites. A full implementation of the algorithm was developed and made available online. Experimental results show that the algorithm is much faster than existing ones, sometimes by more than four orders of magnitude. Applying the method to biological data sets shows its ability to extract relevant consensus sequences.
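To make the notion of a structured motif concrete, the following minimal sketch checks by brute force whether an ordered list of boxes, separated by bounded spacings, occurs in a sequence. It is not RISO and does not use the box-link structure; it is the naive check that an exact algorithm must improve on. The function names, the use of Hamming distance for box mismatches, the `(dmin, dmax)` spacing ranges, and the quorum count are assumptions made for illustration only.

```python
from typing import List, Tuple


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def occurs(seq: str, boxes: List[str], gaps: List[Tuple[int, int]],
           max_err: int) -> bool:
    """True if the boxes occur in order in seq, each with at most max_err
    mismatches, and with gaps[i] = (dmin, dmax) bounding the number of
    characters between box i and box i + 1 (illustrative model only)."""

    def match_from(pos: int, i: int) -> bool:
        box = boxes[i]
        end = pos + len(box)
        # Reject if the box runs past the sequence or mismatches too much.
        if end > len(seq) or hamming(seq[pos:end], box) > max_err:
            return False
        if i == len(boxes) - 1:
            return True
        dmin, dmax = gaps[i]
        # Try every allowed spacing before the next box.
        return any(match_from(end + d, i + 1) for d in range(dmin, dmax + 1))

    return any(match_from(p, 0) for p in range(len(seq)))


def quorum(seqs: List[str], boxes: List[str], gaps: List[Tuple[int, int]],
           max_err: int) -> int:
    """Number of sequences in which the structured motif occurs."""
    return sum(occurs(s, boxes, gaps, max_err) for s in seqs)


# Hypothetical toy data: two boxes separated by 4 to 6 characters.
seqs = [
    "CCTTGACAGGGGGTATAATC",       # exact boxes, spacing 5
    "TTGACTAAAATATAAT",           # one mismatch in the first box, spacing 4
    "TTGACAGGGGGGGGGGTATAAT",     # exact boxes, spacing 10 (too far apart)
]
boxes = ["TTGACA", "TATAAT"]
gaps = [(4, 6)]
```

In this sketch each candidate start position is re-examined for every allowed spacing, so the work grows with the width of the spacing ranges. That growth is the kind of cost the spacing-dependent gain described above refers to.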