We consider the problem of finding the optimal combination of string patterns that characterizes a given set of strings, each of which is assigned a numeric attribute value. Pattern combinations are scored based on the correlation between their occurrences in the strings and the numeric attribute values. The aim is to find the combination of patterns that is best with respect to an appropriate scoring function. We present an $O(N^2)$ time algorithm for finding the optimal pair of substring patterns combined with Boolean functions, where $N$ is the total length of the sequences. The algorithm considers all possible Boolean combinations of the patterns, e.g., patterns of the form $p \land \lnot q$, which indicates that the pattern pair is considered to occur in a given string $s$ if $p$ occurs in $s$ AND $q$ does NOT occur in $s$. An efficient implementation using suffix arrays is presented, and we further show that the algorithm can be adapted to find the best $k$-pattern Boolean combination in $O(N^k)$ time. The algorithm is applied to mRNA sequence data sets of moderate size combined with their turnover rates for the purpose of finding regulatory elements that cooperate, complement, or compete with each other in enhancing and/or silencing mRNA decay.
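To make the problem statement concrete, the following Python sketch enumerates Boolean pattern pairs by brute force. It is not the $O(N^2)$ suffix-array algorithm described above: it simply tries every ordered pair of distinct substrings, so it is only practical for very small inputs. The scoring function (absolute difference between the mean attribute values of matched and unmatched strings), the restriction to three Boolean functions, and all function names are assumptions chosen for illustration.

```python
from itertools import product


def substrings(strings):
    """Collect every distinct non-empty substring of the input strings."""
    subs = set()
    for s in strings:
        for i in range(len(s)):
            for j in range(i + 1, len(s) + 1):
                subs.add(s[i:j])
    return subs


# Illustrative subset of Boolean functions over (p occurs, q occurs).
# Because ordered pairs (p, q) are enumerated, "NOT p AND q" is covered
# by swapping p and q.
BOOLEAN_FUNCTIONS = {
    "{p} AND {q}": lambda a, b: a and b,
    "{p} AND NOT {q}": lambda a, b: a and not b,
    "{p} OR {q}": lambda a, b: a or b,
}


def score(matched, values):
    """Assumed score: |mean(values of matched) - mean(values of unmatched)|.

    Returns 0.0 when either group is empty, since no contrast exists.
    """
    inside = [v for m, v in zip(matched, values) if m]
    outside = [v for m, v in zip(matched, values) if not m]
    if not inside or not outside:
        return 0.0
    return abs(sum(inside) / len(inside) - sum(outside) / len(outside))


def best_pattern_pair(strings, values):
    """Return (best score, description) over all pattern pairs and functions.

    Ties are resolved in favour of the first combination encountered.
    """
    patterns = sorted(substrings(strings))
    # Occurrence vector of each pattern: does it occur in each string?
    occ = {p: [p in s for s in strings] for p in patterns}
    best = (float("-inf"), None)
    for p, q in product(patterns, repeat=2):
        for template, func in BOOLEAN_FUNCTIONS.items():
            matched = [func(a, b) for a, b in zip(occ[p], occ[q])]
            current = score(matched, values)
            if current > best[0]:
                best = (current, template.format(p=p, q=q))
    return best


if __name__ == "__main__":
    # Toy data: only the string "a" has a high attribute value.
    toy_strings = ["ab", "a", "b", "c"]
    toy_values = [1.0, 5.0, 1.0, 1.0]
    print(best_pattern_pair(toy_strings, toy_values))
```

On the toy data, no single pattern isolates the high-valued string, but a pair of the form $p \land \lnot q$ does, which is the kind of competing relationship the abstract describes.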