We consider the problem of discovering the optimal pattern from a set of strings and associated numeric attribute values. The goodness of a pattern is measured by the correlation between the number of occurrences of the pattern in each string and the numeric attribute value assigned to that string. We present two suffix-tree-based algorithms that find the optimal substring pattern in O(Nn) and O(N^2) time, respectively, where n is the number of strings and N is their total length. We further present a general branch-and-bound strategy that can be used for more complex pattern classes. We also show that combining the O(N^2) algorithm with the branch-and-bound heuristic considerably increases its efficiency.
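
To make the problem statement concrete, the following is a minimal brute-force sketch of the scoring setup, not the suffix-tree algorithms described above. It assumes the correlation is the Pearson coefficient and that patterns are ranked by its absolute value; both choices, and all function names, are illustrative assumptions rather than details taken from the paper. It enumerates every distinct substring directly, so it is far slower than the stated O(Nn) and O(N^2) bounds and is only meant for small inputs.

```python
from math import sqrt


def count_occurrences(pattern, text):
    """Count (possibly overlapping) occurrences of pattern in text."""
    count = 0
    start = text.find(pattern)
    while start != -1:
        count += 1
        start = text.find(pattern, start + 1)
    return count


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def best_substring_pattern(strings, values):
    """Brute force: score every distinct substring and keep the best one.

    Assumption: 'best' means the largest absolute Pearson correlation
    between per-string occurrence counts and the numeric values.
    """
    candidates = {
        s[i:j]
        for s in strings
        for i in range(len(s))
        for j in range(i + 1, len(s) + 1)
    }
    best_pattern, best_score = None, -1.0
    for pattern in sorted(candidates):  # sorted for deterministic tie-breaking
        counts = [count_occurrences(pattern, s) for s in strings]
        score = abs(pearson(counts, values))
        if score > best_score:
            best_pattern, best_score = pattern, score
    return best_pattern, best_score


if __name__ == "__main__":
    strings = ["abab", "ab", "ba", "cc"]
    values = [2.0, 1.0, 0.0, 0.0]
    print(best_substring_pattern(strings, values))
```

In this toy input the occurrence counts of "ab" are 2, 1, 0, 0, which match the attribute values exactly, so "ab" is the pattern the sketch selects. A suffix tree avoids the redundant work here because many substrings share identical occurrence-count vectors and need to be scored only once.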