Theoretical Computer Science
Finding a pattern that separates two sets is a critical task in discovery. Given two sets of strings, consider the problem of finding a subsequence that is common to all strings in one set but never appears in any string of the other set. This problem is known to be NP-complete. An episode pattern generalizes a subsequence pattern by bounding the length of the substring that contains the subsequence. We generalize these problems to optimization problems and give practical algorithms that solve them exactly. Our algorithms use pruning heuristics based on the combinatorial properties of strings, together with efficient data structures that recognize subsequence and episode patterns.
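As an illustration of the problem setting only, the following Python sketch shows subsequence matching, episode matching with a window bound, and a small exact search for the pattern that best separates two string sets. It is not the algorithm of the paper: the function names, the score (number of correctly classified strings), the length cap `max_len`, and the simple pruning bound are all assumptions made for this example. The pruning relies on one combinatorial fact: extending a pattern can never increase the number of strings it matches.

```python
def is_subsequence(pattern, text):
    """True if the characters of pattern occur in text in order, gaps allowed."""
    it = iter(text)
    return all(ch in it for ch in pattern)


def matches_episode(pattern, text, k):
    """True if some substring of text of length at most k contains pattern
    as a subsequence (the episode condition, with k as the window bound)."""
    if not pattern:
        return True
    for start, ch in enumerate(text):
        # A minimal window must begin with the first pattern character.
        if ch == pattern[0] and is_subsequence(pattern, text[start:start + k]):
            return True
    return False


def best_subsequence_pattern(pos, neg, alphabet, max_len):
    """Exhaustive search with pruning for the pattern of length <= max_len
    that maximizes (positives matched) + (negatives not matched).
    Hypothetical scoring and bound, chosen for illustration."""
    best = ("", len(pos))  # the empty pattern matches every string

    def evaluate(p):
        hit_pos = sum(is_subsequence(p, s) for s in pos)
        hit_neg = sum(is_subsequence(p, s) for s in neg)
        return hit_pos, hit_pos + (len(neg) - hit_neg)

    def extend(p):
        nonlocal best
        for ch in alphabet:
            q = p + ch
            hit_pos, score = evaluate(q)
            if score > best[1]:
                best = (q, score)
            # Any extension of q matches at most hit_pos positives, so its
            # score is at most hit_pos + len(neg); skip hopeless branches.
            if len(q) < max_len and hit_pos + len(neg) > best[1]:
                extend(q)

    extend("")
    return best


pos = ["abcab", "acb", "bacb"]
neg = ["ba", "cca", "bbb"]
print(best_subsequence_pattern(pos, neg, "abc", 3))  # ('ab', 6)
print(matches_episode("ab", "axxb", 2))              # False
print(matches_episode("ab", "axxb", 4))              # True
```

Here the pattern `ab` occurs as a subsequence in every positive string and in no negative string, so it classifies all six strings correctly. The episode check shows the effect of the window bound: `ab` is a subsequence of `axxb`, but only within a window of length 4, not 2.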