Theoretical Computer Science
Given two sets of strings, consider the problem of finding a subsequence that is common to one set but never appears in the other. We regard this as finding a subsequence pattern that separates the two sets. The problem is known to be NP-complete. We generalize it naturally to an optimization problem, in which we seek a subsequence pattern that maximally separates the two sets. We provide a practical algorithm that solves it exactly. Our algorithm uses two pruning heuristics based on the properties of subsequence languages and a data structure called the subsequence automaton. We report experimental results showing that these heuristics and the data structure help reduce the search time.
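The optimization problem described above can be illustrated with a small exhaustive search with pruning. The following Python sketch is not the algorithm of the paper: the function names, the score |x/m - y/n| (the difference between the fraction of strings matched in each set), and the single pruning bound are assumptions chosen for illustration, and the subsequence automaton is replaced by a direct subsequence test. The pruning rests on one property of subsequence languages: extending a pattern can never increase the number of strings that contain it.

```python
def is_subsequence(pattern, text):
    """Return True if pattern can be obtained from text by deleting characters."""
    it = iter(text)
    # Each 'ch in it' consumes the iterator up to the first match,
    # so the characters of pattern must appear in order.
    return all(ch in it for ch in pattern)


def count_matches(pattern, strings):
    """Number of strings that contain pattern as a subsequence."""
    return sum(is_subsequence(pattern, s) for s in strings)


def score(x, y, m, n):
    """Assumed separation score: difference of the matched fractions."""
    return abs(x / m - y / n)


def best_subsequence(pos, neg):
    """Search for a pattern maximizing the assumed score.

    Both sets must be non-empty. Returns (pattern, score).
    """
    m, n = len(pos), len(neg)
    alphabet = sorted(set("".join(pos) + "".join(neg)))
    best_pattern, best_score = "", 0.0  # the empty pattern matches everything
    stack = [""]
    while stack:
        p = stack.pop()
        for ch in alphabet:
            q = p + ch
            x, y = count_matches(q, pos), count_matches(q, neg)
            if x == 0 and y == 0:
                continue  # no extension of q can match anything either
            s = score(x, y, m, n)
            if s > best_score:
                best_pattern, best_score = q, s
            # Any extension of q matches at most x and y strings, so its
            # score is at most max(x/m, y/n). Expand q only if that bound
            # can still beat the best score found so far.
            if max(score(x, 0, m, n), score(0, y, m, n)) > best_score:
                stack.append(q)
    return best_pattern, best_score


if __name__ == "__main__":
    pos = ["abcab", "acb", "aab"]
    neg = ["bca", "cba", "bbc"]
    print(best_subsequence(pos, neg))
```

A score of 1.0 means the returned pattern occurs in every string of one set and in no string of the other, which is the separation case of the original decision problem. The search is exponential in the worst case, as expected for an NP-complete problem; the bound only reduces how much of the pattern space is visited.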