Mining Minimal Distinguishing Subsequence Patterns with Gap Constraints
ICDM '05 Proceedings of the Fifth IEEE International Conference on Data Mining
Mining minimal distinguishing subsequence patterns with gap constraints
Knowledge and Information Systems
Efficient String Mining under Constraints Via the Deferred Frequency Index
ICDM '08 Proceedings of the 8th industrial conference on Advances in Data Mining: Medical Applications, E-Commerce, Marketing, and Theoretical Aspects
Using emerging subsequence in classifying protein structural class
FSKD'09 Proceedings of the 6th international conference on Fuzzy systems and knowledge discovery - Volume 1
Discovery of distinctive patterns in music
Intelligent Data Analysis - Machine Learning and Music
Fast q-gram mining on SLP compressed strings
SPIRE'11 Proceedings of the 18th international conference on String processing and information retrieval
Optimal string mining under frequency constraints
PKDD'06 Proceedings of the 10th European conference on Principle and Practice of Knowledge Discovery in Databases
Distributed string mining for high-throughput sequencing data
WABI'12 Proceedings of the 12th international conference on Algorithms in Bioinformatics
Fast q-gram mining on SLP compressed strings
Journal of Discrete Algorithms
We introduce a new type of KDD pattern called emerging substrings. In a sequence database, an emerging substring (ES) of a data class is a substring that occurs more frequently in that class than in other classes. ESs are important to sequence classification because they capture significant contrasts between data classes and provide insights for the construction of sequence classifiers. We propose a suffix tree-based framework for mining ESs, and study the effectiveness of applying one or more pruning techniques in different stages of our ES mining algorithm. Experimental results show that if the target class is small relative to the whole database, which is the normal scenario in single-class ES mining, most of the pruning techniques achieve considerable performance gains.
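To make the notion of an emerging substring concrete, the sketch below enumerates substrings by brute force and compares their support in a target class against the remaining sequences. It is an illustration only, not the suffix tree-based framework or the pruning techniques the abstract describes. The names `substring_support` and `emerging_substrings`, the thresholds `min_support` and `min_growth`, and the length cap `max_len` are all assumptions introduced here; the abstract says only that an ES occurs "more frequently" in its class and does not fix a criterion.

```python
from collections import Counter


def substring_support(sequences, max_len):
    """Map each substring (length <= max_len) to the fraction of
    sequences that contain it at least once."""
    counts = Counter()
    for seq in sequences:
        seen = set()  # count each substring once per sequence
        for i in range(len(seq)):
            for j in range(i + 1, min(len(seq), i + max_len) + 1):
                seen.add(seq[i:j])
        counts.update(seen)
    n = len(sequences)
    return {s: c / n for s, c in counts.items()}


def emerging_substrings(target, others, min_support=0.5,
                        min_growth=2.0, max_len=4):
    """Return {substring: (target_support, other_support, growth)}.

    Assumed criterion (not taken from the abstract): a substring is
    emerging if its support in the target class is at least
    min_support and at least min_growth times its support elsewhere.
    """
    sup_target = substring_support(target, max_len)
    sup_others = substring_support(others, max_len)
    result = {}
    for s, st in sup_target.items():
        if st < min_support:
            continue
        so = sup_others.get(s, 0.0)
        # A substring absent from the other classes has unbounded growth.
        growth = float("inf") if so == 0 else st / so
        if growth >= min_growth:
            result[s] = (st, so, growth)
    return result


target = ["abcd", "xabc", "abcy"]
others = ["bcd", "xyz", "cab", "dab"]
es = emerging_substrings(target, others)
for s in sorted(es, key=lambda k: (-es[k][2], k)):
    print(s, es[s])
```

Brute-force enumeration costs time quadratic in each sequence's length (bounded here by `max_len`), which is the kind of overhead a suffix tree index and pruning are meant to avoid on realistic databases.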