Discovering contrasts between collections of data is an important task in data mining. In this paper, we introduce a new type of contrast pattern, called a Minimal Distinguishing Subsequence (MDS). An MDS is a minimal subsequence that occurs frequently in one class of sequences and infrequently in sequences of another class. It is a natural way of representing strong and succinct contrast information between two sequential datasets and can be useful in applications such as protein comparison, document comparison, and building sequential classification models. Mining MDS patterns is a challenging task and differs significantly from mining contrasts between relational or transactional data. One particularly important type of constraint that can be integrated into the mining process is the maximum gap constraint. We present an efficient algorithm, called ConSGapMiner, to mine all MDSs that satisfy a maximum gap constraint. It employs efficient bitset and Boolean operations for gap-based pruning within a prefix-growth framework. A performance evaluation on both sparse and dense datasets demonstrates the scalability of ConSGapMiner and shows its ability to mine patterns from high-dimensional datasets at low supports.
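
To make the definition concrete, the following is a minimal brute-force sketch in Python. It is not ConSGapMiner: it enumerates every candidate up to a length bound and has none of the pruning that the abstract describes. The function names, the thresholds `delta` (minimum support in the positive class) and `alpha` (maximum support in the negative class), the length bound `max_len`, and the toy sequences are all assumptions made for illustration. The gap check uses Python integers as bitsets only to hint at how Boolean operations can express a maximum gap constraint; the paper's actual encoding may differ.

```python
from itertools import product


def occurs_with_gap(pattern, seq, max_gap):
    """Return True if `pattern` occurs in `seq` as a subsequence with at most
    `max_gap` items skipped between consecutive matched positions."""
    if not pattern:
        return True

    def item_mask(item):
        # Bit i is set when seq[i] == item.
        return sum(1 << i for i, x in enumerate(seq) if x == item)

    ends = item_mask(pattern[0])  # positions where the current prefix can end
    for item in pattern[1:]:
        reach = 0
        for shift in range(1, max_gap + 2):
            reach |= ends << shift  # positions reachable within the gap limit
        ends = reach & item_mask(item)
        if not ends:
            return False
    return ends != 0


def support(pattern, dataset, max_gap):
    """Fraction of sequences in `dataset` that contain `pattern`."""
    return sum(occurs_with_gap(pattern, s, max_gap) for s in dataset) / len(dataset)


def is_subsequence(small, big):
    """Plain (gap-free) subsequence test, used for the minimality check."""
    it = iter(big)
    return all(x in it for x in small)


def naive_mds(pos, neg, max_gap, delta, alpha, max_len):
    """Enumerate patterns by increasing length (hypothetical helper) and keep
    those that are frequent in `pos`, infrequent in `neg`, and minimal."""
    alphabet = sorted(set("".join(pos + neg)))
    found = []
    for length in range(1, max_len + 1):
        for items in product(alphabet, repeat=length):
            p = "".join(items)
            # Not minimal if a shorter distinguishing pattern lies inside p.
            if any(is_subsequence(q, p) for q in found):
                continue
            if support(p, pos, max_gap) >= delta and support(p, neg, max_gap) <= alpha:
                found.append(p)
    return found


# Toy data, chosen only for illustration.
pos = ["CBAB", "AACCB", "BBAAC"]
neg = ["BCAB", "ABACB"]
print(naive_mds(pos, neg, max_gap=1, delta=0.3, alpha=0.0, max_len=3))
# ['BB', 'CC', 'BAA', 'CBA']
```

With a maximum gap of 1, `BB` matches two positive sequences and no negative one, so it is distinguishing, and it is minimal because `B` alone occurs in both classes. The search is exponential in `max_len`, which is exactly the cost that a prefix-growth miner with gap-based pruning is designed to avoid.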