Efficient mining of emerging patterns: discovering trends and differences
KDD '99 Proceedings of the fifth ACM SIGKDD international conference on Knowledge discovery and data mining
C4.5: Programs for Machine Learning
C4.5: Programs for Machine Learning
ICDE '95 Proceedings of the Eleventh International Conference on Data Engineering
PrefixSpan: Mining Sequential Patterns by Prefix-Projected Growth
Proceedings of the 17th International Conference on Data Engineering
Fast Algorithms for Mining Association Rules in Large Databases
VLDB '94 Proceedings of the 20th International Conference on Very Large Data Bases
CLOSET+: searching for the best strategies for mining frequent closed itemsets
Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining
CloseGraph: mining closed frequent graph patterns
Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining
BIDE: Efficient Mining of Frequent Closed Sequences
ICDE '04 Proceedings of the 20th International Conference on Data Engineering
Data Mining: Concepts and Techniques
Data Mining: Concepts and Techniques
Parallel mining of closed sequential patterns
Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining
Efficient mining of iterative patterns for software specification discovery
Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining
Efficient mining of frequent sequence generators
Proceedings of the 17th international conference on World Wide Web
Mining significant graph patterns by leap search
Proceedings of the 2008 ACM SIGMOD international conference on Management of data
Non-redundant sequential rules-Theory and algorithm
Information Systems
Efficient Mining of Closed Repetitive Gapped Subsequences from a Sequence Database
ICDE '09 Proceedings of the 2009 IEEE International Conference on Data Engineering
Classification of software behaviors for failure detection: a discriminative pattern mining approach
Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining
EMNLP '09 Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing: Volume 3 - Volume 3
A discriminative model approach for accurate duplicate bug report retrieval
Proceedings of the 32nd ACM/IEEE International Conference on Software Engineering - Volume 1
General algorithms for mining closed flexible patterns under various equivalence relations
ECML PKDD'12 Proceedings of the 2012 European conference on Machine Learning and Knowledge Discovery in Databases - Volume Part II
Investigating clinical care pathways correlated with outcomes
BPM'13 Proceedings of the 11th international conference on Business Process Management
Hi-index | 0.00 |
A lot of data are in sequential formats. In this study, we are interested in sequential data that goes in pairs. There are many interesting datasets in this format coming from various domains including parallel textual corpora, duplicate bug reports, and other pairs of related sequences of events. Our goal is to mine a set of closed discriminative dyadic sequential patterns from a database of sequence pairs each belonging to one of the two classes +ve and -ve. These dyadic sequential patterns characterize the discriminating facets contrasting the two classes. They are potentially good features to be used for the classification of dyadic sequential data. They can be used to characterize and flag correct and incorrect translations from parallel textual corpora, automate the manual and time consuming duplicate bug report detection process, etc. We provide a solution of this new problem by proposing new search space traversal strategy, projected database structure, pruning properties, and novel mining algorithms. To demonstrate the scalability and utility of our solution, we have experimented with both synthetic and real datasets. Experiment results show that our solution is scalable. Mined patterns are also able to improve the accuracy of one possible downstream application, namely the detection of duplicate bug reports using pattern-based classification.