Previous studies have argued convincingly that a frequent pattern mining algorithm should mine only the closed frequent patterns rather than all frequent patterns, because doing so yields a result set that is more compact yet still complete, and also improves efficiency. However, most previously developed closed pattern mining algorithms work under the candidate maintenance-and-test paradigm, which is inherently costly in both runtime and space when the support threshold is low or the patterns become long. In this paper, we present BIDE, an efficient algorithm for mining frequent closed sequences without candidate maintenance. It adopts a novel sequence closure checking scheme called BI-Directional Extension and prunes the search space more deeply than previous algorithms by using the BackScan pruning method. A thorough performance study on sparse and dense, real and synthetic data sets demonstrates that BIDE significantly outperforms the previous algorithm: it consumes at least an order of magnitude less memory and can be more than an order of magnitude faster. It is also linearly scalable in terms of database size.
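To make the notion of a frequent closed sequence concrete, the following is a minimal brute-force sketch in Python. It is not BIDE: it enumerates every frequent sequence and then filters out the non-closed ones, which is exactly the kind of candidate maintenance-and-test that the abstract says BIDE avoids. The toy database `DB`, the threshold `MIN_SUP`, the restriction to single-item events, and all function names are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Pattern = Tuple[str, ...]

# Hypothetical toy database: each string is a sequence of single-item events.
DB: List[str] = ["CAABC", "ABCB", "CABC", "ABBCA"]
MIN_SUP = 2  # hypothetical absolute support threshold


def is_subsequence(pattern, sequence) -> bool:
    """True if `pattern` occurs in `sequence` in order (gaps allowed)."""
    it = iter(sequence)
    # `item in it` advances the iterator, so matches are found left to right.
    return all(item in it for item in pattern)


def support(pattern: Pattern, db: List[str]) -> int:
    """Number of database sequences that contain `pattern`."""
    return sum(is_subsequence(pattern, seq) for seq in db)


def frequent_sequences(db: List[str], min_sup: int) -> Dict[Pattern, int]:
    """Enumerate all frequent sequences by naive prefix growth."""
    items = sorted({x for seq in db for x in seq})
    found: Dict[Pattern, int] = {}

    def grow(prefix: Pattern) -> None:
        for item in items:
            candidate = prefix + (item,)
            sup = support(candidate, db)
            # Support is anti-monotone, so infrequent prefixes are not extended.
            if sup >= min_sup:
                found[candidate] = sup
                grow(candidate)

    grow(())
    return found


def closed_only(freq: Dict[Pattern, int]) -> Dict[Pattern, int]:
    """Keep patterns that have no proper super-sequence with equal support."""
    return {
        p: s
        for p, s in freq.items()
        if not any(
            s2 == s and len(q) > len(p) and is_subsequence(p, q)
            for q, s2 in freq.items()
        )
    }


if __name__ == "__main__":
    freq = frequent_sequences(DB, MIN_SUP)
    closed = closed_only(freq)
    print(len(freq), "frequent sequences,", len(closed), "closed")
    for pattern, sup in sorted(closed.items()):
        print("".join(pattern), sup)
```

On this toy input the closed set is much smaller than the full frequent set, yet the support of any frequent sequence can still be recovered as the largest support among the closed sequences that contain it. The post-filter in `closed_only` has to hold every frequent sequence in memory, which illustrates the cost that grows as thresholds drop or patterns lengthen.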