Existing algorithms for mining frequent patterns in multiple biosequences may generate many projected databases and many short candidate patterns, both of which increase computation time and memory requirements. To overcome these shortcomings, we propose MSPM, a fast and efficient algorithm for mining frequent patterns in multiple biological sequences. We first introduce the concept of a primary pattern, which can be extended to form larger patterns in the sequence. To detect frequent primary patterns, we construct a prefix tree. Based on this prefix tree, we present a pattern-extending approach that mines frequent patterns without producing a large number of irrelevant candidate patterns. Experimental results show that MSPM is faster than the other methods compared and also produces higher-quality results.
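The prefix-tree step can be illustrated with a minimal sketch. The abstract does not define a primary pattern precisely, so this example assumes, purely for illustration, that a primary pattern is a fixed-length substring of length `k` and that support is the number of sequences containing it. The names `build_prefix_tree`, `frequent_primary_patterns`, `k`, and `min_support` are hypothetical and do not come from the paper, and the pattern-extending step is not shown.

```python
def build_prefix_tree(sequences, k):
    """Insert every length-k substring into a nested-dict prefix tree.

    Assumption: a "primary pattern" is modeled here as a length-k substring.
    The node reached after k characters stores, under the key "$ids",
    the set of sequence indices that contain the substring.
    """
    root = {}
    for idx, seq in enumerate(sequences):
        for start in range(len(seq) - k + 1):
            node = root
            for ch in seq[start:start + k]:
                node = node.setdefault(ch, {})
            node.setdefault("$ids", set()).add(idx)
    return root


def frequent_primary_patterns(root, min_support):
    """Walk the tree and return {pattern: support} for patterns that
    occur in at least min_support distinct sequences."""
    result = {}
    stack = [(root, "")]
    while stack:
        node, prefix = stack.pop()
        for key, child in node.items():
            if key == "$ids":
                if len(child) >= min_support:
                    result[prefix] = len(child)
            else:
                stack.append((child, prefix + key))
    return result


if __name__ == "__main__":
    # Toy DNA sequences, chosen only for illustration.
    seqs = ["ACGTAC", "TACGA", "GGACG"]
    tree = build_prefix_tree(seqs, k=3)
    print(frequent_primary_patterns(tree, min_support=2))
```

In this toy run, `ACG` occurs in all three sequences and `TAC` in two, so both meet a support threshold of 2. Because substrings that share a prefix share a path in the tree, each candidate is counted in a single pass without materializing a projected database per pattern.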