An effective approach for mining frequent patterns in multiple biological sequences

Authors:
Ling Chen;Wei Liu
Affiliations:
Yangzhou University, Yangzhou, China;Yangzhou University, Yangzhou, China
Venue:
Proceedings of the 2nd ACM Conference on Bioinformatics, Computational Biology and Biomedicine
Year:
2011

Abstract

Most existing algorithms for mining frequent patterns in multiple biosequences can produce many projected databases and short candidate patterns, which increases the time and memory cost of mining. To overcome this shortcoming, we propose MSPM, a fast and efficient algorithm for mining frequent patterns in multiple biological sequences. We first introduce the concept of a primary pattern and then use a prefix tree to mine frequent primary patterns. A pattern-extending approach is also presented that mines all frequent patterns without producing a large number of irrelevant patterns. Our experimental results show that MSPM not only improves performance but also produces effective mining results.
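
The abstract does not define a primary pattern or give the details of MSPM, so the following is only a minimal sketch of the general idea it outlines: collect short seed patterns in a prefix tree, keep those that are frequent, and then grow them by extension. It is not the authors' algorithm. Several things here are assumptions made for illustration: the seeds are taken to be contiguous substrings of a fixed length `k`, support is taken to be the number of sequences containing a pattern, and the names `TrieNode`, `build_trie`, `frequent_seeds`, `extend_patterns` and `mine_frequent_patterns` are hypothetical.

```python
class TrieNode:
    """Prefix-tree node; seq_ids is only filled in at depth k (the leaves)."""

    def __init__(self):
        self.children = {}
        self.seq_ids = set()  # indices of sequences containing this pattern


def build_trie(sequences, k):
    """Insert every length-k substring of every sequence into a prefix tree."""
    root = TrieNode()
    for sid, seq in enumerate(sequences):
        for i in range(len(seq) - k + 1):
            node = root
            for ch in seq[i:i + k]:
                node = node.children.setdefault(ch, TrieNode())
            node.seq_ids.add(sid)
    return root


def frequent_seeds(root, min_support):
    """Return {pattern: support} for leaves found in >= min_support sequences."""
    result = {}
    stack = [(root, "")]
    while stack:
        node, prefix = stack.pop()
        if node.children:
            for ch, child in node.children.items():
                stack.append((child, prefix + ch))
        elif prefix and len(node.seq_ids) >= min_support:
            result[prefix] = len(node.seq_ids)
    return result


def support(pattern, sequences):
    """Number of sequences that contain the pattern as a substring."""
    return sum(1 for seq in sequences if pattern in seq)


def extend_patterns(seeds, sequences, min_support):
    """Grow frequent seeds one character to the right while they stay frequent."""
    alphabet = sorted(set("".join(sequences)))
    frequent = dict(seeds)
    frontier = list(seeds)
    while frontier:
        next_frontier = []
        for pattern in frontier:
            for ch in alphabet:
                candidate = pattern + ch
                if candidate in frequent:
                    continue
                sup = support(candidate, sequences)
                if sup >= min_support:
                    frequent[candidate] = sup
                    next_frontier.append(candidate)
        frontier = next_frontier
    return frequent


def mine_frequent_patterns(sequences, k, min_support):
    """Seed with frequent length-k substrings, then extend them."""
    seeds = frequent_seeds(build_trie(sequences, k), min_support)
    return extend_patterns(seeds, sequences, min_support)


if __name__ == "__main__":
    toy = ["ACGTAC", "TACGTT", "GGACGT"]  # made-up toy sequences
    print(mine_frequent_patterns(toy, k=3, min_support=3))
    # contains exactly: ACG (3), CGT (3), ACGT (3)
```

Because support can only shrink when a contiguous pattern is lengthened, every frequent pattern longer than `k` has a frequent length-`k` prefix, so extending only the frequent seeds never generates candidates from infrequent short patterns. That is the kind of saving the abstract describes, although the pruning MSPM actually uses may differ.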