MAIL: mining sequential patterns with wildcards
International Journal of Data Mining and Bioinformatics
Mining frequent patterns under gap requirements from sequences is an important step in many domains, such as the biological sciences. Given a character sequence S of length L, a support threshold, and a gap constraint, we aim to discover the frequent patterns, that is, those whose support in S is no less than the threshold. A frequent pattern P may contain wildcards, and the number of wildcards between consecutive elements of P must satisfy the user-specified gap constraint. For efficiency, the mining process enforces the one-off condition and exploits an Apriori-like property. Experiments show that our method mines as many frequent patterns with wildcards as the existing MPP algorithm while running much faster.
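
To make the problem statement concrete, the following is a minimal sketch of counting the support of one pattern under a gap constraint and the one-off condition. It is not the MAIL procedure itself, which the abstract does not detail. The function name `count_one_off`, the parameters `min_gap` and `max_gap`, and the greedy leftmost matching strategy are all assumptions made for illustration. The one-off condition is read here as "each position of S may be used by at most one counted occurrence". Because the matching is greedy, the result should be taken as a lower bound on the support attainable under that condition.

```python
def count_one_off(seq, pattern, min_gap, max_gap):
    """Greedy support count for `pattern` in `seq` (illustrative only).

    Between consecutive pattern elements there must be between
    min_gap and max_gap wildcard positions (inclusive). Under the
    assumed reading of the one-off condition, every position of
    `seq` may belong to at most one counted occurrence.
    """
    if not pattern:
        return 0

    used = [False] * len(seq)  # positions already claimed by an occurrence

    def extend(pos, j, chosen):
        # `pos` is matched to pattern[j - 1]; try to place pattern[j].
        if j == len(pattern):
            return chosen
        for gap in range(min_gap, max_gap + 1):
            nxt = pos + gap + 1
            if nxt >= len(seq):
                break
            if not used[nxt] and seq[nxt] == pattern[j]:
                result = extend(nxt, j + 1, chosen + [nxt])
                if result is not None:
                    return result
        return None  # no valid continuation from this position

    count = 0
    for start in range(len(seq)):
        if used[start] or seq[start] != pattern[0]:
            continue
        occurrence = extend(start, 1, [start])
        if occurrence is not None:
            for i in occurrence:
                used[i] = True  # enforce the one-off condition
            count += 1
    return count


# In "AAC" with 0 to 1 wildcards allowed, "AC" can be matched from either A,
# but both matches need the single C, so only one occurrence is counted.
print(count_one_off("AAC", "AC", 0, 1))
print(count_one_off("ACACAC", "AC", 0, 1))
```

A pattern would then be reported as frequent when this count is no less than the given threshold. An exact method may find more occurrences than this greedy pass on some inputs.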