Developing an efficient knowledge discovering model for mining fuzzy multi-level sequential patterns in sequence databases

Authors:
Tony Cheng-Kui Huang
Affiliations:
Department of Business Administration, National Chung Cheng University, 168, University Rd., Min-Hsiung, Chia-Yi, Taiwan, Republic of China
Venue:
Fuzzy Sets and Systems
Year:
2009

Abstract

Sequential pattern mining from sequence databases is recognized as an important data mining problem with many applications. Items in a sequence database can be organized into a concept hierarchy according to a taxonomy. Based on this hierarchy, sequential patterns can be found not only at the leaf nodes (individual items) but also at higher levels of the hierarchy; this is called multiple-level sequential pattern mining. Previous research, however, used taxonomies based on crisp relationships between any two disjoint levels, which cannot handle the uncertainty and fuzziness of real life. For example, tomatoes could be classified into the Fruit category but could also be regarded as belonging to the Vegetable category. To deal with the fuzzy nature of taxonomy, Chen and Huang developed a novel knowledge discovering model to mine fuzzy multi-level sequential patterns, in which the relationship from one level to another is represented by a value between 0 and 1. In their work, an algorithm similar to generalized sequential patterns (GSP) was developed to find fuzzy multi-level sequential patterns. That algorithm, however, faces a difficult problem: the mining process may have to generate and examine a huge set of combinatorial subsequences, and it requires multiple scans of the database. In this paper, we propose a new, efficient algorithm based on the divide-and-conquer strategy to mine this type of pattern. In addition, another efficient algorithm is developed to discover fuzzy cross-level sequential patterns. Because the proposed algorithm greatly reduces the effort of generating candidate subsequences, performance improves significantly. Experiments show that the proposed algorithm is much more efficient and scalable than the previous one. For mining real-life databases, our work enhances the model's practicability and could promote more applications in business.
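
To make the idea of a fuzzy taxonomy concrete, the following Python sketch shows one possible way to represent membership degrees between 0 and 1 and to score a higher-level pattern against a small sequence database. It is an illustration only, not the algorithm proposed in the paper: the membership values, the function names (`membership`, `sequence_degree`, `fuzzy_support`), the use of the minimum membership along an occurrence, the maximum over occurrences, and the averaging over sequences are all assumptions made for this example, and each sequence element is simplified to a single item.

```python
from typing import Dict, List

# Hypothetical fuzzy taxonomy: item -> {category: membership degree in [0, 1]}.
# The degrees below are invented for illustration only.
FUZZY_TAXONOMY: Dict[str, Dict[str, float]] = {
    "tomato": {"fruit": 0.7, "vegetable": 0.3},
    "apple": {"fruit": 1.0},
    "cabbage": {"vegetable": 1.0},
}


def membership(item: str, category: str) -> float:
    """Degree to which `item` belongs to `category` (0.0 if unrelated)."""
    return FUZZY_TAXONOMY.get(item, {}).get(category, 0.0)


def sequence_degree(sequence: List[str], pattern: List[str]) -> float:
    """Best degree to which `sequence` contains `pattern` in order.

    Assumption: one occurrence is scored by the minimum membership along it,
    and the sequence takes the maximum score over all occurrences.
    """
    # best[j] = best score for matching the first j pattern categories so far.
    best = [1.0] + [0.0] * len(pattern)
    for item in sequence:
        # Iterate right to left so one item fills at most one pattern position.
        for j in range(len(pattern), 0, -1):
            candidate = min(best[j - 1], membership(item, pattern[j - 1]))
            best[j] = max(best[j], candidate)
    return best[len(pattern)]


def fuzzy_support(database: List[List[str]], pattern: List[str]) -> float:
    """Average containment degree over all sequences (assumed definition)."""
    return sum(sequence_degree(s, pattern) for s in database) / len(database)
```

With this representation, a call such as `fuzzy_support(database, ["fruit", "vegetable"])` scores a pattern expressed at the category level, while each customer sequence still lists leaf items such as `"tomato"`. A crisp taxonomy is the special case in which every membership degree is exactly 1.0.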