Mining frequent tree-like patterns in large datasets

  • Authors:
  • Tzung-Shi Chen;Shih-Chun Hsu

  • Affiliations:
  • Department of Information and Learning Technology, National University of Tainan, 33, Section 2, Shu-Lin St., Tainan 700, Taiwan;Department of Information and Learning Technology, National University of Tainan, 33, Section 2, Shu-Lin St., Tainan 700, Taiwan

  • Venue:
  • Data & Knowledge Engineering
  • Year:
  • 2007

Abstract

Sequential pattern mining is an important problem in data mining. This paper proposes a novel approach for mining hierarchical tree structures, called tree-like patterns, that represent the relationships between pairs of items in a sequence. With tree-like patterns, the relationship between a pair of items can be identified in terms of cause and effect. A technique that efficiently counts support values for tree-like patterns using a queue structure is proposed. In addition, the paper presents an efficient dynamic programming scheme for determining the frequency of a tree-like pattern in a sequence. Each tree-like pattern embedded in a sequence is considered to carry a meaning or degree of importance that depends on the application. Two formulas are given for determining the degree of significance of a tree-like pattern in a specific sequence, which reflects how consecutive the pattern's items are in that sequence. The larger the degree of significance, the more compactly the tree-like pattern occurs in the sequence. The characteristics that distinguish the mined patterns from those obtained with other schemes are discussed. A simulation analysis demonstrates the efficacy of the proposed approach. Finally, the approach is implemented in a data mining system integrated into a novel e-learning platform.
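
The abstract does not define tree-like patterns or the paper's counting scheme, so the following is only a rough illustration of the general idea of counting pattern frequency with dynamic programming. It makes a strong simplifying assumption: the pattern is a plain ordered chain of items rather than a tree, and an occurrence is any ordered, not necessarily contiguous, match in the sequence. The function names `count_embeddings` and `support` are hypothetical and are not taken from the paper.

```python
def count_embeddings(sequence, pattern):
    """Count the ways `pattern` occurs in `sequence` as an ordered,
    not necessarily contiguous, subsequence (simplified chain pattern,
    not the paper's tree-like pattern)."""
    # ways[j] = number of ways to match the first j pattern items
    # using the part of the sequence scanned so far.
    ways = [0] * (len(pattern) + 1)
    ways[0] = 1  # the empty prefix is matched exactly once
    for item in sequence:
        # Scan j backwards so one sequence item extends each partial
        # match by at most one pattern position.
        for j in range(len(pattern), 0, -1):
            if pattern[j - 1] == item:
                ways[j] += ways[j - 1]
    return ways[len(pattern)]


def support(sequences, pattern):
    """Number of sequences containing at least one occurrence of `pattern`."""
    return sum(1 for s in sequences if count_embeddings(s, pattern) > 0)


if __name__ == "__main__":
    data = [["a", "b", "a", "b"], ["b", "a"], ["c", "c", "c"]]
    print(count_embeddings(data[0], ["a", "b"]))
    print(support(data, ["a", "b"]))
```

Under these assumptions, frequency is the number of occurrences within one sequence, while support is the number of sequences in which the pattern occurs at least once. The queue-based support counting and the degree-of-significance formulas mentioned in the abstract are not modeled here.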