An Algorithm for Constrained Association Rule Mining in Semi-structured Data

  • Authors:
  • Lisa Singh;Bin Chen;Rebecca Haight;Peter Scheuermann

  • Venue:
  • PAKDD '99 Proceedings of the Third Pacific-Asia Conference on Methodologies for Knowledge Discovery and Data Mining
  • Year:
  • 1999

Abstract

The need for sophisticated analysis of textual documents is becoming more apparent as more data is placed on the Web and digital libraries emerge. This paper presents an algorithm for generating constrained association rules from textual documents. The user specifies a set of constraints, concepts and/or structured values. Our algorithm creates matrices and lists based on these prespecified constraints and uses them to generate large itemsets. Because these matrices are small and sparse, we are able to generate higher-order large itemsets quickly. Further, since we maintain concept relationship information in a concept library, we can also generate rule sets involving concepts related to the initial set of constraints.
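
The general idea of constraint-driven itemset generation can be illustrated with a minimal sketch. This is not the paper's algorithm: the abstract does not describe the actual matrix and list structures, so the sketch assumes each document is a set of concept strings, uses a simple item-to-document index as a stand-in sparse structure, and models the concept library as a plain dictionary. All names (`expand_constraints`, `constrained_itemsets`, `concept_library`, the sample documents) are hypothetical.

```python
from itertools import combinations

def expand_constraints(constraints, concept_library):
    """Add concepts that the (hypothetical) concept library relates to each constraint."""
    expanded = set(constraints)
    for concept in constraints:
        expanded |= set(concept_library.get(concept, ()))
    return expanded

def constrained_itemsets(docs, constraints, min_support, max_size=3):
    """Return {itemset: support count} for itemsets containing at least one constraint item."""
    constraints = set(constraints)
    # Only documents that mention a constraint item can support a constrained itemset.
    relevant = [set(d) for d in docs if constraints & set(d)]
    # Sparse index: item -> indices of the relevant documents that contain it.
    postings = {}
    for idx, doc in enumerate(relevant):
        for item in doc:
            postings.setdefault(item, set()).add(idx)
    frequent = sorted(i for i, p in postings.items() if len(p) >= min_support)
    result = {}
    for size in range(1, max_size + 1):
        for combo in combinations(frequent, size):
            if not constraints & set(combo):
                continue  # skip itemsets unrelated to the user's constraints
            support = len(set.intersection(*(postings[i] for i in combo)))
            if support >= min_support:
                result[frozenset(combo)] = support
    return result

# Illustrative data only; not taken from the paper.
docs = [
    {"mining", "database", "web"},
    {"mining", "database"},
    {"web", "library"},
    {"mining", "web"},
]
concept_library = {"mining": ["database"]}  # assumed "related concept" mapping

result = constrained_itemsets(docs, {"mining"}, min_support=2)
expanded = expand_constraints({"mining"}, concept_library)
```

Because every retained itemset contains a constraint item, counting support over the filtered documents gives the same counts as counting over the whole collection, while the index stays small. Passing `expanded` instead of `{"mining"}` would also admit itemsets built around the related concept.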