Raising data for improved support in rule mining: How to raise and how far to raise

  • Authors:
  • Geller; Xuan Zhou; Kalpana Prathipati; Sripriya Kanigiluppai; Xiaoming Chen

  • Affiliations:
  • Computer Science Department, New Jersey Institute of Technology, Newark, NJ 07102, USA (all authors)

  • Venue:
  • Intelligent Data Analysis
  • Year:
  • 2005

Abstract

This paper describes the use of a concept hierarchy for improving the results of association rule mining. Given a large set of tuples with demographic information and personal interest information, association rules can be derived that associate age and gender with interests. However, when the mined data set is sparse, it is difficult to find rules with high support. Conversely, when rules with high support can be generated, they tend to involve interests that are too abstract to be of practical use. To overcome the first problem, we have developed a method of raising data instances to higher levels in the ontology, and in this paper we give a formal definition of this raising operation. We also show that in some cases data mining with raised data leads to rules that better represent reality. To avoid the second problem, namely rules that are too abstract, we formulate a notion of an optimal target level for the raising operation and derive two estimates for this optimal raising level. Knowing the level to which to raise reduces the computational effort of raising to several levels and reduces the user's effort in selecting, from a large candidate set, the mined rules that best fit their needs.
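
The abstract does not reproduce the paper's formal definition of raising, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes a hypothetical concept hierarchy stored as a child-to-parent map (`PARENT`), a toy data set (`DATA`), and hypothetical helper names (`level`, `raise_to_level`, `raise_data`, `support`). Raising here means replacing each interest with its ancestor at a chosen target level, after which support is recomputed.

```python
# Hypothetical concept hierarchy (child -> parent); the root "interest" has no parent.
PARENT = {
    "tennis": "racket_sports",
    "squash": "racket_sports",
    "racket_sports": "sports",
    "soccer": "team_sports",
    "team_sports": "sports",
    "sports": "interest",
}

# Toy tuples, invented for illustration: ((age group, gender), interest).
DATA = [
    (("20-29", "F"), "tennis"),
    (("20-29", "F"), "squash"),
    (("20-29", "F"), "soccer"),
    (("30-39", "M"), "soccer"),
]


def level(concept):
    """Depth of a concept in the hierarchy; the root is at level 0."""
    depth = 0
    while concept in PARENT:
        concept = PARENT[concept]
        depth += 1
    return depth


def raise_to_level(concept, target_level):
    """Replace a concept with its ancestor at target_level.

    Concepts already at or above the target level are left unchanged.
    """
    while level(concept) > target_level:
        concept = PARENT[concept]
    return concept


def raise_data(tuples, target_level):
    """Raise the interest of every tuple to the target level."""
    return [(demo, raise_to_level(interest, target_level))
            for demo, interest in tuples]


def support(tuples, demographic, interest):
    """Fraction of all tuples that contain both the demographic and the interest."""
    hits = sum(1 for demo, i in tuples if demo == demographic and i == interest)
    return hits / len(tuples)


young_women = ("20-29", "F")
print(support(DATA, young_women, "tennis"))                        # 0.25
print(support(raise_data(DATA, 2), young_women, "racket_sports"))  # 0.5
print(support(raise_data(DATA, 1), young_women, "sports"))         # 0.75
```

In this toy data, raising from the leaf level to level 2 doubles the support of the rule for women aged 20-29, while raising to level 1 yields still higher support for a rule about "sports" in general, which is the kind of overly abstract result that motivates looking for an optimal target level.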