Comparing association rules and decision trees for disease prediction

  • Authors: Carlos Ordonez
  • Affiliations: University of Houston, Houston, TX
  • Venue: HIKM '06 Proceedings of the international workshop on Healthcare information and knowledge management
  • Year: 2006

Abstract

Association rules represent a promising technique for finding hidden patterns in a medical data set. The main issue with mining association rules in a medical data set is the large number of rules that are discovered, most of which are irrelevant. Such a large number of rules makes search slow and interpretation by the domain expert difficult. In this work, search constraints are introduced to find only medically significant association rules and to make search more efficient. In medical terms, association rules relate heart perfusion measurements and patient risk factors to the degree of stenosis in four specific arteries. The medical significance of association rules is evaluated with the usual support and confidence metrics, and also with lift. Association rules are compared to predictive rules mined with decision trees, a well-known machine learning technique. Decision trees are shown to be less adequate than association rules for artery disease prediction. Experiments show that decision trees tend to find few simple rules, that most of their rules have somewhat low reliability, that most attribute splits differ from medically common splits, and that most rules refer to very small sets of patients. In contrast, association rules generally include simpler predictive rules, work well with user-binned attributes, have higher reliability, and generally refer to larger sets of patients.
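
For readers unfamiliar with the three metrics named in the abstract, the following is a minimal sketch of how support, confidence, and lift are conventionally computed for a rule X => Y. It uses the standard textbook definitions, not code from the paper. The patient records and item names (such as "age>=60" or "LAD>=70") are invented for illustration and are not taken from the paper's data set.

```python
from typing import FrozenSet, List

Transaction = FrozenSet[str]


def support(itemset: FrozenSet[str], transactions: List[Transaction]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(antecedent: FrozenSet[str], consequent: FrozenSet[str],
               transactions: List[Transaction]) -> float:
    """support(X and Y) / support(X): how often Y holds when X holds."""
    s_x = support(antecedent, transactions)
    if s_x == 0.0:
        return 0.0
    return support(antecedent | consequent, transactions) / s_x


def lift(antecedent: FrozenSet[str], consequent: FrozenSet[str],
         transactions: List[Transaction]) -> float:
    """confidence(X => Y) / support(Y); values above 1 suggest a positive association."""
    s_y = support(consequent, transactions)
    if s_y == 0.0:
        return 0.0
    return confidence(antecedent, consequent, transactions) / s_y


# Hypothetical, hand-made patient records (NOT data from the paper).
# Each record is the set of binned attributes that hold for one patient.
patients: List[Transaction] = [
    frozenset({"age>=60", "smoker", "LAD>=70"}),
    frozenset({"age>=60", "LAD>=70"}),
    frozenset({"age>=60", "smoker"}),
    frozenset({"smoker"}),
    frozenset({"age>=60", "smoker", "LAD>=70"}),
]

# Hypothetical rule: {age>=60, smoker} => {LAD>=70}
rule_x = frozenset({"age>=60", "smoker"})
rule_y = frozenset({"LAD>=70"})

rule_support = support(rule_x | rule_y, patients)
rule_confidence = confidence(rule_x, rule_y, patients)
rule_lift = lift(rule_x, rule_y, patients)

print(f"support={rule_support:.3f} confidence={rule_confidence:.3f} lift={rule_lift:.3f}")
```

On this toy data the rule covers two of the five patients, holds for two of the three patients matching the antecedent, and has a lift slightly above 1. A search constraint of the kind the abstract describes could then be expressed as minimum thresholds on these three values, although the specific constraints used in the paper are not given here.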