PRIM versus CART in subgroup discovery: When patience is harmful

  • Authors:
  • Ameen Abu-Hanna;Barry Nannings;Dave Dongelmans;Arie Hasman

  • Affiliations:
  • Department of Medical Informatics, Academic Medical Center, University of Amsterdam, Meibergdreef 9, 1105 AZ Amsterdam, The Netherlands;Department of Medical Informatics, Academic Medical Center, University of Amsterdam, Meibergdreef 9, 1105 AZ Amsterdam, The Netherlands;Department of Intensive Care, Academic Medical Center, University of Amsterdam, Meibergdreef 9, 1105 AZ Amsterdam, The Netherlands;Department of Medical Informatics, Academic Medical Center, University of Amsterdam, Meibergdreef 9, 1105 AZ Amsterdam, The Netherlands

  • Venue:
  • Journal of Biomedical Informatics
  • Year:
  • 2010

Quantified Score

Hi-index 0.00

Visualization

Abstract

We systematically compare the established algorithms CART (Classification and Regression Trees) and PRIM (Patient Rule Induction Method) in a subgroup discovery task on a large real-world high-dimensional clinical database. Contrary to current conjectures, PRIM's performance was generally inferior to CART's. PRIM often considered ''peeling of'' a large chunk of data at a value of a relevant discrete ordinal variable unattractive, ultimately missing an important subgroup. This finding has considerable significance in clinical medicine where ordinal scores are ubiquitous. PRIM's utility in clinical databases would increase when global information about (ordinal) variables is better put to use and when the search algorithm keeps track of alternative solutions.