We systematically compare two established algorithms, CART (Classification and Regression Trees) and PRIM (Patient Rule Induction Method), on a subgroup discovery task in a large, real-world, high-dimensional clinical database. Contrary to current conjectures, PRIM generally performed worse than CART. PRIM often judged "peeling off" a large chunk of data at a value of a relevant discrete ordinal variable to be unattractive, and as a result missed an important subgroup. This finding matters for clinical medicine, where ordinal scores are ubiquitous. PRIM's utility in clinical databases would increase if global information about (ordinal) variables were put to better use and if the search algorithm kept track of alternative solutions.
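To make the peeling issue concrete, the following is a minimal sketch of a single lower-end peel, not the implementation evaluated in the study. The function name `lower_peel`, the peeling fraction `alpha = 0.05`, the toy data, and the rule that all observations tied at the threshold are removed together are assumptions made for illustration only. It shows that a peel which removes a small slice of a continuous variable is forced to remove an entire level of an ordinal score.

```python
def lower_peel(x, y, alpha=0.05):
    """Peel the lowest values of x off the current box (illustrative only).

    Returns (threshold, fraction_removed, mean_y_remaining), or None if the
    peel would empty the box. Observations tied at the threshold are removed
    together (an assumption), so a discrete variable can lose far more than
    alpha of the data in a single step.
    """
    n = len(x)
    sorted_x = sorted(x)
    # Index of the last observation inside the lowest alpha fraction.
    idx = max(0, int(round(alpha * n)) - 1)
    threshold = sorted_x[idx]
    # Keep only observations strictly above the threshold.
    kept_y = [yi for xi, yi in zip(x, y) if xi > threshold]
    if not kept_y:
        return None
    fraction_removed = (n - len(kept_y)) / n
    mean_y_remaining = sum(kept_y) / len(kept_y)
    return threshold, fraction_removed, mean_y_remaining


# Toy data (hypothetical): 100 patients and a binary outcome.
continuous_x = list(range(100))                      # 100 distinct values
ordinal_x = [0] * 40 + [1] * 30 + [2] * 20 + [3] * 10  # 4-level ordinal score
outcome = [1 if i % 2 == 0 else 0 for i in range(100)]

print(lower_peel(continuous_x, outcome))
print(lower_peel(ordinal_x, outcome))
```

With these toy inputs the continuous variable loses 5% of the observations, while the ordinal score loses its whole lowest level, 40% of the observations, in one step. How a particular PRIM implementation scores such a large peel against many small ones is a design choice that this sketch does not model.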