Subgroup Discovery in Data Sets with Multi-dimensional Responses: A Method and a Case Study in Traumatology

  • Authors:
  • Lan Umek;Blaž Zupan;Marko Toplak;Annie Morin;Jean-Hugues Chauchat;Gregor Makovec;Dragica Smrke

  • Affiliations:
  • Faculty of Computer and Information Sciences, University of Ljubljana, Slovenia;Faculty of Computer and Information Sciences, University of Ljubljana, Slovenia and Dept. of Human and Mol. Genetics, Baylor College of Medicine, Houston, USA;Faculty of Computer and Information Sciences, University of Ljubljana, Slovenia;IRISA, Universite de Rennes 1, Rennes cedex, France 35042;Universite de Lyon, ERIC-Lyon 2, Bron Cedex, France 69676;Dept. of Traumatology, University Clinical Centre, Ljubljana, Slovenia;Dept. of Traumatology, University Clinical Centre, Ljubljana, Slovenia

  • Venue:
  • AIME '09 Proceedings of the 12th Conference on Artificial Intelligence in Medicine: Artificial Intelligence in Medicine
  • Year:
  • 2009

Abstract

Biomedical experimental data sets often include many features both at input (descriptions of cases, treatments, or experimental parameters) and at output (outcome description). State-of-the-art data mining techniques can deal with such data, but consider only one output feature at a time, disregarding any dependencies among the outputs. In this paper, we propose a technique that treats many output features simultaneously, aiming to find subgroups of cases that are similar in both input and output space. The method is based on k-medoids clustering and analysis of contingency tables, and reports case subgroups with significant dependency in input and output space. We have used this technique in an explorative analysis of clinical data on femoral neck fractures. The subgroups discovered in our study were considered meaningful by the participating domain expert, and sparked a number of ideas for hypotheses to be tested experimentally.
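
The general idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes purely numeric features, Euclidean distance, a deterministic farthest-first initialisation, and a permutation test in place of whatever significance test the paper uses. All function names and the toy data are hypothetical. Cases are clustered separately in input and output space with a simple k-medoids routine, the two labelings are cross-tabulated, and the dependency in the resulting contingency table is assessed.

```python
import math
import random


def euclid(a, b):
    """Euclidean distance (an assumption; clinical data may need a mixed-type distance)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def k_medoids(points, k, dist=euclid, max_iter=50):
    """Simple alternating k-medoids; returns one cluster label per point."""
    # Deterministic farthest-first initialisation (a choice made for this sketch).
    medoids = [0]
    while len(medoids) < k:
        medoids.append(max(range(len(points)),
                           key=lambda i: min(dist(points[i], points[m]) for m in medoids)))

    def assign(meds):
        return [min(range(k), key=lambda c: dist(p, points[meds[c]])) for p in points]

    for _ in range(max_iter):
        labels = assign(medoids)
        new_medoids = []
        for c in range(k):
            members = [i for i, lab in enumerate(labels) if lab == c]
            if not members:  # keep the old medoid if a cluster becomes empty
                new_medoids.append(medoids[c])
                continue
            # New medoid: the member with the smallest total distance to its cluster.
            new_medoids.append(min(members,
                                   key=lambda i: sum(dist(points[i], points[j]) for j in members)))
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return assign(medoids)


def contingency(rows, cols, k_rows, k_cols):
    """Cross-tabulate input-cluster labels (rows) against output-cluster labels (cols)."""
    table = [[0] * k_cols for _ in range(k_rows)]
    for r, c in zip(rows, cols):
        table[r][c] += 1
    return table


def chi_square(table):
    """Pearson chi-square statistic of a contingency table."""
    n = sum(map(sum, table))
    row_tot = [sum(r) for r in table]
    col_tot = [sum(c) for c in zip(*table)]
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = row_tot[i] * col_tot[j] / n
            if exp > 0:
                stat += (obs - exp) ** 2 / exp
    return stat


def permutation_p_value(rows, cols, k_rows, k_cols, n_perm=1000, seed=0):
    """Share of label shufflings with a chi-square at least as large as the observed one."""
    rng = random.Random(seed)
    observed = chi_square(contingency(rows, cols, k_rows, k_cols))
    shuffled = list(cols)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if chi_square(contingency(rows, shuffled, k_rows, k_cols)) >= observed - 1e-9:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Hypothetical toy data: 20 cases, two groups that differ in both inputs and outputs.
inputs = ([(0.1 * (i % 3), 0.1 * (i % 2)) for i in range(10)]
          + [(5 + 0.1 * (i % 3), 5 + 0.1 * (i % 2)) for i in range(10)])
outputs = ([(0.2 * (i % 2), 0.0) for i in range(10)]
           + [(3 + 0.2 * (i % 2), 1.0) for i in range(10)])

in_labels = k_medoids(inputs, k=2)
out_labels = k_medoids(outputs, k=2)
table = contingency(in_labels, out_labels, 2, 2)
stat = chi_square(table)
p_value = permutation_p_value(in_labels, out_labels, 2, 2)

print(table)  # [[10, 0], [0, 10]]
print(stat, p_value)
```

In this toy setting, a cell whose observed count clearly exceeds the count expected under independence corresponds to a candidate subgroup: cases that fall into the same input cluster and the same output cluster. How the paper selects the number of clusters, measures distance, and corrects for multiple testing is not stated in the abstract, so those choices are left open here.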