Employing heat maps to mine associations in structured routine care data

  • Authors:
  • Dennis Toddenroth;Thomas Ganslandt;Ixchel Castellanos;Hans-Ulrich Prokosch;Thomas Bürkle

  • Venue:
  • Artificial Intelligence in Medicine
  • Year:
  • 2014

Abstract

Objective: Mining the electronic medical record (EMR) has the potential to deliver new medical knowledge about causal effects that are hidden in statistical associations between different patient attributes. Our goal is to detect such causal mechanisms within current research projects, which include, for example, the detection of determinants of imminent ICU readmission. An iterative statistical approach that examines each considered attribute pair delivers potential answers, but the resulting matrices are difficult to interpret. We therefore aimed to improve their interpretation through the use of heat maps, and we propose strategies to adapt heat maps to the search for associations and causal effects within routine EMR data.

Methods: Heat maps visualize tabulated metric datasets as grid-like choropleth maps and thus present measures of association between numerous attribute pairs in a clear arrangement. Basic assumptions about plausible exposures and outcomes are used to allocate distinct attribute sets to the two matrix dimensions. The image then avoids certain redundant graphical elements and provides a clearer picture of the supposed associations. Specific color schemes were chosen to incorporate preexisting information about similarities between attributes. Because measures of association serve as the clustering input, we applied transformations which ensure that distance metrics always assume finite values and treat positive and negative associations in the same way. To evaluate the general capability of the approach, we conducted analyses of simulated datasets and assessed diagnostic and procedural codes in a large routine care dataset.

Results: Simulation results demonstrate that the proposed clustering procedure rearranges attributes in a way that resembles the simulated statistical associations. Heat maps are thus well suited to indicate whether associations concern the same attributes or different ones, and whether the affected attribute sets conform to any preexisting relationship between attributes. The dendrograms help in deciding whether contiguous sequences of attributes effectively correspond to homogeneous attribute associations. The exemplary analysis of a routine care dataset revealed patterns of associations that follow plausible medical constellations for several diseases and the associated medical procedures and activities. Cases with breast cancer (ICD C50), for example, appeared to be associated with radiation therapy (8-52). As a cross-check, approximately 60 percent of the attribute pairs in this dataset showed a strong negative association, which can be explained by diseases treated in a medical specialty that does not routinely perform the respective procedures in these cases. The corresponding diagram clearly reflects these relationships in the shape of coherent subareas.

Conclusion: We demonstrated that heat maps of measures of association are effective for the visualization of patterns in routine care EMRs. The adjustable method for the assignment of attributes to image dimensions permits a balance between the display of ample information and a favorable level of graphical complexity. The scope of the search can be adapted by using preexisting assumptions about plausible effects to select exposure and outcome attributes. Thus, the proposed method promises to simplify the detection of undiscovered causal effects within routine EMR data.
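
The abstract does not name the measure of association or the transformation that was used. As a rough illustration only, the following Python sketch assumes Yule's Q as a bounded measure and takes its absolute value before clustering, so that the clustering input is always finite and positive and negative associations are treated alike. The function names (yules_q, cluster_order), the toy data, the choice of average linkage with Euclidean distance, and the convention of setting undefined pairs to 0 are all assumptions made for this example and are not taken from the paper.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, leaves_list


def yules_q(exposures, outcomes):
    """Yule's Q for every exposure/outcome pair of binary (0/1) columns.

    Rows are patients; the result has one row per exposure attribute
    and one column per outcome attribute, with values in [-1, 1].
    """
    x = np.asarray(exposures, dtype=float)
    y = np.asarray(outcomes, dtype=float)
    a = x.T @ y              # exposed, outcome present
    b = x.T @ (1 - y)        # exposed, outcome absent
    c = (1 - x).T @ y        # unexposed, outcome present
    d = (1 - x).T @ (1 - y)  # unexposed, outcome absent
    num = a * d - b * c
    den = a * d + b * c
    # Assumption: pairs with an undefined Q (den == 0) are set to 0.
    return np.divide(num, den, out=np.zeros_like(num), where=den != 0)


def cluster_order(profiles):
    """Leaf order of an average-linkage clustering of the rows of `profiles`."""
    tree = linkage(profiles, method="average", metric="euclidean")
    return [int(i) for i in leaves_list(tree)]


# Toy data (hypothetical): 4 patients, 4 exposure and 4 outcome attributes.
exposures = np.array([[1, 1, 0, 0],
                      [1, 0, 0, 1],
                      [0, 1, 1, 0],
                      [0, 0, 1, 1]])
outcomes = exposures.copy()

q = yules_q(exposures, outcomes)

# Absolute values make positive and negative associations equivalent
# for the purpose of clustering; the signed values are kept for display.
abs_q = np.abs(q)
row_order = cluster_order(abs_q)      # order of exposure attributes
col_order = cluster_order(abs_q.T)    # order of outcome attributes

# Signed association matrix, rearranged for display as a heat map.
heat = q[np.ix_(row_order, col_order)]
print(row_order, col_order)
print(heat)
```

The rearranged matrix heat could then be drawn with any plotting library, for example with matplotlib's imshow and a diverging colormap. In the toy data, exposure attributes 0 and 2 have opposite signs of association with the same outcomes, and clustering on the absolute values places them next to each other.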