Pixelisation-based statistical visualisation for categorical datasets with spreadsheet software

  • Authors:
  • Gaj Vidmar

  • Affiliations:
  • University of Ljubljana, Faculty of Medicine, Institute of Biomedical Informatics, Ljubljana, Slovenia

  • Venue:
  • VIEW'06 Proceedings of the 1st visual information expert conference on Pixelization paradigm
  • Year:
  • 2006

Visualization

Abstract

A heat-map type of chart for depicting a large number of cases and up to twenty-five categorical variables with spreadsheet software is presented. It is implemented in Microsoft® Excel using standard formulas, sorting and simple VBA code. The motivating example depicts the accuracy of automated assignment of MeSH® descriptor headings to abstracts of medical articles. Within each abstract, the predicted support for each heading is ranked; then, for each heading, actual assignment or non-assignment by a human specialist is depicted by a black or white cell, and high or low support is depicted on a nine-point two-colour scale. Thus, each case (abstract) is depicted by one row of a table and each variable (heading) by two adjacent columns. A rank-based measure of classification accuracy is calculated for each case, and rows are sorted downwards in order of increasing accuracy. Based on an analogous measure, variables are sorted rightwards in order of increasing prediction accuracy. Another biomedical dataset is presented with a similar chart. Different methods for predicting binary outcomes can be visualised, and the procedure is easily extended to polytomous variables.
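
The ranking, scoring and sorting steps described in the abstract can be sketched outside a spreadsheet as follows. This is a minimal illustration in Python, not the paper's Excel/VBA implementation. The abstract does not define the rank-based accuracy measure or the binning of ranks onto the nine-point scale, so both are assumptions here: an AUC-style share of correctly ordered (assigned, non-assigned) pairs stands in for the measure, and ranks are mapped linearly onto levels 1 to 9. All function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def rank_support(scores: Sequence[float]) -> List[int]:
    """Rank headings within one case; 1 = lowest predicted support.

    Ties are broken by column position (an arbitrary choice for this sketch).
    """
    order = sorted(range(len(scores)), key=lambda j: scores[j])
    ranks = [0] * len(scores)
    for position, j in enumerate(order, start=1):
        ranks[j] = position
    return ranks


def nine_point_level(rank: int, n_headings: int) -> int:
    """Map a rank in 1..n_headings onto levels 1..9 (assumed linear binning)."""
    if n_headings < 2:
        return 5  # a single heading: use the middle of the scale
    return 1 + (rank - 1) * 8 // (n_headings - 1)


def pairwise_accuracy(ranks: Sequence[int], assigned: Sequence[int]) -> float:
    """Share of (assigned, non-assigned) pairs in which the assigned item has
    the higher rank; ties count one half. An assumed AUC-style stand-in for
    the paper's rank-based measure, which the abstract does not specify."""
    pos = [r for r, a in zip(ranks, assigned) if a]
    neg = [r for r, a in zip(ranks, assigned) if not a]
    if not pos or not neg:
        return 0.5  # undefined without both classes; neutral value assumed
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def layout(scores: Sequence[Sequence[float]],
           assigned: Sequence[Sequence[int]]
           ) -> Tuple[List[int], List[int], List[List[Tuple[int, int]]]]:
    """Return row order, column order and the sorted grid of cell pairs.

    Each cell pair is (assigned flag, nine-point support level), matching the
    two adjacent columns per heading described in the abstract.
    """
    n_cases, n_head = len(scores), len(scores[0])
    ranks = [rank_support(row) for row in scores]
    row_acc = [pairwise_accuracy(ranks[i], assigned[i]) for i in range(n_cases)]
    col_acc = [
        pairwise_accuracy([ranks[i][j] for i in range(n_cases)],
                          [assigned[i][j] for i in range(n_cases)])
        for j in range(n_head)
    ]
    # Least accurate case at the top, least accurate heading at the left.
    row_order = sorted(range(n_cases), key=lambda i: row_acc[i])
    col_order = sorted(range(n_head), key=lambda j: col_acc[j])
    cells = [[(assigned[i][j], nine_point_level(ranks[i][j], n_head))
              for j in col_order] for i in row_order]
    return row_order, col_order, cells


# Toy data (invented for illustration): 3 abstracts x 4 headings.
toy_scores = [[0.9, 0.1, 0.4, 0.7],
              [0.2, 0.8, 0.6, 0.1],
              [0.3, 0.5, 0.9, 0.2]]
toy_assigned = [[1, 0, 0, 1],
                [1, 0, 0, 0],
                [0, 1, 0, 1]]
toy_rows, toy_cols, toy_cells = layout(toy_scores, toy_assigned)
print(toy_rows, toy_cols)
```

With the toy data, the first case is classified perfectly (both assigned headings have the highest support) and so ends up in the bottom row, while the third case, with only one of four pairs ordered correctly, is placed at the top. In a spreadsheet, the same grid would be coloured by conditional formatting or VBA rather than returned as tuples.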